Opened 8 years ago
#1078 new enhancement
Index Word docs
| Reported by: | stakats | Owned by: | dstillman |
|---|---|---|---|
| Priority: | major | Milestone: | |
| Component: | uncategorized | Version: | 1.5 |
| Keywords: | | Cc: | |
Description
Per last year's dev list discussion, we can extract the text from Word docs in two ways: by unzipping and parsing .docx files (Word 2007/2008), and by using a modified antiword binary for older .doc files. In the interest of simplicity, perhaps we should handle only .docx at first and avoid shipping a custom binary?
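To illustrate the .docx half of this, here is a rough sketch of the unzip-and-parse approach. It is written in Python only for illustration and is not the proposed implementation. The function name `extract_docx_text` is hypothetical. It assumes the main body lives at `word/document.xml` inside the archive, which is the usual location in WordprocessingML packages, and it only collects text runs (`w:t`), ignoring tabs, line breaks, headers, footers, and footnotes.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_docx_text(source):
    """Return the body text of a .docx file, one line per paragraph.

    `source` is a path or a binary file-like object. Hypothetical helper:
    it assumes the main document part is stored at word/document.xml.
    """
    # A .docx file is a ZIP archive; read the main document part from it.
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # Each w:p is a paragraph; its visible text is split across w:t runs.
    for para in root.iter(f"{{{W_NS}}}p"):
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

The returned string could then be handed to whatever full-text indexer is already in place for other attachment types. Nothing here needs an external binary, which is the main argument for doing .docx first.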