Differences

This shows you the differences between two versions of the page.

pdf_fulltext_indexing [2009/06/11 11:30] rmzelle
pdf_fulltext_indexing [2018/04/29 19:11] (current) dstillman
Line 1: Line 1:
-====== PDF Fulltext Indexing ======
+====== PDF Full-Text Indexing ======
  
-Zotero supports fulltext indexing of PDF documents. PDF fulltext indexing currently requires an external program to convert embedded text in PDF files into plaintext cache files. By default, Zotero uses the pdftotext utility from the [[http://www.foolabs.com/xpdf/|Xpdf]] project, which is an open-source, cross-platform PDF viewer. Additional functionality is available through another Xpdf utility, pdfinfo.
+Zotero uses tools from the [[https://www.xpdfreader.com/|Xpdf project]] to extract full-text content from PDFs for searching. Since Zotero 5.0.36, the PDF tools are bundled with Zotero and do not need to be downloaded separately as in previous versions.
-
-====== Basic Installation ======
-
-Customized, platform-specific versions of pdftotext and pdfinfo can be downloaded and installed automatically through the Zotero preferences.
-
-After installing the tools, new snapshots should automatically be indexed when added to the Library. Existing attachments can be indexed via the Zotero prefs.
-
-Note that PDF fulltext indexing will not work with files that contain only images, though some image-based PDFs also include a hidden layer of searchable text.((As of March 28, 2007, [[http://www.jstor.org|JSTOR]] is [[http://www.jstor.org/about/newfeatures.html|including an embedded text layer in its PDFs]].))
-
-
-====== Advanced Installation ======
-
-Zotero requires modified binaries of pdftotext and pdfinfo on Windows (to prevent the command-prompt window from appearing at indexing time) and a custom build of pdfinfo on all platforms that supports writing to a text file ([[https://www.zotero.org/trac/browser/tools/xpdf/pdfinfo.cc|source code]] available).
-
-Users wishing to install the Xpdf tools manually (or on platforms for which we haven't built customized binaries) can do so by building the tools and either placing the binaries directly in the Zotero data directory or linking to them from there. Either way, a platform-specific file must be created in the Zotero data directory, conforming to the format "pdftotext-''{platform}''", where ''{platform}'' is "Win32", "MacIntel", "MacPPC", "Linux-i686", etc. (To determine your current platform, type ''javascript:alert(navigator.platform)'' in the Firefox URL bar and hit Enter.) The Windows version requires the .exe extension, i.e. "pdftotext-Win32.exe". A text file containing the installed version number can also be created in the format pdftotext-{platform}.version.
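The file-naming convention described in the removed 2009 manual-install text can be sketched as follows. This is only an illustration of that naming rule, not Zotero's own code: the function name ''xpdf_file_names'' and the "starts with Win" check are assumptions made for the example, and the version file name follows the old text literally (no .exe in it).

```python
def xpdf_file_names(tool: str, platform: str) -> dict:
    """Build the file names the old instructions describe (hypothetical helper)."""
    # Base name follows the "<tool>-<platform>" pattern, e.g. "pdftotext-MacIntel".
    base = f"{tool}-{platform}"
    # Assumption: any platform string starting with "Win" counts as Windows,
    # where the binary needs the .exe extension ("pdftotext-Win32.exe").
    binary = base + ".exe" if platform.startswith("Win") else base
    # The optional version file is named "<tool>-<platform>.version".
    return {"binary": binary, "version": base + ".version"}


if __name__ == "__main__":
    for plat in ("Win32", "MacIntel", "Linux-i686"):
        print(plat, xpdf_file_names("pdftotext", plat))
```

The platform string is whatever ''navigator.platform'' reported in Firefox at the time; since Zotero 5.0.36 bundles the tools, none of this is needed for current versions.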
pdf_fulltext_indexing.1244734223.txt.gz · Last modified: 2009/06/11 11:30 by rmzelle