Warning: This page has not been updated in over a year and may be outdated or deprecated.

This is an old revision of the document!


Full Text Extraction Tools

VuFind's import tools can use external software to extract full text from external documents (PDF, Word, etc.). To take advantage of this, install an appropriate tool (see the options below) and point VuFind to it by editing the fulltext.ini file. In VuFind 1.x, this file is in web/conf; in VuFind 2.x, it is in config/vufind within your local settings directory.
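As a rough illustration of what pointing VuFind at a tool involves, the sketch below parses a small fulltext.ini-style fragment and reports which tool it selects. The section and key names (`[General]` with `parser`, `[Aperture]` with `webcrawler`, `[Tika]` with `path`), the sample paths, and the helper name `configured_tool` are assumptions made for this example; check the comments in your own fulltext.ini for the settings your VuFind version actually reads.

```python
import configparser

# Hypothetical fulltext.ini contents. The section names, key names, and paths
# are assumptions for illustration, not copied from a VuFind release.
SAMPLE_FULLTEXT_INI = """
[General]
parser = Tika

[Aperture]
;webcrawler = "/usr/local/aperture/bin/webcrawler.sh"

[Tika]
path = "/usr/local/tika/tika.jar"
"""


def configured_tool(ini_text):
    """Return (tool_name, path) for the extraction tool selected in ini_text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    tool = parser.get("General", "parser", fallback="None")
    # Map each assumed tool name to the key that would hold its location.
    key = {"Tika": "path", "Aperture": "webcrawler"}.get(tool)
    if key is None:
        return tool, None
    path = parser.get(tool, key, fallback=None)
    # configparser keeps surrounding double quotes, so strip them by hand.
    return tool, path.strip('"') if path else None


print(configured_tool(SAMPLE_FULLTEXT_INI))
```

Lines beginning with a semicolon are treated as comments, so the Aperture entry in this sample is inactive and only the Tika path is reported.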

For more details of using full text in VuFind, see the import instructions for MARC and XML documents.

Aperture

Aperture was the first tool that VuFind supported for full-text extraction, and it is the only option for use with versions 1.3 and earlier. Unfortunately, Aperture is no longer in active development, so users with VuFind 1.4 or later are encouraged to use Tika instead (see below).

Downloading Aperture

Troubleshooting Aperture

Under Linux, a known bug in version 1.5.0 of Aperture prevents it from running correctly on the command line. See this page for details on the fix, which involves only minor edits to the lcp.sh file.

Tika

Tika is supported by VuFind 1.4 and later, and it is the recommended full-text extraction tool for those versions.
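To check a Tika installation independently of VuFind, you can run the Tika application jar directly against a sample document. The sketch below is one way to do that from Python, assuming Java is on the PATH and using the Tika command-line `--text` option for plain-text output. The jar location, the document name, and the helper names `build_tika_command` and `extract_text` are hypothetical, and this is not necessarily how VuFind itself invokes Tika.

```python
import subprocess


def build_tika_command(jar_path, document, java="java"):
    """Build the argument list for running the Tika app jar in plain-text mode."""
    return [java, "-jar", jar_path, "--text", document]


def extract_text(jar_path, document):
    """Run Tika on one document and return the extracted text.

    Raises subprocess.CalledProcessError if Tika exits with an error, and
    FileNotFoundError if the java executable cannot be found.
    """
    result = subprocess.run(
        build_tika_command(jar_path, document),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical jar path and document name, shown only to illustrate the command.
print(build_tika_command("/usr/local/tika/tika.jar", "report.pdf"))
```

If `extract_text` returns readable text for a sample PDF or Word file, the same jar path should be suitable for the Tika setting in fulltext.ini.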

Downloading Tika

indexing/full_text_tools.1450124518.txt.gz · Last modified: 2015/12/14 20:21 by demiankatz