Warning: This page has not been updated in over a year and may be outdated or deprecated.
indexing:full_text_tools

Differences

This shows you the differences between two versions of the page.

aperture [2011/06/17 15:18] – created demiankatz
indexing:full_text_tools [2018/12/19 18:31] demiankatz
Line 1: Line 1:
-====== Aperture ======
+====== Full Text Extraction Tools ======
 
-VuFind's import tools include support for using the Aperture Java library to extract full text from external documents (PDF, Word, etc.).  In order to take advantage of this, you need to install Aperture (see link below) and point VuFind to the software by editing the [[https://vufind.svn.sourceforge.net/svnroot/vufind/trunk/web/conf/fulltext.ini|web/conf/fulltext.ini]] file.
+VuFind's import tools include support for using external software to extract full text from external documents (PDF, Word, etc.).  In order to take advantage of this, you need to install an appropriate tool (see options below) and point VuFind to the software by editing the [[https://github.com/vufind-org/vufind/blob/master/config/vufind/fulltext.ini|fulltext.ini]] file within your [[configuration:local_settings_directory|local settings directory]].
 
-For more details of using Aperture in VuFind, see the import instructions for [[importing_records#indexing_full_text|MARC]] and [[importing_records#full_text|XML]] documents.
+For more details of using full text in VuFind, see the import instructions for [[indexing:marc#indexing_full_text|MARC]] and [[indexing:xml#full_text|XML]] documents.
 
-===== Downloading Aperture =====
+===== Aperture =====
 + 
 +:!: Aperture was the first tool that VuFind supported for full-text extraction, and it is the only option for use with versions 1.3 and earlier.  Unfortunately, Aperture is no longer in active development, so users with VuFind 1.4 or later are encouraged to use Tika instead (see below). 
 + 
 +==== Downloading Aperture ====
  
   * [[http://aperture.sourceforge.net/|Aperture official site]]
  
-===== Troubleshooting Aperture =====
+==== Troubleshooting Aperture ====
  
 Under Linux, there is a known bug with version 1.5.0 which prevents Aperture from running on the command line correctly.  See [[http://sourceforge.net/tracker/index.php?func=detail&aid=3094429&group_id=150969&atid=779500|this page]] for details on the fix (it just involves minor edits to the lcp.sh file).
  
 +===== Tika =====
 +
 +Tika is supported by VuFind 1.4 and later, and is the recommended full-text extraction tool when supported.
 +
 +==== Downloading Tika ====
 +
 +  * [[http://tika.apache.org/|Tika official site]]
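
As a rough illustration of the configuration step described in the introduction, a local fulltext.ini that selects Tika might look like the sketch below. The section and setting names are assumptions based on the sample fulltext.ini shipped with VuFind, and both paths are hypothetical install locations; verify them against the copy linked above before use.

```ini
; Sketch only: confirm section and key names against the fulltext.ini
; shipped with your VuFind version.
[General]
; Assumed setting: which extraction tool to use ("Aperture" or "Tika").
parser = Tika

[Tika]
; Hypothetical path to the downloaded Tika application jar.
path = "/usr/local/tika/tika.jar"

[Aperture]
; Hypothetical path to Aperture's web crawler script (VuFind 1.3 and earlier).
; Left commented out because Tika is selected above.
;webcrawler = "/usr/local/aperture/bin/webcrawler.sh"
```

Placing this file in the local settings directory, rather than editing the core copy, keeps the setting separate from the files VuFind ships.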
  
indexing/full_text_tools.txt · Last modified: 2024/03/13 11:58 by demiankatz