Warning: This page has not been updated in over a year and may be outdated or deprecated.

This is an old revision of the document!


XML Records

If the data you want to import is not available in MARC format, chances are that you can access it in some flavor of XML. Loading XML into VuFind's index is straightforward if you are familiar with XSLT: you translate from the XML format you have available into Solr's XML Message Format, then post the result to the Solr server.
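
To make the target of that translation concrete, the sketch below builds a small Solr XML update message (an add element wrapping doc and field elements). It uses Python only to show the shape of the output; in VuFind the translation itself is done by your XSLT file. The field names id, title and author and the sample values are illustrative assumptions, so check them against the fields defined in your own Solr schema.

```python
import xml.etree.ElementTree as ET


def to_solr_add(records):
    """Build a Solr <add> update message from a list of dicts.

    Each dict maps a field name to a single value or a list of values;
    a list produces one <field> element per value.
    """
    add = ET.Element("add")
    for record in records:
        doc = ET.SubElement(add, "doc")
        for name, value in record.items():
            values = value if isinstance(value, list) else [value]
            for v in values:
                field = ET.SubElement(doc, "field", {"name": name})
                field.text = str(v)
    return ET.tostring(add, encoding="unicode")


# Hypothetical record; the field names are assumptions, not a required schema.
sample = [
    {
        "id": "local-0001",
        "title": "Example Record",
        "author": ["Doe, Jane", "Roe, Richard"],
    }
]

solr_xml = to_solr_add(sample)
print(solr_xml)
```

An XSLT stylesheet written for the import tool would need to emit the same kind of add/doc/field structure from your source XML.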

Importing with XSLT

IMPORTANT: The XSLT tool described in this section was added in VuFind 1.1. If you are using an earlier version, you will have to upgrade.

VuFind's XSLT tool is designed to make posting XSLT-transformed documents to the Solr index simple while offering flexibility for extending XSLT and applying local customizations.

The Basics

The XSLT tool is driven by a properties file which provides a few key pieces of information:

  • The name of the XSLT file to use.
  • The names of custom PHP functions and classes that will be called by the XSLT file.
  • Any custom values you want to pass in as parameters to the XSLT file (e.g. local institution names, ID prefixes, etc.)

You can see an example properties file here. The comments in this example file explain the available settings.

Once a properties file is set up, you can import an XML file by switching to the import subdirectory of your VuFind installation and typing:

php import-xsl.php myFile.xml mySettings.properties

(substituting the appropriate XML and properties files as needed).

Full Text

VuFind's XSLT tool includes support for extracting full text from external documents (PDF, Word, etc.). To take advantage of this, you need to install and configure a full-text extraction tool.

For an example of full text extraction in action in VuFind, see the full text settings near the bottom of the VuDL Sample XSLT File.

Batch Importing

If you need to load a number of XML files at once, you can load them into a subdirectory under the harvest subdirectory of your VuFind installation and use the batch-import-xsl.sh script to load them all. This is commonly used in combination with OAI-PMH harvesting (described here).
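
Conceptually, batch importing amounts to running the single-file import command once per XML file in a directory. The Python sketch below illustrates only that idea and is not a reproduction of batch-import-xsl.sh: the helper name build_import_commands is hypothetical, and any bookkeeping the real script performs is not modeled. It builds the command lines without executing them.

```python
import tempfile
from pathlib import Path


def build_import_commands(xml_dir, properties_file):
    """Return one import-xsl.php argument list per .xml file in xml_dir.

    Files are processed in name order; non-XML files are ignored.
    """
    return [
        ["php", "import-xsl.php", str(path), properties_file]
        for path in sorted(Path(xml_dir).glob("*.xml"))
    ]


# A throwaway directory stands in for a subdirectory of harvest.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.xml", "a.xml", "notes.txt"):
        (Path(tmp) / name).write_text("<record/>")
    commands = build_import_commands(tmp, "mySettings.properties")
    for cmd in commands:
        print(" ".join(cmd))
```

To actually run the imports, each argument list could be passed to subprocess.run with the working directory set to the import subdirectory of your VuFind installation, although the provided shell script remains the supported route.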

indexing/xml.1463100200.txt.gz · Last modified: 2016/05/13 00:43 by demiankatz