If the data you want to import is not available in MARC format, chances are that it is available in some flavor of XML. Fortunately, loading XML into VuFind's index is straightforward if you are familiar with the XSLT language: you translate the XML format you have into Solr's XML Message Format, then post the result to the Solr server.
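To make the target format concrete, here is a minimal Python sketch of what a Solr XML update message looks like: an `add` element containing one `doc` per record, with one `field` element per value. This only illustrates the output shape; VuFind itself produces this structure with XSLT, not Python. The function name `to_solr_add_message` and the sample field names (`id`, `title`, `topic`) are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET


def to_solr_add_message(records):
    """Build a Solr XML update message (<add><doc>...</doc></add>) from dicts.

    Each record maps a field name to a string or a list of strings.
    """
    add = ET.Element("add")
    for record in records:
        doc = ET.SubElement(add, "doc")
        for name, values in record.items():
            # Multi-valued fields are written as repeated <field> elements.
            if isinstance(values, str):
                values = [values]
            for value in values:
                field = ET.SubElement(doc, "field", name=name)
                field.text = value  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # Illustrative record only; real field names depend on your Solr schema.
    sample = [{"id": "demo-1", "title": "Example record", "topic": ["XML", "Solr"]}]
    print(to_solr_add_message(sample))
```

An XSLT stylesheet written for the import tool needs to emit the same `add`/`doc`/`field` structure from your source XML.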
Importing with XSLT
The XSLT tool described in this section was added in VuFind 1.1.
VuFind's XSLT tool is designed to make posting XSLT-transformed documents to the Solr index simple while offering flexibility for extending XSLT and applying local customizations.
The XSLT tool is driven by a properties file which provides a few key pieces of information:
- The name of the XSLT file to use.
- The names of custom PHP functions and classes that will be called by the XSLT file.
- Any custom values you want to pass in as parameters to the XSLT file (e.g. local institution names, ID prefixes, etc.)
You can see an example properties file here. The comments in this example file explain the available settings.
Once a properties file is set up, you can import an XML file by switching to the import subdirectory of your VuFind installation and typing:
php import-xsl.php myFile.xml mySettings.properties
(substituting the appropriate XML and properties files as needed).
VuFind's XSLT tool includes support for extracting full text from external documents (PDF, Word, etc.). To take advantage of this, you need to install and configure a full-text extraction tool.
For an example of full text extraction in action in VuFind, see the full text settings near the bottom of the VuDL Sample XSLT File.
If you need to load a number of XML files at once, you can place them in a subdirectory under the harvest subdirectory of your VuFind installation and use the batch-import-xsl.sh script to import them all. This is commonly used in combination with OAI-PMH harvesting.
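The idea behind batch loading, one import command per XML file in a directory, can be sketched in Python as follows. This is not the implementation of batch-import-xsl.sh and does not reproduce any of its bookkeeping. It only builds the same `php import-xsl.php` command line shown earlier for each file. The function name `build_import_commands` and the directory name `harvest/my_source` are hypothetical.

```python
from pathlib import Path


def build_import_commands(directory, properties_file):
    """Return one import-xsl.php command (as an argument list) per XML file."""
    # Sort so that files are processed in a predictable order.
    xml_files = sorted(Path(directory).glob("*.xml"))
    return [["php", "import-xsl.php", str(path), properties_file] for path in xml_files]


if __name__ == "__main__":
    # "harvest/my_source" and "mySettings.properties" are placeholder names.
    for command in build_import_commands("harvest/my_source", "mySettings.properties"):
        print(" ".join(command))
```

Each argument list could be passed to `subprocess.run` from the import subdirectory, provided the file paths are resolvable from there. For routine use, the supplied batch script is the better choice.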
You can learn more about XML indexing through the Indexing XML Records video.