Warning: This page has not been updated in over a year and may be outdated or deprecated.

XML Records

If the data you want to import is not available in MARC format, chances are that you can access it in some flavor of XML. Fortunately, loading XML into VuFind®'s index is straightforward if you are familiar with the XSLT language – you simply need to translate from the XML format you have available into Solr's XML Message Format, then post the result to the Solr server.
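To make the target format concrete, the sketch below writes a minimal Solr XML update message to a file and prints it. The file name, field names, and values are illustrative assumptions; the fields your index actually accepts depend on your Solr schema.

```shell
# Write a minimal Solr XML update message to a file.
# Field names and values are illustrative assumptions, not a complete VuFind record.
cat > solr-add-example.xml <<'EOF'
<add>
  <doc>
    <field name="id">example-1</field>
    <field name="title">An Example Record</field>
    <field name="author">Doe, Jane</field>
  </doc>
</add>
EOF

# Print the message so you can see the structure an XSLT needs to produce.
cat solr-add-example.xml
```

Each `<doc>` element becomes one record in the index, and each `<field>` supplies one value for the named Solr field.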

Importing with XSLT

VuFind®'s XSLT tool is designed to make posting XSLT-transformed documents to the Solr index simple while offering flexibility for extending XSLT and applying local customizations.
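A stylesheet for this tool is ordinary XSLT. The sketch below is a minimal illustration, not one of VuFind®'s bundled stylesheets: it assumes a hypothetical input format (`<records>` containing `<record>` elements with `<identifier>` and `<title>` children), hypothetical file names, and a hypothetical `institution` parameter. If `xsltproc` happens to be installed, it previews the transformation; otherwise it only writes the two files.

```shell
# Hypothetical input file: two simple records in a made-up XML format.
cat > sample-records.xml <<'EOF'
<records>
  <record><identifier>example-1</identifier><title>First Title</title></record>
  <record><identifier>example-2</identifier><title>Second Title</title></record>
</records>
EOF

# Hypothetical stylesheet: maps each <record> to a Solr <doc>.
cat > sample-transform.xsl <<'EOF'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>
  <!-- Default value; a caller can override it by passing a parameter. -->
  <xsl:param name="institution">My University</xsl:param>
  <xsl:template match="/records">
    <add>
      <xsl:for-each select="record">
        <doc>
          <field name="id"><xsl:value-of select="identifier"/></field>
          <field name="title"><xsl:value-of select="title"/></field>
          <field name="institution"><xsl:value-of select="$institution"/></field>
        </doc>
      </xsl:for-each>
    </add>
  </xsl:template>
</xsl:stylesheet>
EOF

# Optional preview outside VuFind, only if xsltproc is available.
if command -v xsltproc >/dev/null 2>&1; then
  xsltproc sample-transform.xsl sample-records.xml
fi
```

The field names here are placeholders; a real stylesheet has to emit fields that exist in your Solr schema.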

The Basics

The XSLT tool is driven by a properties file which provides a few key pieces of information:

  • The name of the XSLT file to use.
  • The names of custom PHP functions and classes that will be called by the XSLT file.
  • Any custom values you want to pass in as parameters to the XSLT file (e.g. local institution names, ID prefixes, etc.)

You can see an example properties file here. The comments in this example file explain the available settings.
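As a rough illustration of those three kinds of settings, the sketch below writes a small properties file. The section and key names are assumptions modeled on the layout of VuFind®'s bundled example files, and the file names and values are hypothetical; confirm everything against the comments in the example file before relying on it.

```shell
# Write a hypothetical properties file (key names are assumptions; verify
# them against the comments in VuFind's example properties file).
cat > mySettings.properties <<'EOF'
[General]
; Name of the XSLT file to apply (hypothetical file name)
xslt = myTransform.xsl
; PHP function the XSLT file is allowed to call
php_function[] = str_replace
; Custom PHP class whose methods the XSLT file can call
custom_class[] = VuFind

[Parameters]
; Custom values passed to the XSLT file as parameters
institution = "My University"
collection = "Local XML"
EOF

# Show the result.
cat mySettings.properties
```

Keeping local values such as the institution name in the parameters section means the same XSLT file can be reused without editing the stylesheet itself.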

Once a properties file is set up, you can import an XML file by switching to the import subdirectory of your VuFind® installation and typing:

php import-xsl.php myFile.xml mySettings.properties

(substituting the appropriate XML and properties files as needed).

Note on Local Overrides

You do not need to provide the full path to the properties file: the tool first searches your local settings directory, then falls back to the core default file if no customized copy is found. You can also override the XSLT file by placing your own version in your local settings directory.

Troubleshooting

The import-xsl.php tool supports a --test-only switch which will show you the result of the XSLT transformation without actually loading any data into your index. This can be helpful for testing and troubleshooting, since it lets you see exactly what would be sent to Solr. To use this, simply insert --test-only (surrounded by spaces) between import-xsl.php and myFile.xml in the example above.
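Put together, a dry run looks like the sketch below. The file names are the placeholders from the import example earlier on this page; the guard around the call is only there so the snippet does nothing harmful when run outside the import subdirectory.

```shell
# Placeholder file names from the import example; substitute your own.
XML_FILE="myFile.xml"
PROPERTIES_FILE="mySettings.properties"

# The switch goes between the script name and the XML file.
TEST_CMD="php import-xsl.php --test-only $XML_FILE $PROPERTIES_FILE"

if [ -f import-xsl.php ]; then
  # Shows the transformed output without loading anything into the index.
  php import-xsl.php --test-only "$XML_FILE" "$PROPERTIES_FILE"
else
  echo "Run this from the import subdirectory: $TEST_CMD"
fi
```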

Full Text

VuFind®'s XSLT tool includes support for extracting full text from external documents (PDF, Word, etc.). In order to take advantage of this, you need to install and configure a full-text extraction tool.

For an example of full text extraction in action in VuFind®, see the full text settings near the bottom of the VuDL Sample XSLT File.

Batch Importing

If you need to load a number of XML files at once, you can place them in a subdirectory under the harvest subdirectory of your VuFind® installation and use the batch-import-xsl.sh script to import them all. This is commonly used in combination with OAI-PMH harvesting.
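A batch run might look like the sketch below. The subdirectory name is hypothetical, and both the argument order (harvest subdirectory first, then properties file) and the script's location in the harvest subdirectory are assumptions; check the usage message inside batch-import-xsl.sh for your VuFind® version before running it.

```shell
# Hypothetical names; substitute your own subdirectory and properties file.
HARVEST_SUBDIR="my_source"
PROPERTIES_FILE="mySettings.properties"

# Assumed argument order: harvest subdirectory, then properties file.
BATCH_CMD="./batch-import-xsl.sh $HARVEST_SUBDIR $PROPERTIES_FILE"

if [ -x ./batch-import-xsl.sh ]; then
  # Imports every XML file found in the named subdirectory.
  ./batch-import-xsl.sh "$HARVEST_SUBDIR" "$PROPERTIES_FILE"
else
  echo "Script not found here; the command would be: $BATCH_CMD"
fi
```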

You can learn more about XML indexing through the Indexing XML Records video.

indexing/xml.txt · Last modified: 2023/11/02 13:16 by demiankatz