VuFind was initially designed with the MARC bibliographic record format in mind, though additional formats are supported through the use of Record Drivers starting with version 1.0. For general information on MARC, see Understanding MARC Bibliographic from the Library of Congress. The Code4Lib Working with MARC page also provides some useful tools.
VuFind comes packaged with the SolrMarc tool for importing MARC records. Follow these steps to take advantage of it.
Before you can load the records into VuFind, you need to get them out of your Integrated Library System (ILS). If you are just testing VuFind, you can also download sample records from sources listed lower on this page.
Every ILS has a different procedure for exporting records, and detailing all of them is beyond the scope of this document. Check your ILS documentation or talk to your vendor if you need help. You can also check the MARC Export Notes page to see if there are notes specific to your ILS; please consider adding to the page if you have knowledge to share. If you still need help, you can always ask on the mailing lists on the Support page – the VuFind community is always happy to help when it can.
Keep these notes in mind to ensure that your records can be imported without any problems:
The import tool relies on settings in import/import.properties. If this is the first time you are indexing, make sure that file paths and URLs in this file are correct for your setup. For more details on what everything means, see the SolrMarc documentation.
To begin an import, follow the platform-specific instructions listed below. This may take hours or days for very large data sets!
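Under Linux, switch to your VuFind installation directory and run import-marc.sh, passing the path to your MARC file as the argument. For example:
./import-marc.sh your_records.mrc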
Note: In versions of VuFind prior to 1.0RC2, import-marc.sh was named import.sh.
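Under Windows, switch to your VuFind installation directory and run import-marc.bat, passing the path to your MARC file as the argument. For example:
import-marc.bat your_records.mrc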
Note: prior to VuFind 1.0RC2, import-marc.bat was not available, and it was necessary to run SolrMarc manually:
java -Xms256m -Xmx256m -Dsolr.core.name=biblio -Dsolrmarc.path=C:/vufind/import -Dsolr.path=C:/vufind/solr -Dmarc.path=c:/vufind/import/catdump.mrc -jar c:/vufind/import/dist/MarcImporter.jar c:/vufind/import/import.properties
(thanks to mike_beccaria)
The following optional feature was introduced after the release of VuFind 1.0.1. If you want to take advantage of it without upgrading the rest of VuFind, you can download updated scripts from the trunk here.
In both Linux and Windows, you can use the optional "-p" switch to override SolrMarc's default import.properties file with a different file. For example:
./import-marc.sh -p /usr/local/vufind/import/custom.properties your_records.mrc
This may be useful if you need to import different sets of records using different mappings.
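As a rough sketch of that workflow, the wrapper below pairs each MARC file with its own properties file and passes both to import-marc.sh through the -p switch. The installation path, the file names, and the DRY_RUN variable are assumptions made for illustration only. With DRY_RUN=1 (the default here) the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch: import several record sets, each with its own SolrMarc properties file.
# VUFIND_HOME, the file names, and DRY_RUN are illustrative assumptions.
VUFIND_HOME="${VUFIND_HOME:-/usr/local/vufind}"
DRY_RUN="${DRY_RUN:-1}"

import_set() {
    # $1 = properties file, $2 = MARC file
    if [ "$DRY_RUN" = "1" ]; then
        # Only show the command that would be run.
        echo "./import-marc.sh -p $1 $2"
    else
        # Run from the VuFind installation directory; a relative MARC path
        # is therefore resolved against VUFIND_HOME.
        (cd "$VUFIND_HOME" && ./import-marc.sh -p "$1" "$2")
    fi
}

import_set "$VUFIND_HOME/import/custom.properties" books.mrc
import_set "$VUFIND_HOME/import/journals.properties" journals.mrc
```

Setting DRY_RUN=0 in the environment would run the imports one after another.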
Starting with VuFind 1.1, it is also possible to import authority records into VuFind's separate authority index (see the authority control page for more details). A special tool (import-marc-auth.sh under Linux, import-marc-auth.bat under Windows) is provided to help with this. This works exactly like the standard import-marc script, except the SolrMarc settings are found in import/import_auth.properties, the default MARC mappings are found in import/marc_auth.properties, and you may provide a second parameter after the MARC filename to specify a set of additional MARC mappings to override the defaults in marc_auth.properties.
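The two forms of the authority import command described above can be sketched as follows. The helper only builds and prints the command line. The MARC file name and the override mappings file name are hypothetical.

```shell
#!/bin/sh
# Sketch: build the authority import command line.
# The MARC file name and the override mappings file are hypothetical.
auth_import_cmd() {
    # $1 = MARC authority file
    # $2 = optional mappings file that overrides marc_auth.properties
    echo "./import-marc-auth.sh $1${2:+ $2}"
}

# Default mappings only.
auth_import_cmd authority.mrc
# Default mappings plus a local override file.
auth_import_cmd authority.mrc marc_auth_local.properties
```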
Authority data is currently used in two ways: it can be searched through the simple Authority module (found at http://your_server/vufind/Authority/Home), and it provides “see also” and “use instead” references within the index generated by the Alphabetical Heading Browse feature. Additionally, you can activate the Authority Recommend module, which provides search recommendations to users based on a search of the authority index for their current search terms. For example, if a user searches for a known pseudonym, the Authority Recommend module will suggest searching for the registered heading instead.
If you have trouble importing authority records under Windows, the cause may be the classpath settings in some of the .bsh files found in the import/index_scripts subdirectory of your VuFind installation. Try changing the addClassPath("../import"); lines to addClassPath("c:/vufind/import"); where c:/vufind/import is the path to the import subdirectory of your VuFind installation. Note the use of forward slashes: this is acceptable and simplifies escaping issues, even in the Windows environment.
If the imported records do not show up in VuFind immediately, you will have to restart VuFind as described here.
For improved performance (and, if applicable, correct spellchecker behavior), it is a good idea to optimize your Solr index after you import records.
See the SolrMarc page for more details on how you can customize the behavior of the import process to meet your needs.
Starting with VuFind 1.2, it is possible to harvest full text from URLs found in MARC records. This requires that you first install a full-text extraction tool and then uncomment the appropriate fulltext line in import/marc_local.properties. Comments in the property file explain exactly how the functionality works. Full text indexing is disabled by default.
This section lists sources of binary MARC records that are useful for testing if you want to try VuFind without using your own records:
If the data you want to import is not available in MARC format, chances are that you can access it in some flavor of XML. Fortunately, loading XML into VuFind's index is straightforward if you are familiar with the XSLT language – you simply need to translate from the XML format you have available into Solr's XML Message Format, then post the result to the Solr server.
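To make the target format concrete, the sketch below writes a minimal document in Solr's XML Message Format to a file and prints the curl commands that could post and commit it. The Solr URL, the output file name, and the "title" field are assumptions; your schema determines which fields are actually valid. Posting needs a running Solr instance, so the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch: write a minimal Solr XML update message and show how it could be posted.
# The Solr URL, the file name, and the "title" field are assumptions.
SOLR_UPDATE_URL="${SOLR_UPDATE_URL:-http://localhost:8080/solr/biblio/update}"
OUT="${OUT:-solr_add.xml}"

# An <add> message wraps one or more <doc> elements, each made of named fields.
cat > "$OUT" <<'EOF'
<add>
  <doc>
    <field name="id">example-1</field>
    <field name="title">An Example Record</field>
  </doc>
</add>
EOF

# Posting requires a running Solr instance, so the commands are only printed.
echo "curl $SOLR_UPDATE_URL -H 'Content-Type: text/xml' --data-binary @$OUT"
echo "curl $SOLR_UPDATE_URL -H 'Content-Type: text/xml' --data-binary '<commit/>'"
```

An XSLT stylesheet for your source format would need to produce output of this shape.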
IMPORTANT: The XSLT tool described in this section was added in VuFind 1.1. If you are using an earlier version, you will have to upgrade.
VuFind's XSLT tool is designed to make posting XSLT-transformed documents to the Solr index simple while offering flexibility for extending XSLT and applying local customizations.
The XSLT tool is driven by a properties file which provides a few key pieces of information:
You can see an example properties file here. The comments in this example file explain the available settings.
Once a properties file is set up, you can import an XML file by switching to the import subdirectory of your VuFind installation and typing:
php import-xsl.php myFile.xml mySettings.properties
(substituting the appropriate XML and properties files as needed).
VuFind's XSLT tool includes support for extracting full text from external documents (PDF, Word, etc.). In order to take advantage of this, you need to install and configure a full-text extraction tool.
For an example of full text extraction in action in VuFind, see the full text settings near the bottom of the VuDL Sample XSLT File.
If you need to load a number of XML files at once, you can load them into a subdirectory under the harvest subdirectory of your VuFind installation and use the batch-import-xsl.sh script to load them all. This is commonly used in combination with OAI-PMH harvesting (described below).
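The idea behind batch loading can be illustrated with a simple loop that calls import-xsl.php once per file. This is a sketch of the concept, not the batch-import-xsl.sh script itself. The directory name, the properties file name, and the DRY_RUN variable are assumptions, and the loop is meant to be run from the import subdirectory.

```shell
#!/bin/sh
# Sketch: import every XML file in one directory with import-xsl.php.
# This illustrates the idea only; it is not the batch-import-xsl.sh script.
# XML_DIR, PROPS, and DRY_RUN are illustrative assumptions.
XML_DIR="${XML_DIR:-../harvest/my_source}"
PROPS="${PROPS:-mySettings.properties}"
DRY_RUN="${DRY_RUN:-1}"

import_dir() {
    # $1 = directory of XML files, $2 = properties file
    for f in "$1"/*.xml; do
        [ -e "$f" ] || continue   # skip when the glob matches nothing
        if [ "$DRY_RUN" = "1" ]; then
            echo "php import-xsl.php $f $2"
        else
            # Report a failed file but keep going with the rest.
            php import-xsl.php "$f" "$2" || echo "import failed: $f" >&2
        fi
    done
}

import_dir "$XML_DIR" "$PROPS"
```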
Starting with VuFind 1.0.1, a simple tool is included for harvesting records using the OAI-PMH protocol.
To set up OAI-PMH harvesting, simply edit the oai.ini file in the harvest subdirectory of your VuFind installation. You can set up one or more OAI-PMH repositories here – details are included in comments within the file.
Once OAI-PMH is configured, you can follow these steps to get documents from an OAI-PMH repository into your VuFind index:
It should be possible to automate this process using a top-level script and cron job in order to do a nightly harvest/index operation.
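A top-level script for such a nightly job might look like the sketch below. It rests on two assumptions that you should check against your VuFind version: that the harvester is started with "php harvest_oai.php" from the harvest directory, and that batch-import-xsl.sh takes a harvest subdirectory name followed by a properties file. The source name, properties file, script path, and log path are hypothetical.

```shell
#!/bin/sh
# Sketch of a nightly harvest-and-index wrapper.
# Assumptions: the harvester is started with "php harvest_oai.php" from the
# harvest directory, and batch-import-xsl.sh takes a harvest subdirectory
# name followed by a properties file. Verify both for your installation.
VUFIND_HOME="${VUFIND_HOME:-/usr/local/vufind}"
SOURCE_DIR="${SOURCE_DIR:-my_source}"        # hypothetical harvest subdirectory
PROPS="${PROPS:-mySettings.properties}"      # hypothetical properties file
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

nightly() {
    # Stop at the first step that fails.
    run cd "$VUFIND_HOME/harvest" || return 1
    run php harvest_oai.php || return 1
    run ./batch-import-xsl.sh "$SOURCE_DIR" "$PROPS" || return 1
}

nightly

# Example crontab entry (02:00 every night; script and log paths are hypothetical):
# 0 2 * * * /usr/local/vufind/harvest/nightly-harvest.sh >> /var/log/vufind-harvest.log 2>&1
```

Running the script with DRY_RUN=1 first is a cheap way to confirm the paths before scheduling it.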