VuFind was initially designed with the MARC bibliographic record format in mind, though additional formats are supported through the use of Record Drivers starting with version 1.0. For general information on MARC, see Understanding MARC Bibliographic from the Library of Congress. The Code4Lib Working with MARC page also provides some useful tools.
VuFind comes packaged with the SolrMarc tool for importing MARC records. Follow these steps to take advantage of it.
Before you can load the records into VuFind, you need to get them out of your Integrated Library System (ILS). If you are just testing VuFind, you can also download sample records from sources listed lower on this page.
Every ILS has a different procedure for exporting records, and detailing all of them is beyond the scope of this document. Check your ILS documentation or talk to your vendor if you need help. You can also check the MARC Export Notes page to see if there are notes specific to your ILS; please consider adding to the page if you have knowledge to share. If you still need help, you can always ask on the mailing lists on the Support page – the VuFind community is always happy to help when it can.
Keep these notes in mind to ensure that your records can be imported without any problems:
Export your records in binary (ISO2709) MARC format, not human-readable ASCII. If for some reason you cannot export the records in binary form, you can use a tool like yaz-marcdump from the YAZ toolkit to convert one MARC format to another.
Make sure your resulting file has a ".mrc" extension. Most versions of SolrMarc require this extension, so it is a good practice to use it just to be on the safe side.
Each exported record must contain a unique identifier so that VuFind can tell it apart from the others. We recommend including your ILS's bibliographic record ID in the exported data for this purpose; you may need to add a special configuration option to your ILS's exporter to make this happen. VuFind's importer expects to find the unique ID in the 001 field, but you can customize this by editing the marc.properties file (for more details, see Customizing Import Mappings).
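The extension and format notes above can be checked with a small script before an import. The following is a minimal sketch, not part of VuFind: the function names has_mrc_extension and marcxml_to_mrc are hypothetical, and the conversion assumes that your non-binary export is MARCXML and that yaz-marcdump from the YAZ toolkit is on your PATH.

```shell
# Hypothetical helper: succeeds only if the file name ends in .mrc
has_mrc_extension() {
    case "$1" in
        *.mrc) return 0 ;;
        *)     return 1 ;;
    esac
}

# Hypothetical helper: convert a MARCXML file to binary (ISO2709) MARC.
# Usage: marcxml_to_mrc input.xml output.mrc
marcxml_to_mrc() {
    if ! command -v yaz-marcdump >/dev/null 2>&1; then
        echo "yaz-marcdump not found; install the YAZ toolkit first" >&2
        return 1
    fi
    # -i selects the input format, -o the output format
    yaz-marcdump -i marcxml -o marc "$1" > "$2"
}
```

For example, `has_mrc_extension catdump.mrc || echo "rename the file first"` would warn about a badly named file before you start a long import.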
To begin an import, follow the platform-specific instructions listed below. This may take hours or days for very large data sets!
On Linux, switch to your VuFind installation directory and run:
./import-marc.sh your_records_file.mrc
Note: In versions of VuFind prior to 1.0RC2, import-marc.sh was named import.sh.
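If your export produced several files, a small loop can pass them to the importer one at a time. This is only a sketch under assumptions: import_all is a hypothetical helper rather than a VuFind script, and it assumes you run it from the VuFind installation directory so that the default ./import-marc.sh path resolves.

```shell
# Hypothetical helper: import every .mrc file in a directory, one by one.
# Usage: import_all /path/to/exports [path/to/import-marc.sh]
import_all() {
    dir=$1
    importer=${2:-./import-marc.sh}   # default assumes the VuFind directory
    for f in "$dir"/*.mrc; do
        [ -e "$f" ] || continue       # skip when the glob matches nothing
        echo "Importing $f"
        # Stop at the first failure so a bad file is not silently skipped
        "$importer" "$f" || { echo "Import failed for $f" >&2; return 1; }
    done
}
```

Stopping on the first failure is a design choice; for unattended runs you may prefer to log the failure and continue with the next file.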
On Windows, switch to your VuFind installation directory and run:
import-marc.bat your_records_file.mrc
Note: prior to VuFind 1.0RC2, import-marc.bat was not available, and it was necessary to run SolrMarc manually:
java -Xms256m -Xmx256m -Dsolr.core.name=biblio -Dsolrmarc.path=C:/vufind/import -Dsolr.path=C:/vufind/solr -Dmarc.path=c:/vufind/import/catdump.mrc -jar c:/vufind/import/dist/MarcImporter.jar c:/vufind/import/import.properties
(thanks to mike_beccaria)
If the imported records do not show up in VuFind immediately, you will have to restart VuFind.
For improved performance (and, if applicable, correct spellchecker behavior), it is a good idea to optimize your Solr index after you import records.
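One generic way to trigger an optimize is through Solr's update handler, which accepts an optimize=true parameter. The sketch below makes several assumptions: Solr is reachable at http://localhost:8080/solr (adjust to your installation), the core is named biblio (the core name used in the manual SolrMarc command on this page), curl is installed, and your Solr version accepts this request over a plain GET. The function names are hypothetical.

```shell
# Hypothetical helper: build the optimize URL for a Solr core
# Usage: solr_optimize_url BASE_URL CORE
solr_optimize_url() {
    echo "$1/$2/update?optimize=true"
}

# Hypothetical helper: send the optimize request
# (the default URL and core name are assumptions)
optimize_index() {
    base=${1:-http://localhost:8080/solr}
    core=${2:-biblio}
    # -f: fail on HTTP errors, -sS: quiet but still report errors
    curl -fsS "$(solr_optimize_url "$base" "$core")"
}
```

Calling `optimize_index` with no arguments uses the assumed defaults; pass your own base URL and core name if they differ.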
See the SolrMarc page for more details on how you can customize the behavior of the import process to meet your needs.
This section is for listing sources of binary MARC records helpful for testing purposes if you want to try VuFind without using your own records:
Starting with VuFind 1.0.1, a simple tool is included for harvesting records using the OAI-PMH protocol.
To set up OAI-PMH harvesting, simply edit the oai.ini file in the harvest subdirectory of your VuFind installation. You can set up one or more OAI-PMH repositories here – details are included in comments within the file.
Once OAI-PMH is configured, you can follow these steps to get documents from an OAI-PMH repository into your VuFind index:
Run the harvester by switching to the harvest subdirectory of your VuFind installation and running "php harvest_oai.php". If you configured multiple repositories and want to harvest from just one, you can add the name of the repository (as specified in a section header in oai.ini) as a parameter to limit your harvesting.
For each OAI-PMH repository you harvested, a number of files will have been created in a subdirectory of harvest whose name matches the appropriate section of the oai.ini configuration file.
Run the ./batch-delete.sh file (with a harvest subdirectory name as a parameter) to remove any records from your index that have been reported as deleted by the OAI-PMH server.
Run the ./batch-import-marc.sh file (with a harvest subdirectory name as a parameter) to index all MARC records harvested from an OAI-PMH server.
After all deleted and new records have been processed, the records retrieved from the OAI-PMH server will have been moved to a “processed” subdirectory of their containing directory. You can periodically clear out this directory if you no longer feel you need to retain records. However, it may be useful to keep them, since you can always move them back up a directory level and re-run the batch processing scripts in order to reindex everything.
A “last_harvest.txt” file is created in each OAI-PMH harvest directory to keep track of the most recent harvest. This allows subsequent harvest operations to pick up where previous ones left off. To reindex all records, you can simply delete this file. Note that it is normal for some duplicate records to be retrieved on subsequent harvests – new harvests overlap slightly with the previous set in order to ensure that nothing is missed.
It should be possible to automate this process using a top-level script and cron job in order to do a nightly harvest/index operation.
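A top-level script of this kind might look like the following sketch. It is not shipped with VuFind: harvest_and_index and run are hypothetical names, and the sketch assumes that batch-delete.sh and batch-import-marc.sh can be run from the harvest directory; adjust the paths if they live elsewhere in your installation. Setting DRY_RUN=1 prints the commands instead of executing them.

```shell
# Print the command when DRY_RUN=1, otherwise execute it
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: harvest one repository, then process deletes and imports.
# Usage: harvest_and_index /path/to/vufind/harvest repository_name
harvest_and_index() {
    harvest_dir=$1
    repo=$2
    (
        cd "$harvest_dir" || exit 1
        # Each step runs only if the previous one succeeded
        run php harvest_oai.php "$repo" &&
        run ./batch-delete.sh "$repo" &&
        run ./batch-import-marc.sh "$repo"
    )
}
```

Saved as a script that calls harvest_and_index, this could be scheduled with a crontab entry such as `0 2 * * * /path/to/nightly-harvest.sh`, where both the schedule and the script path are placeholders.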
Processing a large number of MARC files is currently very slow, since records are processed one file at a time. It may be worth developing a new tool to merge all the MARC records into a single file as an intermediate step before indexing them.
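For files that are already in binary (ISO2709) MARC format, merging can be as simple as concatenation, because a binary MARC file is just a sequence of self-delimiting records. The sketch below assumes exactly that; it does not apply to XML records, which would first need converting to binary MARC. merge_mrc is a hypothetical name, and the output file should be written outside the source directory so that it is not merged into itself.

```shell
# Hypothetical helper: concatenate all binary MARC files in a directory.
# Usage: merge_mrc /path/to/source_dir /path/to/merged.mrc
merge_mrc() {
    src_dir=$1
    out_file=$2
    : > "$out_file"                   # start with an empty output file
    for f in "$src_dir"/*.mrc; do
        [ -e "$f" ] || continue       # skip when the glob matches nothing
        cat "$f" >> "$out_file"
    done
}
```

The merged file can then be passed to the importer in a single run instead of once per file.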
The batch processing scripts are currently available for Linux only; however, it should be possible to develop Windows versions if necessary.
OAI-PMH may provide formats other than MARC. Tools are currently in development to help index such records. Feel free to post on vufind-tech if you want to participate in development of these tools.
Automation - Notes on automating VuFind, including how to regularly load the latest records.
Re-indexing - How to clear out your index if you want to start over.