VuFind® / Koha Automation
Thanks to Mariyapillai Jayakananthan and Mohan Pradhan for developing this documentation.
Auto-incremental harvesting and indexing
Instead of using manual harvesting, it is possible to implement incremental auto-harvesting and indexing. The details below outline the process.
Some notes:
- The examples assume that your oai.ini configuration includes a [Koha] section for harvesting records from Koha. Change the OAI_KOHA_SOURCE variable if your section uses a different name.
- The example script assumes that VuFind®'s environment variables are defined in /etc/profile.d/vufind.sh (the default location set up by the .deb package installation). If your setup is different, adjustments may be needed.
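Because the scripts on this page depend on those environment variables, it can help to confirm that they are actually defined before harvesting starts. The following is a minimal sketch, not part of VuFind® itself: the function name require_env is hypothetical, and it checks exported variables only, since child processes such as php see nothing else.

```shell
#!/bin/bash
# Hypothetical helper (not shipped with VuFind): report every required
# environment variable that is unset, empty, or not exported.
require_env() {
    local name missing=0
    for name in "$@"; do
        # printenv prints nothing for a variable that is not in the environment
        if [ -z "$(printenv "$name")" ]; then
            echo "Missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example use near the top of harvest.sh, after sourcing the environment file:
#   require_env VUFIND_HOME VUFIND_LOCAL_DIR || exit 1
```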
Script for incremental OAI-PMH harvesting and indexing with VuFind®
Create a script with the filename harvest.sh. You can put this script anywhere you like; $VUFIND_LOCAL_DIR/cron might be a good choice. We will use that for example purposes on this page.
    #!/bin/bash

    # Set up necessary environment variables
    export PATH=/bin:/usr/bin:/usr/local/bin
    source /etc/profile.d/vufind.sh

    # Koha source defined in oai.ini; change as needed
    OAI_KOHA_SOURCE=Koha

    # Harvest new records:
    php $VUFIND_HOME/harvest/harvest_oai.php $OAI_KOHA_SOURCE

    # Process harvested Koha records:
    $VUFIND_HOME/harvest/batch-import-marc.sh $OAI_KOHA_SOURCE
    $VUFIND_HOME/harvest/batch-delete.sh $OAI_KOHA_SOURCE

    # Rebuild index
    $VUFIND_HOME/index-alphabetic-browse.sh
Make sure the file is owned by an appropriate user (generally, it is best to have a specific account reserved for VuFind®-related processes; see creating a system account for VuFind® for details). If necessary, change the file ownership, e.g.: chown vufind:vufind harvest.sh
Make the file executable by running this command: chmod u+x harvest.sh
When ready, run the script from the directory where it resides, logged in to an account with appropriate permissions: ./harvest.sh
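The script above runs every step unconditionally, so an import still runs after a failed harvest. One possible refinement is sketched below: a hypothetical run_step helper (not part of VuFind®) that logs each step with a timestamp and passes the step's exit code back to the caller. It assumes that the VuFind® scripts signal failure through a non-zero exit code, which is worth verifying for your version before relying on it.

```shell
#!/bin/bash
# Hypothetical helper: run one pipeline step, log its start and outcome
# with a timestamp, and return the step's own exit code.
run_step() {
    local label=$1
    shift
    echo "$(date '+%Y-%m-%d %H:%M:%S') START $label"
    if "$@"; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') OK    $label"
    else
        local status=$?
        echo "$(date '+%Y-%m-%d %H:%M:%S') FAIL  $label (exit $status)" >&2
        return "$status"
    fi
}

# Example use inside harvest.sh, stopping at the first failed step:
#   run_step "harvest" php "$VUFIND_HOME/harvest/harvest_oai.php" "$OAI_KOHA_SOURCE" || exit 1
#   run_step "import" "$VUFIND_HOME/harvest/batch-import-marc.sh" "$OAI_KOHA_SOURCE" || exit 1
```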
Auto-harvesting via cronjob
You may wish to set up a cronjob for auto-harvesting and adding records in VuFind®.
To do so, place the auto-harvesting script file described above into the $VUFIND_LOCAL_DIR/cron directory, switch to the account that runs VuFind®-related processing, and run crontab -e to edit the appropriate crontab file.
You should then add a line similar to:
15 20 * * * /usr/local/vufind/local/cron/harvest.sh
(This example runs the script every day at 8:15 pm: the first field is the minute, 15, and the second is the hour, 20. See Using Cron for more details on how this works.)
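With a scheduled job, a slow harvest could still be running when the next one starts. The sketch below shows one way to guard against overlapping runs, using a lock directory. The function name run_exclusive, the lock path in the usage comment, and the return code 75 for a skipped run are all illustrative choices, not VuFind® conventions.

```shell
#!/bin/bash
# Hypothetical helper: run a command only if no other run holds the lock.
# mkdir is atomic, so only one process can create the lock directory.
run_exclusive() {
    local lock_dir=$1
    shift
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "Another run appears to be in progress ($lock_dir exists); skipping." >&2
        return 75   # arbitrary non-zero code used here to mean "skipped"
    fi
    local status=0
    "$@" || status=$?
    rmdir "$lock_dir"
    return "$status"
}

# Example use: wrap the body of harvest.sh in a function called main, then:
#   run_exclusive /tmp/vufind-harvest.lock main
```

If a run is killed before it can remove the lock directory, the directory stays behind and has to be deleted by hand before the next run will proceed.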
Incorporating additional sources
Your VuFind® instance may harvest records from multiple sources. In that case, you can add further harvesting and indexing steps to the script.
Koha + DSpace example
For example, if you ingest DSpace records along with Koha records, you could modify the script to something like this:
    #!/bin/bash

    # Set up necessary environment variables
    export PATH=/bin:/usr/bin:/usr/local/bin
    source /etc/profile.d/vufind.sh

    # The names of the harvest sections in oai.ini; change as needed:
    OAI_DSPACE_SOURCE=DSpace
    OAI_KOHA_SOURCE=Koha

    # Harvest new records (do not specify a harvest source in order to harvest all sources):
    php $VUFIND_HOME/harvest/harvest_oai.php

    # Process harvested DSpace records:
    $VUFIND_HOME/harvest/batch-import-xsl.sh $OAI_DSPACE_SOURCE dspace.properties
    $VUFIND_HOME/harvest/batch-delete.sh $OAI_DSPACE_SOURCE

    # Process harvested Koha records:
    $VUFIND_HOME/harvest/batch-import-marc.sh $OAI_KOHA_SOURCE
    $VUFIND_HOME/harvest/batch-delete.sh $OAI_KOHA_SOURCE

    # Rebuild index
    $VUFIND_HOME/index-alphabetic-browse.sh
Multiple Koha instances example
As another example, if you harvest from multiple Koha instances using different configuration files for each instance, your script might look like this:
    #!/bin/bash

    # Set up necessary environment variables
    export PATH=/bin:/usr/bin:/usr/local/bin
    source /etc/profile.d/vufind.sh

    # Harvest new records (do not specify a harvest source in order to harvest all sources):
    php $VUFIND_HOME/harvest/harvest_oai.php

    # Process harvested Koha records from two sources -- this example uses the names "Pasantha"
    # and "Kohulan" but of course these should be changed to your local instance names in practice.
    # You could repeat the import/delete lines for any number of sources, as long as each pair uses
    # an appropriate configuration file and directory name. Note that if you are applying a different
    # ID prefix to each source, additional work may be needed to pre-process deleted record IDs so
    # that batch-delete.sh works as expected.
    $VUFIND_HOME/harvest/batch-import-marc.sh -p /usr/local/vufind/local/import/import-Pasantha.properties Koha_Pasantha
    $VUFIND_HOME/harvest/batch-delete.sh Koha_Pasantha
    $VUFIND_HOME/harvest/batch-import-marc.sh -p /usr/local/vufind/local/import/import-Kohulan.properties Koha_Kohulan
    $VUFIND_HOME/harvest/batch-delete.sh Koha_Kohulan

    # Rebuild index
    $VUFIND_HOME/index-alphabetic-browse.sh
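When the number of Koha instances grows, the repeated import/delete pairs can be expressed as a loop. The following sketch rests on the naming pattern used in the example above: each instance NAME has a properties file called import-NAME.properties and a harvest directory called Koha_NAME. The function name process_koha_instances is hypothetical, and the sketch assumes that $VUFIND_LOCAL_DIR points to /usr/local/vufind/local.

```shell
#!/bin/bash
# Sketch: import and delete records for each Koha instance in turn.
# Assumes VUFIND_HOME and VUFIND_LOCAL_DIR are set, that each instance NAME
# has a properties file at $VUFIND_LOCAL_DIR/import/import-NAME.properties,
# and that its harvest directory is called Koha_NAME.
process_koha_instances() {
    local name
    for name in "$@"; do
        "$VUFIND_HOME/harvest/batch-import-marc.sh" \
            -p "$VUFIND_LOCAL_DIR/import/import-$name.properties" "Koha_$name"
        "$VUFIND_HOME/harvest/batch-delete.sh" "Koha_$name"
    done
}

# Example use in place of the four import/delete lines:
#   process_koha_instances Pasantha Kohulan
```

The caveat about ID prefixes from the script comments applies here as well: the loop does not pre-process deleted record IDs.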