Warning: This page has not been updated in over a year and may be outdated or deprecated.

VuFind® / Koha Automation

Thanks to Mariyapillai Jayakananthan and Mohan Pradhan for developing this documentation.

Auto-incremental harvesting and indexing

Instead of using manual harvesting, it is possible to implement incremental auto-harvesting and indexing. The details below outline the process.

Some notes:

  • In examples, it is assumed that your oai.ini configuration includes a [Koha] section for harvesting records from Koha. You should change the OAI_KOHA_SOURCE variable if you use a different name.
  • The example script assumes that VuFind®'s environment variables are defined in /etc/profile.d/vufind.sh (the default location set up by the .deb package installation). If your setup is different, adjustments may be needed.
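
Both assumptions in the notes above can be checked before any harvesting starts. The following is a minimal sketch, not part of VuFind® itself: the function names has_ini_section and preflight are hypothetical, and the location $VUFIND_LOCAL_DIR/harvest/oai.ini is an assumption that you should adjust if your oai.ini lives elsewhere.

```shell
#!/bin/bash
# Hypothetical pre-flight helpers; these names are illustrative only.

# Succeeds if the INI file contains a line that is exactly "[section]".
has_ini_section() (
    ini_file=$1
    section=$2
    [ -r "$ini_file" ] || exit 1
    # -F treats the brackets literally, -x requires a whole-line match
    grep -Fxq "[$section]" "$ini_file"
)

# Succeeds only if the environment variables are set and the named
# harvest section exists in oai.ini.
preflight() (
    source_name=$1
    if [ -z "${VUFIND_HOME:-}" ] || [ -z "${VUFIND_LOCAL_DIR:-}" ]; then
        echo "VUFIND_HOME and VUFIND_LOCAL_DIR must be set" >&2
        exit 1
    fi
    # Assumed location of oai.ini; change this path if yours differs
    if ! has_ini_section "$VUFIND_LOCAL_DIR/harvest/oai.ini" "$source_name"; then
        echo "No [$source_name] section found in oai.ini" >&2
        exit 1
    fi
)

# Possible use in a harvest script, after sourcing /etc/profile.d/vufind.sh:
#   preflight "$OAI_KOHA_SOURCE" || exit 1
```

Because the match is exact, a section header with trailing whitespace would not be detected; loosen the grep pattern if your file is formatted that way.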

Script for incremental OAI-PMH harvesting and indexing with VuFind®

Create a script with the filename harvest.sh. You can put this script anywhere you like; $VUFIND_LOCAL_DIR/cron might be a good choice. We will use that for example purposes on this page.

#!/bin/bash
# Set up necessary environment variables
export PATH=/bin:/usr/bin:/usr/local/bin
source /etc/profile.d/vufind.sh
 
# Koha source defined in oai.ini; change as needed
OAI_KOHA_SOURCE=Koha
 
# Harvest new records:
php $VUFIND_HOME/harvest/harvest_oai.php $OAI_KOHA_SOURCE
 
# Process harvested Koha records:
$VUFIND_HOME/harvest/batch-import-marc.sh $OAI_KOHA_SOURCE
$VUFIND_HOME/harvest/batch-delete.sh $OAI_KOHA_SOURCE
 
# Rebuild the alphabetic browse index
$VUFIND_HOME/index-alphabetic-browse.sh

Make sure the file is owned by an appropriate user. It is generally best to reserve a specific account for VuFind®-related processes (see creating a system account for VuFind® for details). If necessary, change the file ownership, e.g.: chown vufind:vufind harvest.sh.

Make the file executable by running this command: chmod u+x harvest.sh.

When ready, run the script from the directory where it resides, logged in to an account with appropriate permissions: ./harvest.sh.

Auto-harvesting via cronjob

You may wish to set up a cronjob for auto-harvesting and adding records in VuFind®.

To do so, place the auto-harvesting script file described above into the $VUFIND_LOCAL_DIR/cron directory, switch to the account that runs VuFind®-related processing and run crontab -e to edit the appropriate crontab file.

You should then add a line similar to:

15 20 * * * /usr/local/vufind/local/cron/harvest.sh 

(This example runs the script every day at 8:15pm; see Using Cron for more details on how this works).
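
A job run from cron has no terminal attached, so it can be useful to keep a record of what each run did. One possible approach, sketched below, is a small wrapper that appends a command's output and exit code to a log file. The function name run_logged and the log path /var/log/vufind/harvest.log are hypothetical; the chosen log file must be writable by the account that owns the crontab.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command, appending its output (stdout and
# stderr) plus start/finish markers to a log file. Returns the command's
# own exit code so callers can react to failures.
run_logged() (
    log_file=$1
    shift
    status=0
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') start: $*"
        "$@" || status=$?
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') finished with exit code $status"
    } >> "$log_file" 2>&1
    exit "$status"
)

# Possible use inside harvest.sh (the log path is an assumption):
#   LOG=/var/log/vufind/harvest.log
#   run_logged "$LOG" php "$VUFIND_HOME/harvest/harvest_oai.php" "$OAI_KOHA_SOURCE"
```

A simpler alternative is to redirect the output of the whole script in the crontab line itself by appending something like >> /var/log/vufind/harvest.log 2>&1 to the command.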

Incorporating additional sources

Your VuFind® instance may harvest records from multiple sources. In that case, you can add further harvesting and indexing steps to the script.

Koha + DSpace example

For example, if you ingest DSpace records along with Koha records, you could modify the script to something like this:

#!/bin/bash
# Set up necessary environment variables
export PATH=/bin:/usr/bin:/usr/local/bin
source /etc/profile.d/vufind.sh
 
# The names of the harvest sections in oai.ini; change as needed:
OAI_DSPACE_SOURCE=DSpace
OAI_KOHA_SOURCE=Koha
 
# Harvest new records (do not specify a harvest source in order to harvest all sources):
php $VUFIND_HOME/harvest/harvest_oai.php
 
# Process harvested DSpace records:
$VUFIND_HOME/harvest/batch-import-xsl.sh $OAI_DSPACE_SOURCE dspace.properties
$VUFIND_HOME/harvest/batch-delete.sh $OAI_DSPACE_SOURCE
 
# Process harvested Koha records:
$VUFIND_HOME/harvest/batch-import-marc.sh $OAI_KOHA_SOURCE
$VUFIND_HOME/harvest/batch-delete.sh $OAI_KOHA_SOURCE
 
# Rebuild the alphabetic browse index
$VUFIND_HOME/index-alphabetic-browse.sh
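
When several MARC sources share the same import settings, the repeated import/delete pairs could be folded into a loop. This is only a sketch: the function name process_marc_sources and the section names in the usage comment are hypothetical, and sources that need extra arguments (such as a properties file for batch-import-xsl.sh) would still need their own lines.

```shell
#!/bin/bash
# Hypothetical helper: for each named harvest source, import the harvested
# MARC records and then process the harvested deletions, in that order.
process_marc_sources() (
    for source_name in "$@"; do
        "$VUFIND_HOME/harvest/batch-import-marc.sh" "$source_name"
        "$VUFIND_HOME/harvest/batch-delete.sh" "$source_name"
    done
)

# Possible use in harvest.sh, after harvest_oai.php has run
# (the section names here are illustrative):
#   process_marc_sources Koha_Main Koha_Branch
```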

Multiple Koha instances example

As another example, if you harvest from multiple Koha instances using different configuration files for each instance, your script might look like this:

#!/bin/bash
# Set up necessary environment variables
export PATH=/bin:/usr/bin:/usr/local/bin
source /etc/profile.d/vufind.sh
 
# Harvest new records (do not specify a harvest source in order to harvest all sources):
php $VUFIND_HOME/harvest/harvest_oai.php
 
# Process harvested Koha records from two sources -- this example uses the names "Pasantha"
# and "Kohulan" but of course these should be changed to your local instance names in practice.
# You could repeat the import/delete lines for any number of sources, as long as each pair uses
# an appropriate configuration file and directory name. Note that if you are applying a different
# ID prefix to each source, additional work may be needed to pre-process deleted record IDs so
# that batch-delete.sh works as expected.
$VUFIND_HOME/harvest/batch-import-marc.sh -p /usr/local/vufind/local/import/import-Pasantha.properties Koha_Pasantha
$VUFIND_HOME/harvest/batch-delete.sh Koha_Pasantha
$VUFIND_HOME/harvest/batch-import-marc.sh -p /usr/local/vufind/local/import/import-Kohulan.properties Koha_Kohulan
$VUFIND_HOME/harvest/batch-delete.sh Koha_Kohulan
 
# Rebuild the alphabetic browse index
$VUFIND_HOME/index-alphabetic-browse.sh
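
The script comments above mention that a per-source ID prefix may require pre-processing of deleted record IDs. One way this might be done is sketched below, assuming that harvested deletions are stored as *.delete files with one record ID per line inside each source's harvest directory. The function name prefix_delete_ids, the directory layout and the "pasantha." prefix are all hypothetical; check them against your own setup before relying on this.

```shell
#!/bin/bash
# Hypothetical helper: prepend PREFIX to every record ID in a deletion list
# (assumed format: one ID per line). Blank lines are dropped, and lines that
# already start with the prefix are left alone, so a second run is harmless.
prefix_delete_ids() (
    prefix=$1
    file=$2
    tmp_file=$(mktemp) || exit 1
    awk -v p="$prefix" '
        NF == 0 { next }                      # skip blank lines
        index($0, p) == 1 { print; next }     # already prefixed
        { print p $0 }                        # add the prefix
    ' "$file" > "$tmp_file" &&
        cat "$tmp_file" > "$file" &&          # overwrite in place, keeping permissions
        rm -f "$tmp_file"
)

# Possible use just before batch-delete.sh Koha_Pasantha:
#   for f in "$VUFIND_LOCAL_DIR"/harvest/Koha_Pasantha/*.delete; do
#       [ -e "$f" ] && prefix_delete_ids "pasantha." "$f"
#   done
```

Note that an unprefixed ID which happens to begin with the prefix text would be skipped, so choose a prefix that cannot collide with real record IDs.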
administration/automation/koha.txt · Last modified: 2023/05/03 16:40 by demiankatz