Thanks to Mariyapillai Jayakananthan and Mohan Pradhan for developing this documentation.
Instead of harvesting manually, you can set up incremental automatic harvesting and indexing. The steps below outline the process.
Some notes:
Create a script with the filename harvest.sh. You can put this script anywhere you like; $VUFIND_LOCAL_DIR/cron might be a good choice. We will use that for example purposes on this page.
#!/bin/bash

# Set up necessary environment variables
export PATH=/bin:/usr/bin:/usr/local/bin
source /etc/profile.d/vufind.sh

# Koha source defined in oai.ini; change as needed
OAI_KOHA_SOURCE=Koha

# Harvest new records:
php $VUFIND_HOME/harvest/harvest_oai.php $OAI_KOHA_SOURCE

# Process harvested Koha records:
$VUFIND_HOME/harvest/batch-import-marc.sh $OAI_KOHA_SOURCE
$VUFIND_HOME/harvest/batch-delete.sh $OAI_KOHA_SOURCE

# Rebuild index
$VUFIND_HOME/index-alphabetic-browse.sh
Make sure the file is owned by an appropriate user (generally, it is best to have a specific account reserved for VuFind®-related processes; see creating a system account for VuFind® for details). If necessary, change file ownership, for example:
chown vufind:vufind harvest.sh
Make the file executable by running this command:
chmod u+x harvest.sh
When ready, you can run the script from the directory where it resides, while logged in to an account with appropriate permissions:
./harvest.sh
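The script above runs every step even if an earlier one fails. The following is a minimal sketch of one way to log each step and stop on failure; the `run_step` helper is a hypothetical name introduced here for illustration and is not part of VuFind®.

```shell
#!/bin/bash

# run_step (hypothetical helper): run a command, log its start and outcome
# with a timestamp, and return the command's exit status.
run_step() {
    local label="$1"
    shift
    echo "$(date '+%Y-%m-%d %H:%M:%S') START $label"
    if "$@"; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') OK $label"
    else
        local status=$?
        echo "$(date '+%Y-%m-%d %H:%M:%S') FAILED $label (exit $status)" >&2
        return "$status"
    fi
}

# Harmless demonstration; in harvest.sh a call might look like this instead:
#   run_step "harvest" php "$VUFIND_HOME/harvest/harvest_oai.php" "$OAI_KOHA_SOURCE" || exit 1
run_step "demo step" true
```

Appending `|| exit 1` to each call makes the script stop at the first failing step, so a failed harvest is not followed by an import of incomplete data.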
You may wish to set up a cronjob for auto-harvesting and adding records in VuFind®.
To do so, place the auto-harvesting script described above in the $VUFIND_LOCAL_DIR/cron directory, switch to the account that runs VuFind®-related processing, and run crontab -e to edit the appropriate crontab file.
You should then add a line similar to:
15 20 * * * /usr/local/vufind/local/cron/harvest.sh
(This example runs the script every day at 8:15 pm; see Using Cron for more details on how this works.)
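When the script runs from cron, a long harvest could still be in progress when the next scheduled run starts. The following is a minimal sketch of a guard against overlapping runs, relying on the fact that `mkdir` fails if the directory already exists. The `with_lock` function and the `LOCK_DIR` path are assumptions made for illustration, not part of VuFind®.

```shell
#!/bin/bash

# Hypothetical lock location; choose a path writable by the VuFind account.
LOCK_DIR="${LOCK_DIR:-/tmp/vufind-harvest.lock}"

# with_lock (hypothetical helper): run a command only if the lock directory
# can be created, then remove the lock and return the command's exit status.
with_lock() {
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "Lock $LOCK_DIR exists; another harvest may be running." >&2
        return 1
    fi
    "$@"
    local status=$?
    rmdir "$LOCK_DIR"
    return "$status"
}

# In harvest.sh, each step (or a function wrapping all steps) could be run as:
#   with_lock php "$VUFIND_HOME/harvest/harvest_oai.php" "$OAI_KOHA_SOURCE"
```

If the script is killed before it removes the lock, the directory remains and must be deleted by hand. On systems that provide it, the flock utility from util-linux is an alternative that avoids this problem.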
Your VuFind® instance may harvest records from multiple sources. In that case, you can add further harvesting and indexing steps to the script.
For example, if you ingest DSpace records along with Koha records, you could modify the script to something like this:
#!/bin/bash

# Set up necessary environment variables
export PATH=/bin:/usr/bin:/usr/local/bin
source /etc/profile.d/vufind.sh

# The names of the harvest sections in oai.ini; change as needed:
OAI_DSPACE_SOURCE=DSpace
OAI_KOHA_SOURCE=Koha

# Harvest new records (do not specify a harvest source in order to harvest all sources):
php $VUFIND_HOME/harvest/harvest_oai.php

# Process harvested DSpace records:
$VUFIND_HOME/harvest/batch-import-xsl.sh $OAI_DSPACE_SOURCE dspace.properties
$VUFIND_HOME/harvest/batch-delete.sh $OAI_DSPACE_SOURCE

# Process harvested Koha records:
$VUFIND_HOME/harvest/batch-import-marc.sh $OAI_KOHA_SOURCE
$VUFIND_HOME/harvest/batch-delete.sh $OAI_KOHA_SOURCE

# Rebuild index
$VUFIND_HOME/index-alphabetic-browse.sh
As another example, if you harvest from multiple Koha instances using different configuration files for each instance, your script might look like this:
#!/bin/bash

# Set up necessary environment variables
export PATH=/bin:/usr/bin:/usr/local/bin
source /etc/profile.d/vufind.sh

# Harvest new records (do not specify a harvest source in order to harvest all sources):
php $VUFIND_HOME/harvest/harvest_oai.php

# Process harvested Koha records from two sources -- this example uses the names "Pasantha"
# and "Kohulan" but of course these should be changed to your local instance names in practice.
# You could repeat the import/delete lines for any number of sources, as long as each pair uses
# an appropriate configuration file and directory name. Note that if you are applying a different
# ID prefix to each source, additional work may be needed to pre-process deleted record IDs so
# that batch-delete.sh works as expected.
$VUFIND_HOME/harvest/batch-import-marc.sh -p /usr/local/vufind/local/import/import-Pasantha.properties Koha_Pasantha
$VUFIND_HOME/harvest/batch-delete.sh Koha_Pasantha
$VUFIND_HOME/harvest/batch-import-marc.sh -p /usr/local/vufind/local/import/import-Kohulan.properties Koha_Kohulan
$VUFIND_HOME/harvest/batch-delete.sh Koha_Kohulan

# Rebuild index
$VUFIND_HOME/index-alphabetic-browse.sh
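With more than a couple of Koha instances, the repeated import/delete pairs could be generated in a loop. The following is a minimal dry-run sketch that only prints the commands it would run. The `koha_commands` function and the `IMPORT_DIR` variable are hypothetical names, and the default paths are taken from the example above so that the sketch runs on its own.

```shell
#!/bin/bash

# Defaults so the sketch runs standalone; a real script would rely on
# source /etc/profile.d/vufind.sh to set VUFIND_HOME instead.
VUFIND_HOME="${VUFIND_HOME:-/usr/local/vufind}"
# Hypothetical variable: directory holding the per-instance .properties files.
IMPORT_DIR="${IMPORT_DIR:-/usr/local/vufind/local/import}"

# koha_commands (hypothetical helper): print the import/delete command pair
# for one Koha instance name, following the naming pattern used above.
koha_commands() {
    local name="$1"
    echo "$VUFIND_HOME/harvest/batch-import-marc.sh -p $IMPORT_DIR/import-$name.properties Koha_$name"
    echo "$VUFIND_HOME/harvest/batch-delete.sh Koha_$name"
}

# Instance names from the example above; replace with your local names.
for name in Pasantha Kohulan; do
    koha_commands "$name"
done
```

Once the printed commands look right, the two `echo` lines can be replaced with direct invocations of the scripts. This assumes each instance follows the same import-NAME.properties and Koha_NAME naming pattern.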