Warning: This page has not been updated in over a year and may be outdated or deprecated.
videos:oai-pmh_server_and_harvest_functionality
Differences
This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
videos:oai-pmh_server_and_harvest_functionality [2020/04/20 18:52] – created demiankatz | videos:oai-pmh_server_and_harvest_functionality [2023/04/26 13:34] (current) – crhallberg | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Video 6: OAI-PMH Server and Harvest Functionality ====== | ====== Video 6: OAI-PMH Server and Harvest Functionality ====== | ||
- | The sixth VuFind | + | The sixth VuFind® |
Video is available as an [[https:// | Video is available as an [[https:// | ||
Line 9: | Line 9: | ||
- [[indexing: | - [[indexing: | ||
- [[indexing: | - [[indexing: | ||
+ | - [[https:// | ||
===== Transcript ===== | ===== Transcript ===== | ||
- | // Coming soon... // | + | Hello and welcome to this VuFind tutorial video, in which I am going to talk about how VuFind uses the OAI-PMH protocol to both share and receive records. |
+ | |||
+ | OAI-PMH is the Open Archives Initiative Protocol for Metadata Harvesting, a well-supported and widely used method of sharing XML metadata between systems. It supports not just harvesting entire collections of metadata but also incremental harvests, so you can get only things that have changed since your prior harvest, and it can also address deleted records, so you can find out what has been removed from an upstream system. The protocol always supports Dublin Core metadata, but it can also support any other kind of XML format. The server and client are both able to deal with the same standard. | ||
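As a rough illustration of the incremental-harvest and deleted-record features described above, here is a minimal Python sketch. The `from` argument and the `status="deleted"` header attribute come from the OAI-PMH 2.0 specification; the function names and the sample response are invented for this example and are not part of VuFind.

```python
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def incremental_params(metadata_prefix, since=None):
    """Return ListRecords arguments; `since` restricts the harvest to
    records created, changed, or deleted on or after that datestamp."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if since:
        params["from"] = since
    return params


def split_headers(xml_text):
    """Split the record identifiers in a response into (live, deleted)."""
    root = ET.fromstring(xml_text)
    live, deleted = [], []
    for header in root.iterfind(".//oai:header", OAI_NS):
        identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
        if header.get("status") == "deleted":
            deleted.append(identifier)
        else:
            live.append(identifier)
    return live, deleted


# Hand-written sample response (illustrative data only).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>rec1</identifier><datestamp>2020-04-01</datestamp></header></record>
    <record><header status="deleted"><identifier>rec2</identifier><datestamp>2020-04-02</datestamp></header></record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(incremental_params("oai_dc", since="2020-04-01"))
    print(split_headers(SAMPLE))
```

A harvester would typically remember the date of its last successful run and pass it as `since` on the next one, then remove anything reported as deleted from its local copy.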
+ | |||
+ | First of all I am going to show you how you can turn on VuFind' | ||
+ | |||
+ | That's all I need to do to turn on the basic functionality, but there are a few things here that I would probably also want to do, like give the repository a name, and you can set a separate administrative email for your OAI server; otherwise it will use the default email address. | ||
+ | |||
+ | There are also some settings related to sets since OAI servers can divide a collection into specific sets. You can use a Solr field like a facet for defining sets or you can specify particular named sets with particular queries associated with them if you want to allow people to harvest specific subsets of your collection, but if you just leave all this stuff commented out then set functionality will be disabled and people will only be able to harvest your entire collection. | ||
+ | |||
+ | There is another important step though that you have to take before you can use OAI-PMH server capabilities in VuFind, and that is to turn on record change tracking because the OAI-PMH protocol needs to know the history of when everything in your system was created or changed so that it can do incremental updates. VuFind needs to track more information at index time so that the server has the information that it needs. By default, VuFind does not track record change information because doing so makes the index process slower, but if you do turn this on you not only get the benefit of being able to use the OAI-PMH server but you also gain access to some other functionality that otherwise won't work including RSS feeds that are sorted based on actual record creation times and the ability to use Solr-based new record searching where you can actually limit your search by how recently records were added to the index. | ||
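To make the reasoning above concrete, here is a toy Python sketch of why a server needs first-indexed and last-indexed dates before it can answer "what changed since my last harvest?". This is a conceptual illustration only: the `ChangeTracker` class and its methods are invented for this example and do not reflect how VuFind actually stores change information.

```python
from datetime import datetime, timezone


class ChangeTracker:
    """Toy stand-in for index-time change tracking (not VuFind's code)."""

    def __init__(self):
        # record id -> {"content", "first_indexed", "last_indexed"}
        self._rows = {}

    def index(self, record_id, content, now):
        """Record an indexing pass; only real changes bump last_indexed."""
        row = self._rows.get(record_id)
        if row is None:
            self._rows[record_id] = {
                "content": content,
                "first_indexed": now,
                "last_indexed": now,
            }
        elif row["content"] != content:
            row["content"] = content
            row["last_indexed"] = now

    def first_indexed(self, record_id):
        """When the record first entered the index (useful for 'new items')."""
        return self._rows[record_id]["first_indexed"]

    def changed_since(self, when):
        """Ids of records created or changed at or after `when`."""
        return sorted(
            rid for rid, row in self._rows.items() if row["last_indexed"] >= when
        )


if __name__ == "__main__":
    t1 = datetime(2020, 4, 1, tzinfo=timezone.utc)
    t2 = datetime(2020, 4, 10, tzinfo=timezone.utc)
    tracker = ChangeTracker()
    tracker.index("a", "v1", t1)
    tracker.index("b", "v1", t1)
    tracker.index("a", "v2", t2)  # changed
    tracker.index("b", "v1", t2)  # reindexed but unchanged
    print(tracker.changed_since(t2))
```

The extra comparison on every indexing pass is also a plausible intuition for why tracking changes makes indexing slower.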
+ | |||
+ | To turn this on you just need to uncomment a couple of lines in the default marc_local.properties file, so I'm going to bring that up. This is the same file that we've worked on. You can see here near the top there are two lines, first_indexed and last_indexed, | ||
+ | |||
+ | Of course, simply making a change to my marc_local.properties file is not enough. I also have to reindex all of my records, and in keeping with past demos I'm going to index three of the sample MARC record files included with VuFind: journals.mrc, | ||
+ | |||
+ | Of course, I've shown you how to turn on change tracking for MARC records. At some point in the future we'll also index XML. When we get that far you can also turn on change tracking there; it's just done in a different way. For now we've got our index updated the way we need it to be. We have the OAI server functionality turned on in config.ini, so I'm going to switch over to a web browser and show you how this works. | ||
+ | |||
+ | If you go to your VuFind URL with /oai on the end of it, you will get to a convenient page that shows you all of the verbs supported by the OAI-PMH protocol. It lets you test them out on your instance, so for example, the simplest thing you can do is just say " | ||
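For readers who prefer to see the requests spelled out, the following Python sketch builds request URLs for the six verbs defined by the OAI-PMH 2.0 specification. The base URL is a placeholder and `oai_url` is a helper invented for this example; substitute the endpoint of your own instance.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)

# Placeholder endpoint (an assumption, not a real server).
BASE_URL = "https://vufind.example.org/oai"


def oai_url(base_url, verb, **arguments):
    """Build an OAI-PMH GET request URL for the given verb and arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    return f"{base_url}?{urlencode({'verb': verb, **arguments})}"


if __name__ == "__main__":
    print(oai_url(BASE_URL, "Identify"))
    print(oai_url(BASE_URL, "GetRecord", identifier="rec1", metadataPrefix="oai_dc"))
```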
+ | |||
+ | Of course, much more interesting is finding out what kind of metadata formats are supported by an OAI-PMH server. As I mentioned before, they always support Dublin Core, but different formats may be supported by different servers, so in this case I'm just going to give it one of the records in the index and find out what formats are supported. | ||
+ | |||
+ | So here is oai_dc, which is Dublin Core, but you'll also see there' | ||
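A ListMetadataFormats response can also be read programmatically. The Python sketch below extracts the metadata prefixes from such a response; the element names follow the OAI-PMH 2.0 specification, while the sample XML (including the `marc21` entry) is hand-written illustrative data rather than output captured from the video.

```python
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def metadata_prefixes(xml_text):
    """Return the metadataPrefix values listed in a ListMetadataFormats response."""
    root = ET.fromstring(xml_text)
    path = ".//oai:metadataFormat/oai:metadataPrefix"
    return [element.text for element in root.iterfind(path, OAI_NS)]


# Hand-written sample response (illustrative data only).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>marc21</metadataPrefix>
      <schema>http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd</schema>
      <metadataNamespace>http://www.loc.gov/MARC21/slim</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""

if __name__ == "__main__":
    print(metadata_prefixes(SAMPLE))
```

Any prefix returned this way can then be passed as `metadataPrefix` when requesting records.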
+ | |||
+ | And as you can see, there' | ||
+ | |||
+ | So now that we've shown how OAI-PMH server functionality works, let's show what VuFind can do as an OAI-PMH client and actually make it harvest itself as an example. | ||
+ | |||
+ | So going back to the command line, there is a folder we haven' | ||
+ | |||
+ | One of the important files under harvest is called oai.ini, which is just an ini file that you can use to set up OAI harvesting. So I'm going to copy harvest/oai.ini into local/harvest, so that I have a local copy in my local settings directory. | ||
+ | |||
+ | So oai.ini has lots of comments at the top describing the many, many settings that are supported by this file. Read through those at your convenience. At a bare minimum, all you need to do to perform an OAI harvest is to create a section, which I will name vufind because we are harvesting VuFind. The main purpose of the section name is that records that are harvested will be saved in a directory whose name matches the section. | ||
+ | |||
+ | When I perform a harvest, I will end up with a local/ | ||
+ | |||
+ | I also have to provide a metadata prefix telling it what metadata format to harvest, and in this example I actually just want to see the Dublin Core, so oai_dc. Save this file. Once you have your oai.ini set up, there is a PHP script called harvest/ | ||
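As a sanity check on a minimal configuration like the one described above, the following Python sketch parses an ini document and confirms that each section carries the bare-minimum settings. The key names `url` and `metadataPrefix`, the sample URL, and the `load_harvest_sections` helper are assumptions made for this illustration; check the comments in your own oai.ini for the authoritative setting names.

```python
import configparser

# Settings assumed to be the bare minimum for each harvest section.
REQUIRED_KEYS = ("url", "metadataPrefix")


def load_harvest_sections(ini_text):
    """Parse ini text and return {section name: settings}, checking required keys."""
    parser = configparser.ConfigParser(interpolation=None)
    # configparser lowercases keys by default; keep mixed case such as metadataPrefix.
    parser.optionxform = str
    parser.read_string(ini_text)
    sections = {}
    for name in parser.sections():
        settings = dict(parser[name])
        missing = [key for key in REQUIRED_KEYS if key not in settings]
        if missing:
            raise ValueError(f"section [{name}] is missing: {', '.join(missing)}")
        sections[name] = settings
    return sections


# Illustrative configuration; the URL is a placeholder.
SAMPLE = """
[vufind]
url = https://vufind.example.org/oai
metadataPrefix = oai_dc
"""

if __name__ == "__main__":
    for name, settings in load_harvest_sections(SAMPLE).items():
        print(name, settings)
```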
+ | |||
+ | Or, you can tell it the name of a specific section and it will harvest just that one repository. I'll do that. I'll tell it to harvest vufind. Now there we go, it just downloaded 250 Dublin Core records in just a couple of seconds. | ||
+ | |||
+ | So now, if I go into my local/ | ||
+ | |||
+ | So, that's all I wanted to show this month. This will become much more interesting when we talk about ingesting XML because you can harvest with OAI and then load a whole directory of records into VuFind. We will look at that next time. In the meantime, I also just wanted to quickly mention that if you want to do this OAI-PMH harvesting without having to install all of VuFind, it has actually been split out into a separate project called VuFind Harvest. So, you can just check out VuFind Harvest and run a simplified version of the script without having to carry the whole weight of VuFind around with you. And, I will include a link to that project in the notes with the video. That's all for now. Thank you, and I will provide more information next month. | ||
+ | |||
+ | //This is an edited version of an automated transcript. Apologies for any errors.// | ||
---- struct data ---- | ---- struct data ---- | ||
+ | properties.Page Owner : | ||
---- | ---- | ||
videos/oai-pmh_server_and_harvest_functionality.1587408752.txt.gz · Last modified: 2020/04/20 18:52 by demiankatz