Warning: This page has not been updated in over a year and may be outdated or deprecated.
videos:oai-pmh_server_and_harvest_functionality
Differences
This shows you the differences between two versions of the page.
videos:oai-pmh_server_and_harvest_functionality [2020/12/23 13:08] – [Transcript] demiankatz | videos:oai-pmh_server_and_harvest_functionality [2023/04/26 13:34] (current) – crhallberg | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Video 6: OAI-PMH Server and Harvest Functionality ====== | ====== Video 6: OAI-PMH Server and Harvest Functionality ====== | ||
- | The sixth VuFind | + | The sixth VuFind® |
Video is available as an [[https:// | Video is available as an [[https:// | ||
Line 13: | Line 13: | ||
===== Transcript ===== | ===== Transcript ===== | ||
- | // This is a raw machine-generated transcript; it has been partially cleaned up, but more work needs to be done on the later parts of the text. // | + | Hello and welcome to this VuFind tutorial video, in which I am going to talk about how VuFind uses the OAI-PMH protocol to both share and receive records. |
- | Hello and welcome to this VuFind | + | OAI-PMH |
- | tutorial video, in which I am going to | + | |
- | talk about how VuFind uses the | + | |
- | OAI-PMH protocol | + | |
- | receive | + | |
- | OAI-PMH | + | First of all I am going to show you how you can turn on VuFind' |
- | archives initiative protocol for | + | |
- | metadata harvesting | + | |
- | supported and widely used method of | + | |
- | sharing xml metadata between systems. | + | |
- | It supports not just harvesting entire | + | |
- | collections of metadata but also doing | + | |
- | incremental harvests so you can get only | + | |
- | things | + | |
- | prior harvest, and it can also address | + | |
- | deleted records | + | |
- | has been removed from an upstream system. | + | |
- | The protocol always supports Dublin core | + | |
- | metadata but it also can support any | + | |
- | kind of XML format as well. The server | + | |
- | and client are both able to deal with | + | |
- | the same standard. | + | |
- | First of all I am | + | That' |
- | going to show you how you can turn on | + | |
- | VuFind' | + | |
- | to the command line and I'm going to | + | |
- | edit my local config.ini file and | + | |
- | you'll see that | + | |
- | in the default configuration that comes | + | |
- | with VuFind the entire [OAI] section is | + | |
- | commented out, so by deleting this | + | |
- | semicolon | + | |
- | header I have now activated my OAI-PMH | + | |
- | server. | + | |
- | That's all I need to do to turn on | + | There are also some settings related |
- | the basic functionality but there are a | + | |
- | few things here that I would probably | + | |
- | also want to do like give the name, and | + | |
- | you can set a separate administrative | + | |
- | email for your OAI server | + | |
- | it will use the default email address. | + | |
- | There are also some settings related | + | There is another important step though that you have to take before |
- | OAI servers can divide a collection into | + | |
- | specific sets. You can use a Solr field like a | + | |
- | facet for defining sets or you can | + | |
- | specify particular named sets with | + | |
- | particular queries associated with them | + | |
- | if you want to allow people | + | |
- | specific subsets | + | |
- | if you just leave all this stuff | + | |
- | commented out then set functionality | + | |
- | will be disabled and people will only be | + | |
- | able to harvest | + | |
- | There is another important step though | + | To turn this on you just need to uncomment a couple of lines in the default marc_local.properties file, so I'm going to bring that up. This is the same file that we've worked on. You can see here near the top there are two lines, first_indexed and last_indexed, |
- | that you have to take before you can use | + | |
- | OAI-PMH server capabilities | + | |
- | and that is to turn on record | + | |
- | change tracking | + | |
- | protocol needs to know the history | + | |
- | when everything in your system was | + | |
- | created or changed so that it can do | + | |
- | incremental updates. VuFind needs to track | + | |
- | more information at index time so that | + | |
- | the server has the information | + | |
- | needs. By default, VuFind does not | + | |
- | track record | + | |
- | doing so makes the index process slower, | + | |
- | but if you do turn this on you not only | + | |
- | get the benefit | + | |
- | OAI-PMH server but you also gain access | + | |
- | to some other functionality that | + | |
- | otherwise won't work including RSS feeds | + | |
- | that are sorted based on actual | + | |
- | record | + | |
- | use Solr-based new record searching where you can | + | |
- | actually limit your search by how | + | |
- | recently records were added to the index. | + | |
- | To turn this on you just need to | + | Of course, simply making a change |
- | uncomment a couple of lines in the | + | |
- | default | + | |
- | I'm going to bring that up. This is | + | |
- | the same file that we've worked on. | + | |
- | You can see here near the top there are | + | |
- | two lines, first_indexed | + | |
- | and just by uncommenting these I turn on | + | |
- | change tracking. The difference between | + | |
- | these two fields is that the first_indexed | + | |
- | field will contain the date of | + | |
- | the first time a particular record ID | + | |
- | was indexed into the system and the last_indexed | + | |
- | date will contain the most | + | |
- | recent time that record changed, so when | + | |
- | you index a record for the first time | + | |
- | first_indexed and last_indexed will be set, | + | |
- | but if that record gets revised over | + | |
- | time last_indexed will change to reflect | + | |
- | those changes but first_indexed will | + | |
- | always stay the same so you know the age | + | |
- | of the overall record as well as the | + | |
- | date of its most recent change and this | + | |
- | is sort of the minimum amount of | + | |
- | information needed to implement OAI-PMH. | + | |
- | Of course, simply making a change | + | Of course |
- | marc_local.properties file is not enough. I | + | |
- | also index all of my records and just in | + | |
- | keeping with past demos I'm going to | + | |
- | index 3 of the sample MARC record files | + | |
- | included with VuFind: journals.mrc, | + | |
- | geo.mrc and authoritybibs.mrc. | + | |
- | Of course I've showed | + | If you go to your VuFind URL with /oai on the end of it you will get to a convenient page that shows you all of the verbs supported by the OAI-PMH protocol. It lets you test them out on your instance, so for example, the most simple thing you can do is just say " |
- | change tracking for MARC records. At some | + | |
- | point in the future we'll also index XML. | + | |
- | When we get that far you can also turn on change | + | |
- | tracking there, it' | + | |
- | different way. For now we've got our | + | |
- | index updated | + | |
- | We have the OAI server functionality | + | |
- | turned on in config.ini, so I'm going to | + | |
- | switch over to a web browser and show | + | |
- | you how this works. | + | |
- | If you go to your | + | Of course, much more interesting is finding out what kind of metadata formats are supported by an OAI-PMH |
- | VuFind URL with /oai on the end | + | |
- | of it you will get to a convenient page | + | |
- | that shows you all of the verbs | + | |
- | supported by the OAI-PMH | + | |
- | you test them out on your instance, so | + | |
- | for example, the most simple thing you | + | |
- | can do is just say " | + | |
- | dump out basic information about the | + | |
- | server | + | |
- | repo" repository name I put into config.ini | + | |
- | comes through here. | + | |
- | Of course, | + | So here is oai_dc, which is Dublin Core, but you'll also see there' |
- | much more interesting is finding out | + | |
- | what kind of metadata formats are | + | And as you can see, there' |
- | supported by an OAI-PMH server. As I | + | |
- | mentioned before, they always support | + | So now that we've showed how OAI-PMH server functionality works, let's show what VuFind can do as an OAI-PMH |
- | Dublin Core but different formats may be | + | |
- | supported by different servers so in a | + | So going back to the command line, there is a folder we haven' |
- | view case I'm just going to give one of | + | |
- | the records and the index and find out | + | One of the important files under harvest is called oai.ini which is just an INI file that you can use to set up OAI harvesting. So I'm going to copy harvest/oai.ini |
- | what formats are supported so here is | + | |
- | the Oei DC which is dublin core but | + | So oai.ini has lots of comments at the top and the many many settings that are supported by this file. Through |
- | you'll also see there' | + | |
- | supported | + | When I perform a harvest, I will end up with a local/harvest/VuFind directory filled with XML files. Now I need to give it the base URL of an OAI server. In this case, that's gonna be HTTP localhost/ |
- | and if we wanted to actually see some | + | |
- | records we can use the list records verb | + | I also have to provide a metadata prefix telling it what metadata format to harvest and in this example, I actually just want to see what the Dublin |
- | which at a bare minimum requires that we | + | |
- | give it a metadata format | + | Or, you can tell it the name of a specific section and it will be that one repository. I'll do that. I'll tell it harvest |
- | to give it one hit go and there' | + | |
- | response and as you can see there' | + | So now, if I go into my local/harvest/ |
- | mark XML getting dumped out here so by | + | |
- | turning on this functionality you can | + | So, that's all I wanted to show this month. This will become much more interesting when we talk about ingesting XML because you can harvest with OAI and then load a whole directory of records into VuFind. We will look at that next time. In the meantime, I also just wanted to quickly mention that if you want to do this OAI-PMH |
- | share all of the records in your view | + | |
- | find index with other systems Union | + | //This is an edited version of an automated transcript. Apologies for any errors.// |
- | catalogs participating in projects like | + | |
- | the digital Public Library of America | + | |
- | and also actually indexing things into | + | |
- | VuFind | + | |
- | OAI-PMH server functionality works let's | + | |
- | show what VuFind can do as an AI pmh | + | |
- | client and actually make it harvest | + | |
- | itself as an example | + | |
- | the command line there is a folder we | + | |
- | haven' | + | |
- | the VuFind directory and like just | + | |
- | about everything in VuFind you can | + | |
- | override things from the harvest | + | |
- | directory inside the local harvest | + | |
- | direct so one of the important files | + | |
- | under harvest is called oai ini which is | + | |
- | just an inny file that you can use to | + | |
- | set up Oh a harvesting | + | |
- | copy harvest /oe III and I into local | + | |
- | harvest local copy that I local settings | + | |
- | directory | + | |
- | comments at the top and the many many | + | |
- | settings that are supported by this file | + | |
- | through | + | |
- | bare minimum all you need to do to | + | |
- | perform an OE I harvest is to create a | + | |
- | section | + | |
- | are you find and the main purpose of the | + | |
- | section | + | |
- | harvested will be saved in a directory | + | |
- | whose name matches the section | + | |
- | perform a harvest I will end up with a | + | |
- | local slash harvest | + | |
- | directory filled with XML files now I | + | |
- | need to give it the base URL of an OE I | + | |
- | server | + | |
- | localhost | + | |
- | is the URL that you would share with | + | |
- | others who want to harvest from you | + | |
- | though of course in a real life scenario | + | |
- | the host name would be something other | + | |
- | than localhost I also have to provide a | + | |
- | metadata prefix telling it what metadata | + | |
- | format to harvest and in this example I | + | |
- | actually just want to see what the | + | |
- | Dublin | + | |
- | have your oai and I set up there is a | + | |
- | PHP script called harvest | + | |
- | AI dot PHP and when you run that it will | + | |
- | loop through oai i and i and harvest | + | |
- | every section it volumes | + | |
- | it the name of a specific section and it | + | |
- | will be that one repository I'll do that | + | |
- | I'll tell it harvest | + | |
- | we go it just downloaded 250 Dublin | + | |
- | records in just a couple of seconds | + | |
- | now if I go into my local harvest | + | |
- | find directory and list my files I have | + | |
- | lots and lots of XML files and if I was | + | |
- | out there | + | |
- | is a little bit of Dublin | + | |
- | title and a creator and identifier | + | |
- | that' | + | |
- | month this will become much more | + | |
- | interesting when we talk about ingesting | + | |
- | XML because you can harvest with oai and | + | |
- | then load a whole directory of records | + | |
- | into VuFinds we will look at that | + | |
- | next time in the meantime I also just | + | |
- | wanted to quickly mention that if you | + | |
- | want to do this o AIP MH harvesting | + | |
- | without having to install all of you | + | |
- | find it has actually been split out into | + | |
- | a separate project called | + | |
- | harvest so you can just check out view | + | |
- | find harvest | + | |
- | version of the script without having to | + | |
- | carry the whole way to VuFind around | + | |
- | with you and I will include a link to | + | |
- | that project in the notes with the video | + | |
- | that's all for now | + | |
- | thank you and I will provide more | + | |
- | information next month | + | |
---- struct data ---- | ---- struct data ---- | ||
+ | properties.Page Owner : | ||
---- | ---- | ||
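
The transcript describes testing OAI-PMH verbs from the browser and notes that listing records requires, at a bare minimum, a metadata format. As a rough illustration of what those requests look like, here is a minimal Python sketch that builds request URLs. The verb and parameter names come from the OAI-PMH 2.0 protocol itself; the base URL `http://example.org/oai` and the helper name `build_oai_url` are placeholders invented for this example, not VuFind's actual endpoint or code.

```python
from urllib.parse import urlencode

# The six verbs defined by the OAI-PMH 2.0 protocol.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def build_oai_url(base_url, verb, metadata_prefix=None, from_date=None,
                  set_spec=None, identifier=None):
    """Build an OAI-PMH request URL (hypothetical helper for illustration)."""
    if verb not in OAI_VERBS:
        raise ValueError("Unknown OAI-PMH verb: %s" % verb)
    # Start with the verb, then add only the arguments that were supplied.
    query = {"verb": verb}
    optional = {
        "metadataPrefix": metadata_prefix,
        "from": from_date,       # lower date bound for incremental harvests
        "set": set_spec,         # restrict the request to one named set
        "identifier": identifier,
    }
    query.update({key: value for key, value in optional.items()
                  if value is not None})
    return base_url + "?" + urlencode(query)


# Placeholder base URL; substitute the address of a real OAI-PMH server.
BASE_URL = "http://example.org/oai"

print(build_oai_url(BASE_URL, "Identify"))
print(build_oai_url(BASE_URL, "ListRecords", metadata_prefix="oai_dc"))
print(build_oai_url(BASE_URL, "ListRecords", metadata_prefix="oai_dc",
                    from_date="2020-12-23"))
```

The third call shows the shape of an incremental harvest: the `from` argument asks the server only for records changed on or after that date, which is why the server needs change tracking turned on.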
videos/oai-pmh_server_and_harvest_functionality.1608728928.txt.gz · Last modified: 2020/12/23 13:08 by demiankatz
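
The transcript ends with a directory of harvested XML files, each holding a little Dublin Core with a title, a creator and an identifier. The following Python sketch shows one way such a record could be inspected. The two namespace URIs are the standard `oai_dc` and Dublin Core ones; the sample record and the function name `summarize_dc` are made up for illustration, and the exact layout of the files VuFind writes is an assumption here.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core element namespace used inside oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Invented sample record; real harvested files may wrap this differently.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example title</dc:title>
  <dc:creator>Example, Author</dc:creator>
  <dc:identifier>example-id-1</dc:identifier>
</oai_dc:dc>"""


def summarize_dc(xml_text):
    """Return the title, creator and identifier values found in a record."""
    root = ET.fromstring(xml_text)

    def values(name):
        # iter() searches the whole tree, so the fields are found even if
        # the Dublin Core block is nested inside a wrapper element.
        tag = "{%s}%s" % (DC_NS, name)
        return [el.text.strip() for el in root.iter(tag) if el.text]

    return {
        "title": values("title"),
        "creator": values("creator"),
        "identifier": values("identifier"),
    }


print(summarize_dc(SAMPLE))
```

Each field is returned as a list because Dublin Core allows any element to repeat, so a record may carry several creators or identifiers.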