Warning: This page has not been updated in over a year and may be outdated or deprecated.
videos:oai-pmh_server_and_harvest_functionality
Differences
This shows you the differences between two versions of the page.
videos:oai-pmh_server_and_harvest_functionality [2020/04/20 19:40] – [Related Resources] demiankatz | videos:oai-pmh_server_and_harvest_functionality [2023/04/26 13:34] (current) – crhallberg | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Video 6: OAI-PMH Server and Harvest Functionality ====== | ====== Video 6: OAI-PMH Server and Harvest Functionality ====== | ||
- | The sixth VuFind | + | The sixth VuFind® |
Video is available as an [[https:// | Video is available as an [[https:// | ||
Line 13: | Line 13: | ||
===== Transcript ===== | ===== Transcript ===== | ||
- | // This is a raw machine-generated transcript; it will be cleaned up in the near future. // | + | Hello and welcome to this VuFind tutorial video, |
- | hello and welcome to these you find | + | OAI-PMH is the Open Archives Initiative Protocol for Metadata Harvesting, a well-supported and widely used method of sharing XML metadata between systems. It supports not just harvesting entire collections of metadata but also doing incremental harvests, so you can get only things that have changed since your prior harvest, and it can also address deleted records, so you can find out what has been removed from an upstream system. The protocol always supports Dublin Core metadata, but it can also support any kind of XML format. The server and client are both able to deal with the same standard. |
- | tutorial video in which I am going to | + | |
- | talk about how have you find uses the | + | First of all I am going to show you how you can turn on VuFind' |
- | oai-pmh protocol to both share and | + | |
- | receive records so oai-pmh is the open | + | That's all I need to do to turn on the basic functionality, but there are a few things here that I would probably also want to do, like give the repository a name, and you can set a separate administrative email for your OAI server; otherwise it will use the default email address. |
- | archives initiative protocol for | + | |
- | metadata harvesting and is a well | + | There are also some settings related to sets since OAI servers can divide a collection into specific sets. You can use a Solr field like a facet for defining sets or you can specify particular |
- | supported and widely used method of | + | |
- | sharing xml metadata between systems | + | There is another important step though that you have to take before you can use OAI-PMH server capabilities |
- | it supports not just harvesting entire | + | |
- | collections of metadata but also doing | + | To turn this on you just need to uncomment a couple of lines in the default |
- | incremental harvests so you can get only | + | |
- | things that have changed since your | + | Of course, |
- | prior harvest and it can also address | + | |
- | deleted records so you can find out what | + | Of course, I've shown you how to turn on change tracking for MARC records. At some point in the future we'll also index XML. When we get that far you can also turn on change tracking there; it's just done in a different way. For now we've got our index updated the way we need it to be. We have the OAI server functionality turned on in config.ini, so I'm going to switch over to a web browser and show you how this works. |
- | has been removed from an upstream system | + | |
- | the protocol always supports Dublin core | + | If you go to your VuFind |
- | metadata but it also can support any | + | |
- | kind of XML format as well the server | + | Of course, much more interesting is finding out what kind of metadata formats are supported by an OAI-PMH |
- | and client are both able to deal with | + | |
- | the same standard | + | So here is oai_dc, which is Dublin Core, but you'll also see there' 
- | going to show you how you can turn on | + | |
- | view finds oai-pmh server | + | And as you can see, there' |
- | to the command line and I'm going to | + | |
- | edit my local config | + | So now that we've shown how OAI-PMH server functionality works, let's show what VuFind 
- | you'll see the | + | |
- | in the default configuration that comes | + | So going back to the command line, there is a folder we haven' |
- | with view find the entire | + | |
- | commented out so by deleting this | + | One of the important files under harvest is called oai.ini, which is just an INI file that you can use to set up OAI harvesting. So I'm going to copy harvest/oai.ini 
- | semicolon and uncommenting the section | + | |
- | header I have now activated my IP mhm | + | So oai.ini has lots of comments at the top describing the many, many settings that are supported by this file. Through 
- | sir that's all I need to do to turn on | + | |
- | the basic functionality but there are a | + | When I perform a harvest, I will end up with a local/harvest/VuFind directory filled with XML files. Now I need to give it the base URL of an OAI server. In this case, that's gonna be HTTP localhost/ 
- | few things here that I would probably | + | |
- | also want to do like give the beam and | + | I also have to provide a metadata prefix telling it what metadata format to harvest and in this example, I actually just want to see what the Dublin |
- | you can set a separate administrative | + | |
- | email for your oai server or otherwise | + | Or, you can tell it the name of a specific section and it will harvest just that one repository. I'll do that. I'll tell it harvest 
- | it will default email address | + | |
- | also some settings related to sets since | + | So now, if I go into my local/harvest/ |
- | oai servers can divide a collection into | + | |
- | specific sets use a solar field like a | + | So, that's all I wanted to show this month. This will become much more interesting when we talk about ingesting XML because you can harvest with OAI and then load a whole directory of records into VuFind. We will look at that next time. In the meantime, I also just wanted to quickly mention that if you want to do this OAI-PMH |
- | facet for defining sets or you can | + | |
- | specify particular | + | //This is an edited version of an automated transcript. Apologies for any errors.// |
- | particular queries associated with them | + | |
- | if you want to allow people to harvest | + | |
- | specific subsets of your collection but | + | |
- | if you just leave all this stuff | + | |
- | commented out then set functionality | + | |
- | will be disabled and people will only be | + | |
- | able to harvest your entire collection | + | |
- | there is another important step though | + | |
- | that you have to take before you can use | + | |
- | oai-pmh server capabilities | + | |
- | find and that is to turn on record | + | |
- | change tracking because the oai-pmh | + | |
- | protocol needs to know the history of | + | |
- | when everything in your system was | + | |
- | created or changed so that it can do | + | |
- | incremental updates | + | |
- | more information at index time so that | + | |
- | the server has the information that it | + | |
- | needs by default | + | |
- | track record change information because | + | |
- | doing so makes the index process slower | + | |
- | but if you do turn this on you not only | + | |
- | get the benefit of being able to use the | + | |
- | oai-pmh server but you also gain access | + | |
- | to some other functionality that | + | |
- | otherwise won't work including RSS feeds | + | |
- | that are sorted based on actual | + | |
- | record creation times and the ability to | + | |
- | use solar-based | + | |
- | new record searching where you can | + | |
- | actually limit your search by how | + | |
- | recently records were added to the index | + | |
- | so to turn this on you just need to | + | |
- | uncomment a couple of lines in the | + | |
- | default | + | |
- | I'm going to bring that up and this is | + | |
- | the same file that we've worked on yes | + | |
- | you can see here near the top there are | + | |
- | two lines first indexed | + | |
- | and just by uncommenting these I turn on | + | |
- | change tracking | + | |
- | these two fields is that the first | + | |
- | indexed | + | |
- | the first I am a particular record ID | + | |
- | was indexed into the system and the last | + | |
- | indexed | + | |
- | recent time that record changed so when | + | |
- | you index a record for the first time | + | |
- | first index and last index will be set | + | |
- | but if that record gets revised over | + | |
- | time last index will change to reflect | + | |
- | those changes but first indexed | + | |
- | always stay the same so you know the age | + | |
- | of the overall record as well as the | + | |
- | date of its most recent change and this | + | |
- | is sort of the minimum amount of | + | |
- | information needed to implement | + | |
- | or simply making a change to my mark | + | |
- | local properties file is not enough I | + | |
- | also index all of my records and just in | + | |
- | keeping with past demos I'm going to | + | |
- | index 3 of the sample | + | |
- | that defined | + | |
- | gyeo dot mark and authority bins and of | + | |
- | course I've showed you how to turn on | + | |
- | change tracking for mark records | + | |
- | point in the future we'll also when we | + | |
- | get that far you can also turn on change | + | |
- | tracking there it's just done in a | + | |
- | different way for now we've got our | + | |
- | index updated the way we need it to be | + | |
- | we have the oai server functionality | + | |
- | turned on config | + | |
- | switch | + | |
- | you how this works so if you go to your | + | |
- | you find URL with slash oai on the end | + | |
- | of it you will get to a convenient page | + | |
- | that shows you all of the verbs | + | |
- | supported by the oai-pmh protocol | + | |
- | you test them out on your instance so | + | |
- | for example the most simple thing you | + | |
- | can do is just say identify which will | + | |
- | dunk out basic information about the | + | |
- | server and as you can see that Damien's | + | |
- | repo repository name I put into config | + | |
- | dot ini comes through here of course | + | |
- | much more interesting is finding out | + | |
- | what kind of metadata formats are | + | |
- | supported by a know AI pmh server | + | |
- | mentioned before they always support | + | |
- | dublin core but different formats may be | + | |
- | supported by different servers so in a | + | |
- | view case I'm just going to give one of | + | |
- | the records and the index and find out | + | |
- | what formats are supported | + | |
- | the Oei DC which is dublin core but | + | |
- | you'll also see there' | + | |
- | supported | + | |
- | and if we wanted to actually see some | + | |
- | records we can use the list records verb | + | |
- | which at a bare minimum requires that we | + | |
- | give it a metadata format | + | |
- | to give it one hit go and there' | + | |
- | response and as you can see there' | + | |
- | mark XML getting dumped out here so by | + | |
- | turning on this functionality you can | + | |
- | share all of the records in your view | + | |
- | find index with other systems Union | + | |
- | catalogs participating in projects like | + | |
- | the digital Public Library of America | + | |
- | and also actually indexing things into | + | |
- | view find so now that we've showed how | + | |
- | oai-pmh server functionality works let's | + | |
- | show what view find can do as an AI pmh | + | |
- | client and actually make it harvest | + | |
- | itself as an example | + | |
- | the command line there is a folder we | + | |
- | haven' | + | |
- | the view find directory and like just | + | |
- | about everything in view find you can | + | |
- | override things from the harvest | + | |
- | directory inside the local harvest | + | |
- | direct so one of the important files | + | |
- | under harvest is called oai ini which is | + | |
- | just an inny file that you can use to | + | |
- | set up Oh a harvesting | + | |
- | copy harvest /oe III and I into local | + | |
- | harvest local copy that I local settings | + | |
- | directory so oai dot I and I has lots of | + | |
- | comments at the top and the many many | + | |
- | settings that are supported by this file | + | |
- | through | + | |
- | bare minimum all you need to do to | + | |
- | perform an OE I harvest is to create a | + | |
- | section | + | |
- | are you find and the main purpose of the | + | |
- | section | + | |
- | harvested will be saved in a directory | + | |
- | whose name matches the section | + | |
- | perform a harvest I will end up with a | + | |
- | local slash harvest | + | |
- | directory filled with XML files now I | + | |
- | need to give it the base URL of an OE I | + | |
- | server | + | |
- | localhost | + | |
- | is the URL that you would share with | + | |
- | others who want to harvest from you | + | |
- | though of course in a real life scenario | + | |
- | the host name would be something other | + | |
- | than localhost I also have to provide a | + | |
- | metadata prefix telling it what metadata | + | |
- | format to harvest and in this example I | + | |
- | actually just want to see what the | + | |
- | Dublin | + | |
- | have your oai and I set up there is a | + | |
- | PHP script called harvest | + | |
- | AI dot PHP and when you run that it will | + | |
- | loop through oai i and i and harvest | + | |
- | every section it volumes | + | |
- | it the name of a specific section and it | + | |
- | will be that one repository I'll do that | + | |
- | I'll tell it harvest | + | |
- | we go it just downloaded 250 Dublin | + | |
- | records in just a couple of seconds | + | |
- | now if I go into my local harvest | + | |
- | find directory and list my files I have | + | |
- | lots and lots of XML files and if I was | + | |
- | out there | + | |
- | is a little bit of Dublin | + | |
- | title and a creator and identifier | + | |
- | that' | + | |
- | month this will become much more | + | |
- | interesting when we talk about ingesting | + | |
- | XML because you can harvest with oai and | + | |
- | then load a whole directory of records | + | |
- | into view finds we will look at that | + | |
- | next time in the meantime I also just | + | |
- | wanted to quickly mention that if you | + | |
- | want to do this o AIP MH harvesting | + | |
- | without having to install all of you | + | |
- | find it has actually been split out into | + | |
- | a separate project called | + | |
- | harvest so you can just check out view | + | |
- | find harvest | + | |
- | version of the script without having to | + | |
- | carry the whole way to view find around | + | |
- | with you and I will include a link to | + | |
- | that project in the notes with the video | + | |
- | that's all for now | + | |
- | thank you and I will provide more | + | |
- | information next month | + | |
---- struct data ---- | ---- struct data ---- | ||
+ | properties.Page Owner : | ||
---- | ---- | ||
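
The transcript describes change tracking as two dates per record: a first-indexed date that is set once, and a last-indexed date that moves every time the record changes. The Python sketch below only illustrates that rule; it is not VuFind's indexing code. The function names `track_change` and `changed_since`, the dictionary keys, and the in-memory dictionary standing in for the index are all hypothetical.

```python
from datetime import datetime, timezone


def track_change(tracker, record_id, now):
    """Record that `record_id` was indexed at time `now`.

    Hypothetical helper: the first-indexed date is written only once,
    while the last-indexed date is overwritten on every call.
    """
    entry = tracker.get(record_id)
    if entry is None:
        # First time this ID is seen: both dates start out equal.
        tracker[record_id] = {"first_indexed": now, "last_indexed": now}
    else:
        # Record was revised: only the last-indexed date moves.
        entry["last_indexed"] = now
    return tracker[record_id]


def changed_since(tracker, since):
    """Return the sorted IDs whose last-indexed date is at or after `since`."""
    return sorted(
        record_id
        for record_id, entry in tracker.items()
        if entry["last_indexed"] >= since
    )


if __name__ == "__main__":
    january = datetime(2020, 1, 1, tzinfo=timezone.utc)
    february = datetime(2020, 2, 1, tzinfo=timezone.utc)

    index = {}
    track_change(index, "rec1", january)
    track_change(index, "rec2", january)
    track_change(index, "rec1", february)  # rec1 is revised later

    print(index["rec1"])
    print(changed_since(index, february))
```

Filtering on the last-indexed date is what makes an incremental harvest possible: a harvester only has to ask for records that changed since its previous run.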
videos/oai-pmh_server_and_harvest_functionality.1587411613.txt.gz · Last modified: 2020/04/20 19:40 by demiankatz
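
As a rough companion to the harvesting walkthrough in the transcript, the following Python sketch builds a ListRecords request URL and pulls the identifier, title and creator out of an oai_dc response. The base URL `http://example.org/oai`, the sample XML and the helper names are assumptions made for illustration. Only the verb, the request parameter names and the XML namespaces are taken from the OAI-PMH 2.0 specification. It does not reproduce what VuFind's own harvest script does.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up response for illustration: one live record and one deleted record.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>rec1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example title</dc:title>
          <dc:creator>Example creator</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header status="deleted"><identifier>rec2</identifier></header>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def list_records_url(base_url, metadata_prefix, from_date=None, set_spec=None):
    """Build a ListRecords URL; `from_date` turns it into an incremental harvest."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def summarize_records(xml_text):
    """Return identifier, deleted flag, title and creator for each record."""
    root = ET.fromstring(xml_text.strip())
    summaries = []
    for record in root.iterfind(".//oai:record", NS):
        header = record.find("oai:header", NS)
        summaries.append({
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            # Deleted records carry status="deleted" and no metadata.
            "deleted": header.get("status") == "deleted",
            "title": record.findtext(".//dc:title", namespaces=NS),
            "creator": record.findtext(".//dc:creator", namespaces=NS),
        })
    return summaries


if __name__ == "__main__":
    print(list_records_url("http://example.org/oai", "oai_dc", from_date="2020-04-01"))
    for summary in summarize_records(SAMPLE_RESPONSE):
        print(summary)
```

The parser works on a string rather than fetching over the network, so the sketch can be tried without a running OAI-PMH server.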