How to index DSpace with VuFind

These are the instructions used by the Naval Postgraduate School in Monterey, California to index DSpace records in VuFind.

:!: These instructions were written for VuFind 2.x or newer; see this page for VuFind 1.x.

1. Turn on OAI-PMH in DSpace

OAI must be enabled on the DSpace repository first:

  1. Add a proxy block for /oai/ to nginx.conf on the DSpace server:
    location /oai/ {
        # Pass the original host and client address through to Tomcat
        proxy_set_header X-Forwarded-Host $host;
        proxy_set_header X-Forwarded-Server $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # Forward requests to Tomcat and rewrite its redirects to the public URL
        proxy_pass http://yourdspacehostname:8080/oai/;
        proxy_redirect http://yourdspacehostname:8080/oai/ http://yourdspacehostname/oai/;

        proxy_buffering off;
        proxy_store off;

        # Timeouts in seconds
        proxy_connect_timeout 120;
        proxy_send_timeout 120;
        proxy_read_timeout 120;
    }

    A comparable configuration in Apache uses mod_proxy. The proxy configuration is only necessary if you cannot open port 8080 to your VuFind instance. If you are not limited by such restrictions, skip the proxy block above and use your full DSpace hostname with “:8080” appended.

  2. Modify the server.xml for the appropriate DSpace Tomcat instance, inside the Host block:
    <Context path="/oai" docBase="/path_to_dspace/webapps/oai" debug="0"
        reloadable="true" cachingAllowed="false"
        allowLinking="true" />
  3. Modify the dspace.conf config file for the appropriate DSpace instance:
    ...
    harvest.includerestricted.oai = true
    harvester.autoStart = true
    ...
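
Before moving on to the harvest, it can help to confirm that the OAI endpoint answers. The following is a minimal Python sketch, not part of DSpace or VuFind. It assumes the proxied base URL http://yourdspacehostname/oai/request used in the harvest configuration in step 2; the function names oai_url and endpoint_responds are hypothetical. Identify and ListRecords are standard OAI-PMH verbs.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the same base URL that oai.ini uses in step 2.
BASE_URL = "http://yourdspacehostname/oai/request"


def oai_url(base_url, verb, **params):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = urlencode({"verb": verb, **params})
    return f"{base_url}?{query}"


def endpoint_responds(url, timeout=30):
    """Return True if the URL answers with HTTP 200 and an OAI-PMH envelope.

    urlopen raises an exception on connection failures and HTTP errors,
    so a broken proxy or disabled OAI webapp surfaces as an error here.
    """
    with urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
        return response.status == 200 and "<OAI-PMH" in body
```

For example, endpoint_responds(oai_url(BASE_URL, "Identify")) checks the repository description, and oai_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc") builds the kind of request a harvester would issue. If you skipped the proxy, substitute your hostname with “:8080” appended.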

2. Import records into VuFind using OAI-PMH harvest

These steps use VuFind's OAI-PMH harvest tool. You can learn more about it on this page.

  1. Add a DSpace section to $VUFIND_LOCAL_DIR/harvest/oai.ini:
    [DSpace]
    url=http://yourdspacehostname/oai/request
    metadataPrefix=oai_dc
    idSearch[]="/^oai:yourdspacehostname:/"
    idReplace[]="ir-"
    idSearch[]="/\//"
    idReplace[]="-"
    injectDate="datestamp"
    injectId="identifier"
    dateGranularity=auto
    harvestedIdLog=harvest.log
  2. Run these commands:
    cd $VUFIND_HOME/harvest
    php harvest_oai.php
    ./batch-import-xsl.sh DSpace dspace.properties
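
To see what the idSearch[] and idReplace[] pairs in oai.ini do to record identifiers, here is a small Python illustration. It assumes the pairs are applied in order as regular-expression search and replace; VuFind itself does this in PHP, so this is only a model of the behavior. The sample identifier oai:yourdspacehostname:10945/1234 and the function name rewrite_id are hypothetical.

```python
import re

# Pattern/replacement pairs mirroring idSearch[] / idReplace[] in oai.ini.
# The surrounding /.../ delimiters of the PHP patterns are dropped here.
ID_RULES = [
    (r"^oai:yourdspacehostname:", "ir-"),  # swap the OAI prefix for "ir-"
    (r"/", "-"),                           # replace slashes in the handle
]


def rewrite_id(identifier, rules=ID_RULES):
    """Apply each search/replace pair in order and return the new ID."""
    for pattern, replacement in rules:
        identifier = re.sub(pattern, replacement, identifier)
    return identifier


# Hypothetical identifier, for illustration only.
print(rewrite_id("oai:yourdspacehostname:10945/1234"))  # prints ir-10945-1234
```

Under these assumptions, every harvested record receives an ID that starts with “ir-” and contains no slashes.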

3. Customize Import Rules (optional)

If you wish to customize the way your records are ingested, see the indexing XML page for details. The instructions above use the example dspace.properties and dspace.xsl files that ship with VuFind. You can modify these as needed to change the way data is indexed.

:!: If you change import rules, you will need to remove your $VUFIND_LOCAL_DIR/harvest/DSpace directory, re-harvest the records, and repeat the import from step 2 above.
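
The reset described in this note could be scripted roughly as follows. This is a sketch under stated assumptions, not a VuFind tool: reset_harvest and REINDEX_COMMANDS are hypothetical names, the directory layout is the one given above, and the command list simply repeats step 2. Note that the function permanently deletes the harvested files.

```python
import shutil
from pathlib import Path


def reset_harvest(vufind_local_dir, source="DSpace"):
    """Delete the harvest directory for one source.

    Returns True if a directory was removed, False if none existed.
    """
    harvest_dir = Path(vufind_local_dir) / "harvest" / source
    if harvest_dir.is_dir():
        shutil.rmtree(harvest_dir)
        return True
    return False


# Commands from step 2, to be run from $VUFIND_HOME/harvest after the reset
# (for example with subprocess.run(command, cwd=..., check=True)).
REINDEX_COMMANDS = [
    ["php", "harvest_oai.php"],
    ["./batch-import-xsl.sh", "DSpace", "dspace.properties"],
]
```

Keeping the deletion separate from the commands makes it possible to inspect what will run before anything is re-harvested.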

4. Customize Record Display (optional)

By default, VuFind does not include any DSpace-specific display logic; records indexed from DSpace are displayed using the standard “SolrDefault” record driver and templates. However, the default import setup marks DSpace records with a recordtype value of “dspace”. This means that you can create a custom record driver named SolrDspace to provide DSpace-only display options. See displaying_a_custom_field for some examples of record display customization.

indexing/dspace.txt · Last modified: 2017/04/21 08:02 by demiankatz