Warning: This page has not been updated in over a year and may be outdated or deprecated.
configuration:remote_marc_records
remote_marc_records [2015/05/13 13:02] – created demiankatz | configuration:remote_marc_records [2019/02/04 20:36] – [2. Modify marc_local.properties] demiankatz | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Remote MARC Records ====== | ====== Remote MARC Records ====== | ||
+ | Move the MARC fullrecord to an external service using the RecordDriver SolrMarcRemote | ||
+ | |||
+ | // IMPORTANT: This page refers to a feature added in VuFind 2.5. // | ||
+ | |||
+ | ===== Introduction ===== | ||
+ | |||
+ | We (finc-team at Leipzig University Library) have been struggling with a huge index for quite a while and mitigated it by moving the index' | ||
+ | |||
+ | As the fullrecord-field is not indexed we concluded that it could also be removed from the index and made available through a binary-server. In fact our setup stores the indexed .mrc-files in a simple folder-structure made up of two digit folder-names corresponding with the .mrc-file' | ||
+ | |||
+ | e.g. / | ||
+ | |||
+ | The .mrc-files are served via HTTP GET requests. We extended the SolrMarc-RecordDriver with a method that fetches the binary .mrc-file from a URL configured in the [Record] section of config.ini if the fullrecord-field is empty or not set in the current index. | ||
+ | All the stock-methods of SolrMarc-RecordDriver for parsing binary MARC-data still work and are used by the new RecordDriver SolrMarcRemote. | ||
+ | |||
+ | Therefore, if you need to reduce the size of your Solr index, moving the MARC-fullrecord to a remote service might be a solution for you. | ||
+ | |||
+ | ===== Prerequisites ===== | ||
+ | |||
+ | In this guide it is assumed that | ||
+ | |||
+ | * your server operating system is Linux | ||
+ | * you have an additional http-server running which will be used for serving the MARC-files (e.g. nginx) | ||
+ | * you have direct access to the MARC-files serving http-server' | ||
+ | * your MARC-records reside in one single .mrc file | ||
+ | * your unique identifier for the records (and for the MARC-files) consists of 10 digits, which are sliced into chunks of two digits that form the folder-structure; if your unique identifier differs from ours, you will need to adjust the folder-structure and slicing logic accordingly | ||
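The slicing of a 10-digit identifier into two-digit folder names can be sketched as follows. This is only an illustration: the function name `marc_id_to_path` and the sample identifier are hypothetical and not part of VuFind; the sed expressions follow the same idea as the proof-of-concept script in step 3 (insert a slash after every two characters, then turn the trailing slash into the extension .mrc).

```bash
#!/bin/bash
# Hypothetical helper (not part of VuFind): map a record ID to its relative
# file path by inserting a slash after every two characters.
marc_id_to_path() {
    local id="$1"
    # "0123456789" -> "01/23/45/67/89/" -> "01/23/45/67/89.mrc"
    printf '%s\n' "$id" | sed -e 's/\(..\)/\1\//g' -e 's/\/$/.mrc/'
}

# Example call with a made-up identifier
marc_id_to_path "0123456789"   # prints 01/23/45/67/89.mrc
```

If your identifiers have a different length or contain non-digit characters, the sed expressions would need to be adapted.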
+ | |||
+ | ===== Setup-Guide ===== | ||
+ | |||
+ | ==== 1. Set up the remote service providing MARC-files ==== | ||
+ | |||
+ | We use nginx for our http-server, | ||
+ | |||
+ | < | ||
+ | user nginx; | ||
+ | worker_processes | ||
+ | location / { | ||
+ | rewrite " | ||
+ | root / | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | The corresponding files will be placed in / | ||
+ | |||
+ | ==== 2. Modify marc_local.properties ==== | ||
+ | |||
+ | Set in your marc_local.properties: | ||
+ | |||
+ | <code properties> | ||
+ | fullrecord = "" | ||
+ | record_format = " | ||
+ | </ | ||
+ | |||
+ | :!: In VuFind 5.x and earlier, use " | ||
+ | |||
+ | This will prevent import-marc.sh from loading the MARC-Record into the Solr-field fullrecord and mark the Solr-records as the type " | ||
+ | |||
+ | ==== 3. Populate remote service with MARC-files during indexing ==== | ||
+ | |||
+ | The following shell-script (Linux bash) is a proof-of-concept that needs to be adapted to your needs (e.g. your unique identifier/ | ||
+ | |||
+ | <code bash> | ||
+ | #!/bin/bash | ||
+ | tmpfix="/ | ||
+ | yaz-marcdump -s " | ||
+ | for source in $(ls -1 " | ||
+ | do | ||
+ | # extract MARC 001 (pos 4+), insert slash after every other char (..), replace trailing slash by extension .mrc | ||
+ | target=$(yaz-marcdump " | ||
+ | # create target directory path (up to last slash) | ||
+ | mkdir -p $(echo " | ||
+ | # rename/move marc file | ||
+ | mv " | ||
+ | done | ||
+ | </ | ||
+ | |||
+ | This script | ||
+ | |||
+ | * uses yaz-marcdump to extract your MARC-records from the single MARC-file containing several records | ||
+ | * uses sed, mkdir and mv to create the folder-structure | ||
+ | * does not check whether the id found in MARC-001 is appropriate for the folder-structure | ||
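Because the proof-of-concept does not check the id, a guard could be added before a file is moved. The following is a minimal sketch under the assumption that valid ids consist of exactly 10 digits, as stated in the prerequisites. The function names `is_valid_marc_id` and `store_record` and the parameter `root` are hypothetical and not part of the original script or of VuFind.

```bash
#!/bin/bash
# Hypothetical guard (not part of the original script): accept only ids made
# of exactly 10 digits, as assumed in the prerequisites.
is_valid_marc_id() {
    [[ "$1" =~ ^[0-9]{10}$ ]]
}

# Hypothetical wrapper: move one single-record file into the folder-structure
# below $root, or skip it with a warning if the id does not fit.
store_record() {
    local source="$1" id="$2" root="$3"
    if ! is_valid_marc_id "$id"; then
        echo "skipping $source: unexpected id '$id'" >&2
        return 1
    fi
    local target
    # insert a slash after every two characters, replace trailing slash by .mrc
    target="$root/$(printf '%s\n' "$id" | sed -e 's/\(..\)/\1\//g' -e 's/\/$/.mrc/')"
    # create the target directory path, then move the file into place
    mkdir -p "$(dirname "$target")"
    mv "$source" "$target"
}
```

A call such as `store_record "$source" "$id" "$root"` would then replace the plain mkdir/mv pair inside the loop, leaving files with unexpected ids untouched so they can be inspected later.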
+ | |||
+ | ==== 4. Configure RecordDriver SolrMarcRemote ==== | ||
+ | |||
+ | Turn on the appropriate setting in the [Record] section of [[configuration: | ||
+ | |||
+ | <code ini> | ||
+ | remote_marc_url = http:// | ||
+ | </ | ||
+ | |||
+ | ===== Conclusion ===== | ||
+ | |||
+ | This setup should have reduced the size of your Solr index and populated your additional http-server with the raw MARC-files. VuFind should load all records without problems, pulling the raw MARC-files from the additional http-server when they are needed. | ||
+ | If you have questions regarding this setup, please feel free to contact us: < | ||
+ | |||
configuration/remote_marc_records.txt · Last modified: 2023/11/09 19:13 by demiankatz