SolrMarc

SolrMarc imports MARC metadata into the “biblio” Solr index used by VuFind. See Solr Index Schema for notes on the layout of the index.

SolrMarc Versions

VuFind's changelog can help you determine which version of SolrMarc you are currently running.

Note: Prior to VuFind 3.1, SolrMarc 2.x was used. With the introduction of SolrMarc 3.x, many performance benefits and enhanced features became available.

Users are strongly encouraged to upgrade to VuFind 3.1 or later in order to take advantage of the improved indexing tool. Manually installing the new SolrMarc in an earlier version of VuFind should also be possible with relatively minimal work by replacing the necessary .jar files and adapting the revised indexing scripts found in later releases.

Manually upgrading SolrMarc to a later 3.x version is simply a matter of updating import/solrmarc-core.jar and any necessary dependencies found in import/lib.

Customizing Import Mappings

The import process is controlled by the settings in the import/marc.properties file under your VuFind installation directory. The default settings should be fine for a first-time user, but if you want to change or expand the set of MARC fields that are used to build VuFind's search indexes, you can edit this file to make adjustments. See the SolrMarc documentation for details on how this works.

As of VuFind 1.0RC2, a second properties file is also available called import/marc_local.properties. Any lines added to this file will override the equivalent settings in marc.properties. Use of this file is optional, but it is an easy way to separate your local customizations from the default settings packaged with VuFind. For more details, see local MARC mappings.
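
As a minimal sketch of what an override looks like, the fragment below redefines one field in marc_local.properties. The field name and MARC specification are illustrative assumptions, not VuFind defaults; substitute the setting you actually want to change.

```properties
# import/marc_local.properties
# A line here replaces the line with the same field name in marc.properties.
# "publisher" and the tags below are example values only.
publisher = 260b:264b
```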

You can use dynamic field suffixes in custom field names in your marc_local.properties file. Naming a custom field with the suffix that matches the data type you need lets you add it without modifying schema.xml or restarting Solr. If no suitable suffix exists, define the custom field in schema.xml and restart Solr.
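
For instance, assuming your schema.xml defines a dynamic field pattern such as *_str_mv (a multi-valued string; check your own schema for the suffixes it actually provides), a local note could be indexed without any schema change. The field name and MARC tag here are hypothetical.

```properties
# import/marc_local.properties
# "local_note" is a made-up name; the "_str_mv" suffix is assumed to match
# a dynamicField pattern in your schema.xml.
local_note_str_mv = 590a
```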

Customizing Translation Maps

One of the features of SolrMarc is the ability to translate values found in MARC into different strings using translation map files (e.g. language_map.properties). See the SolrMarc documentation for details on how to specify a translation map in the marc.properties file.
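
To illustrate, a translation map is a properties file of "raw value = indexed value" pairs, and a field definition names the map as its final parameter. The entries and the MARC specification below are an assumed example rather than a copy of the files shipped with VuFind.

```properties
# language_map.properties
# MARC language code on the left, string written to the index on the right
eng = English
fre = French
ger = German

# marc.properties (or marc_local.properties)
# Extract the language codes, then pass them through the map above
language = 008[35-37]:041a, language_map.properties
```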

Starting with VuFind 1.0RC2, translation maps are found in the import/translation_maps directory.

Starting with VuFind 2.x, translation maps can be overridden in the local settings directory.

Prior to 1.0RC2, translation maps were embedded in the SolrMarc .jar file, making them more difficult (but certainly not impossible) to modify.

Customizing Format Determination

One of the most commonly-requested VuFind customizations involves changing the way record formats are assigned.

By default, formats are generated using the getFormat function built into SolrMarc. However, the logic used by getFormat is also replicated in a BeanShell script packaged with VuFind. If you want to customize the behavior, here are the steps to follow:

  1. Copy import/index_scripts/format.bsh into the import/index_scripts subdirectory of your local settings directory and edit it to customize the behavior as needed. BeanShell borrows its syntax and libraries from Java, so the code should look familiar to many developers. MARC records are accessed through the MARC4J library.
  2. If your new custom script returns different values than the old script, or if you want to change the way the existing values are mapped into your index, edit a local copy of the translation map found in import/translation_maps/format_map.properties.
  3. Uncomment the format line (format = script(format.bsh), getFormat, format_map.properties) in a local copy of import/marc_local.properties to ensure that VuFind imports using the custom BeanShell version of getFormat instead of the built-in SolrMarc version.

Custom Indexing Functions

Dynamically-Compiled Java Code

Note: This option is only available in SolrMarc 3.0 and later, but it is usually the preferred option when available. The other three options below are more relevant to SolrMarc 2.x and earlier, and the documentation links they provide may be outdated.

By putting custom Java code in the import/index_java/src subdirectory (either under $VUFIND_HOME or in your local settings directory), you can define custom methods that will be automatically compiled at runtime. This offers the benefits of BeanShell scripting with none of the disadvantages. See the documentation for more details.

See Custom Java Best Practices for some suggestions.

Compiled Custom Functions

Sometimes, it is necessary to perform special data manipulation beyond the capabilities of the built-in SolrMarc functions. See this page for details on writing custom Java code to extend the capabilities of SolrMarc.

BeanShell Scripts

Starting with VuFind 1.0RC2, it is also possible to write custom indexing functions as BeanShell scripts. This allows you to extend SolrMarc without having to rebuild the entire Java package. You can simply add scripts to the import/scripts directory (named import/index_scripts in the format customization steps above; check which one your release uses) and call them from your marc.properties file. Here is an example to show the syntax:

format = script(format.bsh), getFormat, format_map.properties

More details can be found here.

SolrMarc Mixins

Starting with VuFind 1.4, you can also write compiled Java mixin objects to define custom functionality. These should offer better performance than BeanShell, though they are slightly more complex to implement. See the README in the mixin development kit at the SolrMarc download page for more details.

Customizing Record IDs

It is sometimes useful to add a prefix to a record ID (for example, if you are importing numeric IDs from multiple systems and want to prevent collisions). This can be achieved through a regular expression trick in import/marc_local.properties:

id = 001, (pattern_map.id_prefix), first
pattern_map.id_prefix.pattern_0 = (.+)=>bib_$1

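Replace “bib_” in the second line with the prefix you want. The pattern (.+) captures the entire 001 value and $1 inserts it after the prefix, so an ID of 12345 becomes bib_12345.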

Pre-Processing Records

It may occasionally be useful to manipulate records after you export them from your ILS but before you load them into SolrMarc. See the Code4Lib Working with MARC page for some tools to help you with MARC manipulation.

indexing/solrmarc.txt · Last modified: 2017/01/23 16:24 by demiankatz