Warning: This page has not been updated in over a year and may be outdated or deprecated.

SolrMarc

SolrMarc is used to import MARC bibliographic and authority metadata into the Solr instance used by VuFind®. See Solr Index Schema and Authority Control for notes on the layout of the index.

SolrMarc Versions

VuFind®'s changelog can help you determine which version of SolrMarc you are currently running.

Note: Prior to VuFind® 3.1, SolrMarc 2.x was used. With the introduction of SolrMarc 3.x, many performance benefits and enhanced features became available.

Upgrading SolrMarc

Manually upgrading SolrMarc to a later 3.x version is simply a matter of updating import/solrmarc-core.jar and any necessary dependencies found in import/lib.

Customizing Import Mappings

The import process is controlled by the settings in the import/marc.properties file under your VuFind® installation directory. The default settings should be fine for a first-time user, but if you want to change or expand the set of MARC fields that are used to build VuFind®'s search indexes, you can edit this file to make adjustments. See the SolrMarc documentation for details on how this works.

While editing or overriding marc.properties is possible, a preferred technique is to instead add custom settings to import/marc_local.properties. Any lines added to this file will override the equivalent settings in marc.properties. Use of this file is optional, but it is an easy way to separate your local customizations from the default settings packaged with VuFind®. For more details, see local MARC mappings.
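The override behavior described above can be pictured with plain java.util.Properties. This is only a minimal sketch of the layering idea, not SolrMarc's actual loading code; the class name LocalOverrideSketch and the sample keys and field specifications are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical illustration: layers "local" settings over "default" settings.
// It is not how SolrMarc itself reads marc.properties / marc_local.properties.
class LocalOverrideSketch {
    // Any key present in localText wins; keys absent from it fall back
    // to the value found in defaultsText.
    static Properties layer(String defaultsText, String localText) {
        try {
            Properties defaults = new Properties();
            defaults.load(new StringReader(defaultsText));
            Properties merged = new Properties(defaults);
            merged.load(new StringReader(localText));
            return merged;
        } catch (IOException e) {
            // StringReader should not fail, but load() declares IOException.
            throw new UncheckedIOException(e);
        }
    }
}
```

With illustrative defaults of "title = 245ab" and "language = 008[35-37]" and a local file containing only "title = 245abc", the merged result uses the local title setting and keeps the default language setting.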

You can use dynamic_field_suffixes as part of your custom field names in your marc_local.properties file. This will enable you to add fields to the marc_local.properties file without having to modify schema.xml or restart Solr. To do this, name the custom field with the appropriate suffix for the data type you need. Otherwise, be sure to update schema.xml to define the custom fields, and also restart Solr.

Customizing Translation Maps

One of the features of SolrMarc is the ability to translate values found in MARC into different strings using translation map files (e.g. language_map.properties). See the SolrMarc documentation for details on how to specify a translation map in the marc.properties file.

Translation maps are found in the import/translation_maps directory and can be overridden in the local settings directory.
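Conceptually, a translation map is a key/value lookup from a raw MARC value to a display string. The sketch below shows that idea with java.util.Properties; the class name TranslationMapSketch and the sample entries are hypothetical, and passing unmapped values through unchanged is a choice made for this sketch (SolrMarc's own handling of unmapped values may differ).

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical illustration of a translation-map lookup.
class TranslationMapSketch {
    // mapText holds lines such as "eng = English"; rawValue is the value
    // extracted from the MARC record.
    static String translate(String mapText, String rawValue) {
        Properties map = new Properties();
        try {
            map.load(new StringReader(mapText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Assumption for this sketch: unmapped values are returned as-is.
        return map.getProperty(rawValue, rawValue);
    }
}
```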

Customizing Format Determination

One of the most commonly requested VuFind® customizations involves changing the way record formats are assigned. The method for this depends on your VuFind® version.

VuFind® 6.0 and newer

VuFind® 6.0 replaces the getFormat method with getFormats (to return multiple values when appropriate). The advice below for VuFind® 4.0 and newer remains relevant; just be sure to override the appropriate method.

VuFind® 4.0 and newer

More recent versions of VuFind® determine formats using a getFormat function found in FormatCalculator.java. You can override this file in your local settings directory to adjust the behavior.

  1. Copy import/index_java/src/org/vufind/index/FormatCalculator.java into the import/index_java/src/edu/myuniversity/index subdirectory of your local settings directory (replacing edu/myuniversity with a path appropriate to your institution's domain), then edit it to customize the behavior as needed. Be sure to adjust the package declaration at the top of the file to match the directory path you created, so that your local indexing class can be distinguished from the core one. MARC access is accomplished with the MARC4J library.
  2. If your new custom script returns different values than the old script, or if you want to change the way the existing values are mapped into your index, edit a local copy of the translation map found in import/translation_maps/format_map.properties.
  3. If SolrMarc has difficulty finding your custom code, you can edit a local copy of import/marc_local.properties to ensure that VuFind® loads the appropriate class. This could look like:
format = custom(edu.myuniversity.index.FormatCalculator), getFormat, format_map.properties
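To give a feel for the kind of logic a customized format method might contain, here is a minimal, standalone sketch that derives format strings from the MARC leader. It is not the code shipped in FormatCalculator.java: the class name LeaderFormatSketch, the method formatsFromLeader, and the returned labels are all hypothetical, and it takes the leader as a plain String so that it compiles without MARC4J. In a real override you would work with the MARC4J record object passed to getFormat or getFormats instead.

```java
import java.util.ArrayList;
import java.util.List;

// In a local override, the file would begin with a package declaration that
// matches its directory, e.g. "package edu.myuniversity.index;".
// Hypothetical sketch, not VuFind's actual format logic.
class LeaderFormatSketch {
    // Returns a list (in the spirit of getFormats) based on leader
    // position 6 (type of record) and position 7 (bibliographic level).
    static List<String> formatsFromLeader(String leader) {
        List<String> formats = new ArrayList<>();
        if (leader == null || leader.length() < 8) {
            formats.add("Unknown");
            return formats;
        }
        char recordType = leader.charAt(6);
        char bibLevel = leader.charAt(7);
        if (recordType == 'e' || recordType == 'f') {
            formats.add("Map");      // cartographic material
        } else if (recordType == 'a' && bibLevel == 's') {
            formats.add("Serial");   // language material, serial
        } else if (recordType == 'a' && bibLevel == 'm') {
            formats.add("Book");     // language material, monograph
        } else {
            formats.add("Unknown");
        }
        return formats;
    }
}
```

Whatever strings such a method returns are then passed through format_map.properties, so any new label introduced in custom code also needs an entry in your local copy of that map.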

VuFind® 3.x and earlier

By default, formats are generated in older versions of VuFind® using the getFormat function built into SolrMarc. However, the logic used by getFormat is also replicated in a BeanShell script packaged with VuFind®. If you want to customize the behavior, here are the steps to follow:

  1. Copy import/index_scripts/format.bsh into the import/index_scripts subdirectory of your local settings directory and edit it to customize the behavior as needed. BeanShell borrows its syntax and libraries from Java, so the code should look familiar to many developers. The MARC access is accomplished with the MARC4J library.
  2. If your new custom script returns different values than the old script, or if you want to change the way the existing values are mapped into your index, edit a local copy of the translation map found in import/translation_maps/format_map.properties.
  3. Uncomment the format line (format = script(format.bsh), getFormat, format_map.properties) in a local copy of import/marc_local.properties to ensure that VuFind® imports using the custom BeanShell version of getFormat instead of the built-in SolrMarc version.

Custom Indexing Functions

Dynamically-Compiled Java Code

Note: This option is only available in SolrMarc 3.0 and later, but it is usually the preferred option when available. The other three options below are more relevant to SolrMarc 2.x and earlier, and the documentation links they provide may be outdated.

By putting custom Java code in the import/index_java/src subdirectory (either under $VUFIND_HOME or in your local settings directory), you can define custom methods that will be automatically compiled at runtime. This offers the benefits of BeanShell scripting with none of the disadvantages. See the documentation for more details.

See Custom Java Best Practices for some suggestions.

Compiled Custom Functions

Sometimes, it is necessary to perform special data manipulation beyond the capabilities of the built-in SolrMarc functions. See this page for details on writing custom Java code to extend the capabilities of SolrMarc.

BeanShell Scripts

Starting with VuFind® 1.0RC2, it is also possible to write custom indexing functions as BeanShell scripts. This allows you to extend SolrMarc without having to rebuild the entire Java package. You can simply add scripts to the import/scripts directory and call them from your marc.properties file; here's an example to show the syntax:

format = script(format.bsh), getFormat, format_map.properties

More details can be found here.

SolrMarc Mixins

Starting with VuFind® 1.4, you can also write compiled Java mixin objects to define custom functionality. These should offer better performance than BeanShell, though they are slightly more complex to implement. See the README in the mixin development kit at the SolrMarc download page for more details.

Customizing Record IDs

It is sometimes useful to add a prefix to a record ID (for example, if you are importing numeric IDs from multiple systems and want to prevent collisions). This can be achieved through a regular expression trick in import/marc_local.properties:

id = 001, (pattern_map.id_prefix), first
pattern_map.id_prefix.pattern_0 = (.+)=>bib_$1

Replace “bib_” in the second line with the prefix you want.
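The effect of the pattern (.+) with the replacement bib_$1 can be checked outside SolrMarc using Java's standard regular expression classes. This is a sketch of the regex behavior only; the class name IdPrefixSketch and the method applyPatternMap are hypothetical, and SolrMarc's own pattern map processing may differ in detail.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of a "pattern=>replacement" rule.
class IdPrefixSketch {
    // Applies the replacement if the pattern is found; otherwise returns
    // the value unchanged (an assumption made for this sketch).
    static String applyPatternMap(String pattern, String replacement, String value) {
        Matcher m = Pattern.compile(pattern).matcher(value);
        return m.find() ? m.replaceAll(replacement) : value;
    }
}
```

For example, applyPatternMap("(.+)", "bib_$1", "12345") yields "bib_12345": the group captures the whole ID and $1 re-inserts it after the prefix.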

Pre-Processing Records

It may occasionally be useful to manipulate records after you export them from your ILS but before you load them into SolrMarc. See the Code4Lib Working with MARC page for some tools to help you with MARC manipulation.

Troubleshooting

See the troubleshooting page for more details.

indexing/solrmarc.txt · Last modified: 2024/02/23 11:34 by demiankatz