Adding Facets

These directions are based on the excellent documentation provided by the SolrMarc project. Below is a simplified version. For more detailed information about custom indexing and translation, see the SolrMarc wiki.

Adding a facet to the Narrow Search box is a relatively straightforward procedure. Because facets rely on an index field, these instructions assume we are beginning with one of three possible scenarios:

1. The index field to be used already exists

2. No index field currently exists; the data in the MARC record will be indexed directly (i.e. data is already a text string with no need or desire for normalization)

3. No index field currently exists; data is encoded and will need to be translated into text strings first

Index field exists

Example used: Adding the date of publication as a facet

First verify that the index field exists and make sure it contains the data you want. $VUFIND_HOME/import/marc.properties contains the default mapping of MARC fields to the Solr index. Each line of the file populates a single index field. The name of the Solr index field comes before the equal sign. Immediately following the equal sign are one or more combinations of the three-digit MARC tag number and any subfields that are indexed together. A colon separates different MARC fields or subfields to be indexed separately. A number in brackets indicates the byte position to be indexed. “First” at the end of the line means only the first value will be indexed.
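To make the line format concrete, here is a minimal Python sketch that splits one mapping line into the parts described above. It is not SolrMarc's parser: the function name parse_mapping_line is hypothetical, and the sketch only understands the plain "field = spec:spec, option" shape, not custom routines or other advanced syntax.

```python
import re

# Matches specs such as "655ab", "650v" or "048a[0-1]":
# a three-digit tag, optional subfield codes, optional byte positions.
SPEC_PATTERN = re.compile(r"(\d{3})([a-z0-9]*)(?:\[(\d+)(?:-(\d+))?\])?")


def parse_mapping_line(line):
    """Split one simplified marc.properties line into its parts (illustrative only)."""
    name, _, value = line.partition("=")
    # Anything after the first comma is treated as an option,
    # e.g. "first" or the name of a translation map file.
    parts = [p.strip() for p in value.split(",")]
    specs = []
    for spec in parts[0].split(":"):
        match = SPEC_PATTERN.fullmatch(spec.strip())
        if match is None:
            raise ValueError(f"unrecognized spec: {spec!r}")
        tag, subfields, start, end = match.groups()
        positions = None
        if start is not None:
            # A single number such as [6] means one byte position.
            positions = (int(start), int(end if end is not None else start))
        specs.append({"tag": tag, "subfields": subfields, "positions": positions})
    return {
        "field": name.strip(),
        "specs": specs,
        "first_only": "first" in (p.lower() for p in parts[1:]),
    }
```

Calling parse_mapping_line("allgenre = 655ab:650v:600v:651v") returns the field name allgenre and four separate specs, one per colon-separated entry.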

In our example, the publishDate index is a custom index not pulled directly from a MARC tag. The DateOfPublication custom indexing routine provides the date to the indexer, but we do not need to know how it does that to proceed. We have identified that the index in question is called publishDate, which is all we need right now.

The file facets.ini contains the lists of facets to be viewed. Each line is a facet, in the form:

 SolrIndexName = Facet Display Name

(In a clean installation, this includes the first two facets, Institution and Library, which may not be needed. They can be commented out by inserting a semicolon at the beginning of the line.) To add publication date as a facet, simply add the following line to the file:

 publishDate = Publication Year

Note that the facets are displayed in order; if publishDate is added at the bottom of the list, it will appear at the bottom of the Narrow Search box, below the rest.

That's it. Changes are immediately reflected on the search results page.

If a non-existent index field is added to the list (for example, because of a typo), this may break the Narrow Search box completely and prevent facets from displaying.
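One way to catch such a typo before it reaches users is to compare the facet names against the fields declared in schema.xml. The following Python sketch does that on in-memory text. The function names are hypothetical, the parsing is deliberately simplified, and it does not account for dynamic fields, so a facet that relies on a dynamic field suffix would be reported as missing.

```python
import xml.etree.ElementTree as ET


def schema_field_names(schema_xml):
    """Return the names of all explicitly declared <field> elements."""
    root = ET.fromstring(schema_xml)
    return {field.get("name") for field in root.iter("field")}


def missing_facet_fields(facet_lines, schema_xml):
    """List facet names from facets.ini-style lines that the schema does not declare."""
    known = schema_field_names(schema_xml)
    missing = []
    for line in facet_lines:
        line = line.strip()
        # Skip blanks, semicolon comments, section headers (if any), and non-settings.
        if not line or line.startswith(";") or line.startswith("[") or "=" not in line:
            continue
        name = line.split("=", 1)[0].strip()
        if name not in known:
            missing.append(name)
    return missing


# Illustrative data only; a real schema.xml declares many more fields.
SAMPLE_SCHEMA = """
<schema>
  <fields>
    <field name="publishDate" type="string" indexed="true" stored="true" multiValued="true"/>
  </fields>
</schema>
"""

SAMPLE_FACETS = [
    "; institution = Institution",
    "publishDate = Publication Year",
    "publishdate = Publication Year (typo)",
]
```

With the sample data, missing_facet_fields(SAMPLE_FACETS, SAMPLE_SCHEMA) reports only the misspelled publishdate, since field names are compared case-sensitively here.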

Index field does not exist, no translation needed

Example: a customized genre facet based on the genre heading fields and subfields

If the desired facet is not already an existing index field, we must first create the field. This will require re-indexing of all of your records.

The file $VUFIND_HOME/import/marc.properties maps MARC fields and subfields to an index. For a more detailed description of the file format than provided above, see the SolrMarc documentation.

In this example, we want to use the strings found in subfields a and b of the 655 field, and the subfield v data from the 650, 651, and 600 fields. The index will be called “allgenre”. To do this, we add the following line to marc.properties:

 allgenre = 655ab:650v:600v:651v

We now need to tell Solr what to do with the new index. The file $VUFIND_HOME/solr/vufind/biblio/conf/schema.xml defines the fields in Solr. In the <fields> section, we add

<field name="allgenre" type="textFacet" indexed="true" stored="true" multiValued="true" termVectors="true"/>

After the database has been re-indexed, the allgenre index field will exist. The facet can be added to the facets.ini file as described above.

A Useful Shortcut -- Dynamic Fields

If you are using VuFind 1.3 or newer, you can take advantage of dynamic Solr fields to avoid modifying your schema. VuFind is configured to recognize certain field suffixes and treat them as new fields without requiring explicit definitions. In the example above, you could use the field name allgenre_txtF_mv instead of allgenre in marc.properties and skip the schema.xml step.

See this page for all of the available suffixes.

A Best Practice -- Local Settings Directory

While the example above suggests editing marc.properties directly, VuFind 2.0 and later support a local settings directory which isolates your custom changes from the core VuFind code. If you put your customizations in $VUFIND_LOCAL_DIR/import/marc_local.properties instead of $VUFIND_HOME/import/marc.properties, then the lines in marc_local.properties will override the equivalent lines in marc.properties. This keeps your changes in one place and makes it easier to pick up changes and additions to the core configuration during future upgrades.

See local MARC mappings for more details.
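The override behavior can be pictured as a simple merge keyed on the index field name. The Python sketch below models only that idea, assuming one "name = value" pair per line; the function names and the sample mappings are hypothetical, and the real indexer's file handling is more involved than this.

```python
def read_mappings(text):
    """Read simplified 'name = value' lines into a dict, skipping blanks and # comments."""
    mappings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        mappings[name.strip()] = value.strip()
    return mappings


def effective_mappings(core_text, local_text):
    """Start from the core mappings, then let local lines replace or add entries."""
    merged = read_mappings(core_text)
    merged.update(read_mappings(local_text))
    return merged


# Illustrative content only, not the real default configuration.
CORE = """
# stand-in for marc.properties
allgenre = 655a
example_title = 245ab
"""

LOCAL = """
# stand-in for marc_local.properties
allgenre = 655ab:650v:600v:651v
"""
```

In this model, effective_mappings(CORE, LOCAL) keeps example_title from the core file and takes allgenre from the local file, which is why an upgrade that replaces the core file leaves the customization intact.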

Index field does not exist, translation needed

Example: Instrument types for music

For encoded data (such as data found in the 007, 008, or several 04X fields), we must first map the data to text strings. Luckily, the MARC format is well documented and lists of what each code means are readily available on the MARC Code Lists and at OCLC's Formats and Standards page.

Create a text file in $VUFIND_LOCAL_DIR/import/translation_maps and name it “something.properties”. In this case, I have created the file $VUFIND_LOCAL_DIR/import/translation_maps/instrument_map.properties to contain the mapping. The file will translate the two-letter codes used in the MARC 048 field into readable text. Each line of the file contains a single possible code and its translation. Example:

 ka = Piano
 kb = Organ
 kc = Harpsichord
 kd = Clavichord

Etc.

In $VUFIND_HOME/import/marc.properties (or the local MARC mappings file, if you want to separate your local changes from the defaults provided by VuFind), we add the following line:

 instrument_facet = 048a[0-1], instrument_map.properties

Of note: The numbers in brackets indicate that the system should look at only the first two bytes of subfield a of the 048 field (0-1 means position 0 through position 1). A comma separates the field information from the name of the file used to translate the data, in this case, instrument_map.properties.
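As a rough illustration of what this mapping line asks for, the Python sketch below takes the first two bytes of each 048 subfield a value and looks them up in a translation map. The function names are hypothetical, the sample subfield values are made up for the example, and dropping unmapped codes is an assumption of this sketch rather than a statement about how the indexer treats them.

```python
def load_translation_map(text):
    """Parse 'code = label' lines, as in the instrument_map.properties example."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        code, label = line.split("=", 1)
        mapping[code.strip()] = label.strip()
    return mapping


def translate_048a(values, translation_map):
    """Translate 048$a values using only byte positions 0-1 of each value."""
    labels = []
    for value in values:
        code = value[0:2]  # positions 0 through 1, i.e. the first two bytes
        label = translation_map.get(code)
        # Assumption: codes without a translation are skipped, duplicates collapsed.
        if label is not None and label not in labels:
            labels.append(label)
    return labels


INSTRUMENT_MAP_TEXT = """
ka = Piano
kb = Organ
kc = Harpsichord
kd = Clavichord
"""
```

For example, translate_048a(["ka01", "kc01", "zz02"], load_translation_map(INSTRUMENT_MAP_TEXT)) yields Piano and Harpsichord; the trailing digits are ignored because only the first two bytes are read.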

A line defining the new index field must be added to Solr's schema.xml file (usually found in $VUFIND_HOME/solr/vufind/biblio/conf) and a line for the new facet will be added to facets.ini (see above for instructions).

Troubleshooting

string vs. text fields

If you set up a facet field and see individual words instead of complete facet values, this most likely means that you have faceted on an analyzed field (usually of type “text” in VuFind's Solr schema). Solr faceting displays the terms stored in the index, not the original raw text provided at index-time. Thus, if you facet on an analyzed field that tokenizes and manipulates strings, strange facet values may appear to the end user. Most of the time, you only want to facet on simple string fields to avoid this problem. This is why the default schema includes some apparently duplicate values – it is generally necessary to use different fields for search-oriented and facet-oriented tasks.
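The effect is easy to reproduce outside Solr. In the Python sketch below, lowercasing and splitting on whitespace stand in for an analyzed “text” field; this is a loose approximation for illustration, not VuFind's actual analysis chain, and the function name facet_counts is hypothetical.

```python
from collections import Counter


def facet_counts(values, analyzed):
    """Count facet terms for a list of values, one value per record.

    analyzed=False mimics a string field: each whole value is one term.
    analyzed=True mimics a tokenized field: each word becomes its own term.
    """
    counts = Counter()
    for value in values:
        terms = value.lower().split() if analyzed else [value]
        # A term is counted once per record, even if it repeats within the value.
        counts.update(set(terms))
    return counts


GENRES = ["Science fiction", "Historical fiction"]
```

With analyzed=False the facet shows "Science fiction" and "Historical fiction" with one record each. With analyzed=True it shows the single words "fiction" (two records), "science", and "historical", which is the symptom described above.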

indexing/adding_facets.txt · Last modified: 2018/05/11 11:28 by demiankatz