Because the MARC representation of formats is complex and can be very fine grained, we elected to begin with a list of formats and to devise algorithms for those formats. We also used “other” for everything that didn't get a format. We have been whittling down the number of resources classified as “other” by examining the resources with this value (yay facets!) and determining what changes we should make to our existing algorithms, or what additional format categories we want to add to our original list.
We have made a separate facet for access method with the following values:
This means we do not assign “online” as a format facet value.
Because a single bib record can describe multiple manifestations that may take separate forms, we consider format to be multivalued.
Note: Leader/06 refers to the 7th character of the Leader, as MARC21 uses 0-based character positions. Likewise, 008/21 is the 22nd character of the 008 field.
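As a small illustration of the position notation, here is how Leader/06 and Leader/07 map onto string indexes in Python. The leader value is a made-up example, not taken from one of our records.

```python
# A made-up 24-character MARC leader, used only for illustration.
leader = "00000nam a2200000 a 4500"

# MARC positions are 0-based, so Leader/06 is index 6 (the 7th character).
record_type = leader[6]   # Leader/06, type of record
bib_level = leader[7]     # Leader/07, bibliographic level

print(record_type, bib_level)  # a m
```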
| format value | algorithm |
|---|---|
| Audio - Non-Music | Leader/06 is “i” |
| Book | Leader/06 is “a” or “t” AND Leader/07 is “a” or “m” |
| Data | Leader/06 is “m” AND 008/26 is “a” |
| Image | Leader/06 is “k” AND 008/33 is “i”, “k”, “p”, “s” or “t” |
| Instructional Kit | Leader/06 is “o” |
| Journal | (Leader/07 is “s” OR 006/00 is “s”) AND 008/21 is “p” |
| Manuscript/Archive | Leader/06 is “b” or “p” |
| Map/Globe | Leader/06 is “e” or “f” |
| Microfilm | 245 subfield h contains “microform” |
| Music - Audio | Leader/06 is “j” |
| Music - Score | Leader/06 is “c” or “d” |
| Newspaper | Leader/06 is “a” AND Leader/07 is “s” AND 008/21 is “n” |
| Object | Leader/06 is “r” |
| Thesis | presence of 502 field |
| Video | Leader/06 is “g” AND 008/33 is “m” or “v” |
| Other | any record that does not have any of the above formats assigned |
These rules are implemented as a custom method in SolrMarc. The actual algorithm is a little more complex than the table suggests, because a single record can be assigned multiple formats.
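The SolrMarc method itself is not reproduced here. As a rough illustration of how the table's rules can accumulate more than one format per record, here is a minimal Python sketch. The function name `assign_formats`, its parameters, and the choice to pass fields in as plain strings are all assumptions made for this example; the case-insensitive match on 245 subfield h is also an assumption.

```python
def char_at(field, pos):
    """Return the character at a 0-based position, or '' if the field is too short."""
    return field[pos] if field and len(field) > pos else ""


def assign_formats(leader, f008="", f006_list=(), subfield_245h="", has_502=False):
    """Hypothetical helper: apply the table's rules and return every matching format."""
    ldr6 = char_at(leader, 6)    # Leader/06
    ldr7 = char_at(leader, 7)    # Leader/07
    c21 = char_at(f008, 21)      # 008/21
    c26 = char_at(f008, 26)      # 008/26
    c33 = char_at(f008, 33)      # 008/33
    is_serial = ldr7 == "s" or any(char_at(f, 0) == "s" for f in f006_list)

    # Each rule is tested independently, so a record can match several.
    rules = [
        ("Audio - Non-Music", ldr6 == "i"),
        ("Book", ldr6 in ("a", "t") and ldr7 in ("a", "m")),
        ("Data", ldr6 == "m" and c26 == "a"),
        ("Image", ldr6 == "k" and c33 in ("i", "k", "p", "s", "t")),
        ("Instructional Kit", ldr6 == "o"),
        ("Journal", is_serial and c21 == "p"),
        ("Manuscript/Archive", ldr6 in ("b", "p")),
        ("Map/Globe", ldr6 in ("e", "f")),
        # Assumption: match "microform" regardless of case.
        ("Microfilm", "microform" in subfield_245h.lower()),
        ("Music - Audio", ldr6 == "j"),
        ("Music - Score", ldr6 in ("c", "d")),
        ("Newspaper", ldr6 == "a" and ldr7 == "s" and c21 == "n"),
        ("Object", ldr6 == "r"),
        ("Thesis", has_502),
        ("Video", ldr6 == "g" and c33 in ("m", "v")),
    ]
    formats = [name for name, matched in rules if matched]

    # "Other" is only a fallback for records that matched nothing.
    return formats or ["Other"]


# A monograph with a 502 field gets two format values.
print(assign_formats("00000nam a2200000 a 4500", has_502=True))  # ['Book', 'Thesis']
```

Because the rules are evaluated independently rather than as an if/else chain, the result is naturally multivalued, and “Other” appears only when the list would otherwise be empty.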
We are still refining our format categories and algorithms as of this writing, 2008-08-11.