Warning: This page has not been updated in over a year and may be outdated or deprecated.
indexing:deduplication

Differences

This shows you the differences between two versions of the page.

indexing:deduplication [2023/03/20 16:31] – [RecordManager] demiankatz
indexing:deduplication [2023/03/20 16:33] (current) – [Search Process] demiankatz
Line 89:
 Records with merged_child_boolean=true are filtered out of the results during the initial Solr search. The preferred original record is then selected for each merge record found, and the merge record is replaced with that original record. Information on all records belonging to the dedup group is added to the original record in the "dedup_data" field so that it can be displayed to the user, e.g. as links to the other records. The preferred record is always first in "dedup_data".
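
The search process described above can be sketched roughly as follows. This is a minimal in-memory illustration in Python, not VuFind®'s actual implementation: only the "merged_child_boolean" and "dedup_data" field names come from this page. The "child_ids" field, the `<source>.<local id>` ID format, the `SOURCE_PRIORITY` list and all function names are hypothetical and exist only for the example.

```python
from typing import Dict, List

# Hypothetical source priority; earlier entries are preferred.
SOURCE_PRIORITY = ["main_library", "branch_library"]


def source_of(record_id: str) -> str:
    """Assumes IDs look like '<source>.<local id>' (an assumption for this sketch)."""
    return record_id.split(".", 1)[0]


def priority(record_id: str) -> int:
    """Rank a record by its source; unknown sources sort last."""
    src = source_of(record_id)
    if src in SOURCE_PRIORITY:
        return SOURCE_PRIORITY.index(src)
    return len(SOURCE_PRIORITY)


def initial_search(index: List[Dict]) -> List[Dict]:
    """Stand-in for the initial Solr search: drop records flagged as merged children."""
    return [doc for doc in index if not doc.get("merged_child_boolean", False)]


def resolve_merge_records(results: List[Dict], records_by_id: Dict[str, Dict]) -> List[Dict]:
    """Replace each merge record with its preferred original and attach dedup_data."""
    resolved = []
    for doc in results:
        child_ids = doc.get("child_ids")  # hypothetical field listing the group members
        if not child_ids:
            resolved.append(doc)  # ordinary record, nothing to replace
            continue
        ordered = sorted(child_ids, key=priority)  # preferred record comes first
        preferred = dict(records_by_id[ordered[0]])
        preferred["dedup_data"] = [
            {"id": rid, "source": source_of(rid)} for rid in ordered
        ]
        resolved.append(preferred)
    return resolved
```

With a toy index holding two flagged child records, one merge record and one unrelated record, `initial_search` would return only the merge record and the unrelated record, and `resolve_merge_records` would swap the merge record for the child from the highest-priority source.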
  
 +==== Architecture Note: Field Collapsing / Collapse/Expand ====
 +
 +Note that while Solr supports features such as "field collapsing" and "collapse/expand" that could be used to achieve similar deduplication behavior, the deduplication mechanism described here does not use them. This avoids the performance cost associated with those features and also allows broader search results: collapse/expand operates only on the records in a search result set, whereas VuFind®'s deduplication does not require every record in the group to match the search terms. It is enough that the merge record does. Whether this matters depends on how deduplication is set up, but at minimum it allows the "best" record to be presented in search results without re-merging anything.
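
The difference in matching behavior can be illustrated with a toy model. The sketch below is plain Python over an in-memory list and does not call Solr. The record IDs, the "group" and "text" keys, `PREFERRED_ORDER` and both function names are hypothetical, and it assumes that a merge record carries the combined searchable text of its group members.

```python
# Two records in the same dedup group, described with different words.
RECORDS = [
    {"id": "lib_a.1", "group": "g1", "text": "cats"},
    {"id": "lib_b.7", "group": "g1", "text": "felines"},
]

# Hypothetical preference order; earlier IDs are preferred.
PREFERRED_ORDER = ["lib_a.1", "lib_b.7"]


def matches(text, query):
    """Naive whole-word match standing in for a real search."""
    return query in text.split()


def collapse_style(records, query):
    """Group only the records that matched the query, one hit per group."""
    first_hit_per_group = {}
    for rec in records:
        if matches(rec["text"], query) and rec["group"] not in first_hit_per_group:
            first_hit_per_group[rec["group"]] = rec["id"]
    return list(first_hit_per_group.values())


def merge_record_style(records, query):
    """Match against the merged text of each group, then return the preferred member."""
    groups = {}
    for rec in records:
        groups.setdefault(rec["group"], []).append(rec)
    hits = []
    for members in groups.values():
        merged_text = " ".join(m["text"] for m in members)
        if matches(merged_text, query):
            ids = [m["id"] for m in members]
            hits.append(min(ids, key=PREFERRED_ORDER.index))
    return hits
```

For the query "felines", the collapse-style function can only return `lib_b.7`, because `lib_a.1` never enters the result set. The merge-record-style function returns the preferred `lib_a.1`, since it is enough that the merged text matches.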
 ===== Configuration =====
  
indexing/deduplication.1679329868.txt.gz · Last modified: 2023/03/20 16:31 by demiankatz