Indexing a Website
Starting with release 2.1, VuFind can be used to create a website index separate from your main search index. Results from this index can then be used on their own or merged with catalog results using the combined search tools.
- Make sure that you have a full text extraction tool installed and configured.
- Enable the website core by editing solr/solr.xml and uncommenting the appropriate line.
- Copy config/vufind/webcrawl.ini into the config/vufind subdirectory of your local settings directory and edit the file to specify where your website's XML sitemap lives.
- Run the import/webcrawl.php tool to load your website's data into the index (this may take a long time).
- When crawling is done, go to http://vufind_server/vufind/Web/Results and enter a search in the box there to query the website index.
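Because the crawl is driven by the XML sitemap named in webcrawl.ini, it can help to preview which pages that sitemap lists before starting a long import. The following is a minimal sketch, not part of VuFind: it assumes the sitemap follows the standard sitemaps.org urlset format, and the function name list_sitemap_urls and the library.example.edu addresses are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace defined by the sitemaps.org protocol for <urlset> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_sitemap_urls(sitemap_xml: str) -> List[str]:
    """Return the page URLs (<loc> values) listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sample standing in for your website's real sitemap.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://library.example.edu/</loc></url>
  <url><loc>http://library.example.edu/hours</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in list_sitemap_urls(SAMPLE):
        print(url)
```

An empty result usually means the file is not a plain urlset (for example, it may be a sitemap index that points at other sitemaps), which this sketch does not handle.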
Customizing the Web Search
Several things can be modified, using your local settings directory, to adjust the behavior and appearance of the web search.
- You can customize the way web pages are indexed by creating a custom version of import/xsl/sitemap.xsl and/or import/sitemap.properties.
- You can customize search behavior and options through config/vufind/website.ini and config/vufind/websearchspecs.yaml.
- You can customize display behavior through the VuFind\RecordDriver\SolrWeb record driver and corresponding templates.
Note: The current webcrawl.php tool works largely by brute force; a more intelligent, flexible crawler may be built at some point in the future.
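The customizations above all start the same way: copy a stock file into the matching location under your local settings directory, then edit the copy. As a rough sketch of that pattern only, the hypothetical helper below (copy_to_local is not a VuFind tool) mirrors a file's relative path into the local directory and leaves an existing local copy untouched. It assumes the VuFind installation directory and local settings directory are passed in by the caller, for example from the VUFIND_HOME and VUFIND_LOCAL_DIR environment variables.

```python
import shutil
from pathlib import Path


def copy_to_local(vufind_home, local_dir, relative_path):
    """Copy a stock file into the local settings directory.

    The relative path is preserved, so config/vufind/website.ini lands in
    <local_dir>/config/vufind/website.ini. An existing local copy is kept
    as-is so that earlier edits are not overwritten.
    """
    source = Path(vufind_home) / relative_path
    target = Path(local_dir) / relative_path
    if target.exists():
        return target  # keep existing local customizations
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return target


# Example call (paths are assumptions about your installation):
# import os
# copy_to_local(os.environ["VUFIND_HOME"], os.environ["VUFIND_LOCAL_DIR"],
#               "config/vufind/website.ini")
```

The same call could be repeated for config/vufind/websearchspecs.yaml, import/sitemap.properties, or import/xsl/sitemap.xsl before editing them.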