====== Indexing a Website ======

Starting with release 2.1, VuFind can be used to create a website index separate from your main search index. Results from this index can then be used on their own or merged with catalog results using the [[configuration:combining_search_types|combined search]] tools.

===== Getting Started =====

  - Make sure that you have a [[full_text_tools|full text extraction tool]] installed and configured.
  - Copy config/vufind/webcrawl.ini into the config/vufind subdirectory of your [[configuration:local_settings_directory|local settings directory]] and edit the file to specify where your website's XML sitemap lives.
  - Run the import/webcrawl.php tool to load your website's data into the index (this may take a long time).
  - When crawling is done, go to http://vufind_server/vufind/Web/Results -- you can enter a search in the box here.

(//In very old versions of VuFind -- earlier than release 3.0 -- you will need to enable the website core by editing solr/solr.xml and uncommenting the appropriate line, then [[administration:starting_and_stopping_solr#restarting_solr_manually|restart Solr]], before running the webcrawl.php tool//).

===== Customizing the Web Search =====

Several things can be modified (with the help of your [[configuration:local_settings_directory|local settings directory]]) to adjust web search behavior and appearance.

  * You can customize the way web pages are indexed by creating a custom version of import/xsl/sitemap.xsl and/or import/sitemap.properties.
  * You can customize search behavior and options through config/vufind/website.ini and config/vufind/websearchspecs.yaml.
  * You can customize display behavior through the VuFind\RecordDriver\SolrWeb record driver and corresponding templates.

===== Notes =====

  * The current webcrawl.php tool works very much by brute force; we may want to build a more intelligent, flexible crawler at some point in the future.
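
Because the crawl is driven by your XML sitemap and can take a long time, it may help to check which pages the sitemap lists before running webcrawl.php. The following is a minimal Python sketch, not part of VuFind and not a description of how webcrawl.php works internally. It assumes a sitemap in the standard sitemaps.org format; the function name ''list_sitemap_pages'' and the sample URLs are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample data; replace with the contents of your own sitemap.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://library.example.edu/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>http://library.example.edu/hours</loc>
  </url>
</urlset>"""


def list_sitemap_pages(xml_text):
    """Return a list of (loc, lastmod) tuples for each <url> entry.

    lastmod is None when the sitemap entry does not provide it.
    """
    # Parse from bytes so the encoding declaration in the XML is honored.
    root = ET.fromstring(xml_text.encode("utf-8"))
    pages = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default=None, namespaces=SITEMAP_NS)
        if loc is None:
            continue  # skip malformed entries that have no location
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        pages.append((loc.strip(), lastmod.strip() if lastmod else None))
    return pages


if __name__ == "__main__":
    for loc, lastmod in list_sitemap_pages(SAMPLE_SITEMAP):
        print(loc, lastmod or "(no lastmod)")
```

If the list is empty or contains pages you do not want in the index, it is probably easier to correct the sitemap first than to re-run a long crawl.
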
===== Related Video =====

You can learn more about web indexing through the [[videos:sitemaps_and_web_indexing|Sitemaps and Web Indexing]] video.