Warning: This page has not been updated in over a year and may be outdated or deprecated.
indexing:websites

Differences

This shows you the differences between two versions of the page.

indexing:websites [2015/12/14 17:08] – [Customizing the Web Search] demiankatz
indexing:websites [2021/08/03 13:49] (current) demiankatz
Line 1: Line 1:
 ====== Indexing a Website ======

-Starting with release 2.1, VuFind can be used to create a website index separate from your main search index.  Results from this index can then be used on their own or merged with catalog results using the [[vufind2:combining_search_types|combined search]] tools.
+Starting with release 2.1, VuFind can be used to create a website index separate from your main search index.  Results from this index can then be used on their own or merged with catalog results using the [[configuration:combining_search_types|combined search]] tools.

 ===== Getting Started =====

-  - Make sure that you have a [[..:aperture|full text extraction tool]] installed and configured
+  - Make sure that you have a [[full_text_tools|full text extraction tool]] installed and configured.
-  - Enable the website core by editing solr/solr.xml and uncommenting the appropriate line.
-  - [[..:starting_and_stopping_vufind#restarting_vufind_manually|Restart Solr]]
-  - Copy config/vufind/webcrawl.ini into the config/vufind subdirectory of your [[vufind2:local_settings_directory|local settings directory]] and edit the file to specify where your website's XML sitemap lives.
+  - Copy config/vufind/webcrawl.ini into the config/vufind subdirectory of your [[configuration:local_settings_directory|local settings directory]] and edit the file to specify where your website's XML sitemap lives.
   - Run the import/webcrawl.php tool to load your website's data into the index (this may take a long time).
   - When crawling is done, go to <nowiki>http://vufind_server/vufind/Web/Results</nowiki> -- you can enter a search in the box here.
 +
 +(//In very old versions of VuFind -- earlier than release 3.0 -- you will need to enable the website core by editing solr/solr.xml and uncommenting the appropriate line, then [[administration:starting_and_stopping_solr#restarting_solr_manually|restart Solr]], before running the webcrawl.php tool//).
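
The steps above depend on an XML sitemap listing the pages to be indexed. As an illustration only, the following minimal Python sketch shows the kind of URL list a sitemap provides to a crawler. It is not part of VuFind and does not reproduce how webcrawl.php works; the example sitemap content, the ''library.example.edu'' host, and the ''list_sitemap_urls'' helper are all hypothetical, and the only external fact assumed is the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content for illustration; a real sitemap would be
# published by your own website at the location you configure.
EXAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://library.example.edu/</loc></url>
  <url>
    <loc>http://library.example.edu/hours</loc>
    <lastmod>2021-08-03</lastmod>
  </url>
</urlset>
"""


def list_sitemap_urls(sitemap_xml: str) -> List[str]:
    """Return every <loc> value found under <url> entries, in document order."""
    root = ET.fromstring(sitemap_xml.strip())
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


if __name__ == "__main__":
    # Print each page address that a sitemap-driven crawl would visit.
    for url in list_sitemap_urls(EXAMPLE_SITEMAP):
        print(url)
```

Checking your own sitemap this way before running the import can help confirm that it lists the pages you expect to see in the web index.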
 +
 ===== Customizing the Web Search =====

-Several things can be modified (with the help of your [[vufind2:local_settings_directory|local settings directory]]) to adjust web search behavior and appearance.
+Several things can be modified (with the help of your [[configuration:local_settings_directory|local settings directory]]) to adjust web search behavior and appearance.

   * You can customize the way web pages are indexed by creating a custom version of import/xsl/sitemap.xsl and/or import/sitemap.properties.
Line 22: Line 23:
  
   * The current webcrawl.php tool works very much by brute force; we may want to build a more intelligent, flexible crawler at some point in the future.
 +
 +===== Related Video =====
 +
 +You can learn more about web indexing through the [[videos:sitemaps_and_web_indexing|Sitemaps and Web Indexing]] video.
 ---- struct data ----
 +properties.Page Owner : 
 ----
  
indexing/websites.1450112938.txt.gz · Last modified: 2015/12/14 17:08 by demiankatz