Warning: This page has not been updated in over a year and may be outdated or deprecated.
indexing:websites

Differences

This shows you the differences between two versions of the page.

indexing:websites [2015/12/14 20:16] – ↷ Links adapted because of a move operation demiankatz
indexing:websites [2021/08/03 13:49] (current) demiankatz
Line 6: Line 6:
  
   - Make sure that you have a [[full_text_tools|full text extraction tool]] installed and configured.
-  - Enable the website core by editing solr/solr.xml and uncommenting the appropriate line. 
-  - [[administration:starting_and_stopping_solr#restarting_solr_manually|Restart Solr]]. 
   - Copy config/vufind/webcrawl.ini into the config/vufind subdirectory of your [[configuration:local_settings_directory|local settings directory]] and edit the file to specify where your website's XML sitemap lives.
   - Run the import/webcrawl.php tool to load your website's data into the index (this may take a long time).
   - When crawling is done, go to <nowiki>http://vufind_server/vufind/Web/Results</nowiki> -- you can enter a search in the box here.
 +
 +(//In very old versions of VuFind -- earlier than release 3.0 -- you will need to enable the website core by editing solr/solr.xml and uncommenting the appropriate line, then [[administration:starting_and_stopping_solr#restarting_solr_manually|restart Solr]], before running the webcrawl.php tool//).
 +
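Before running import/webcrawl.php, it can help to confirm that the sitemap named in webcrawl.ini actually lists the pages you expect to be indexed. The following is a minimal Python sketch for that check. It is not part of VuFind and does not reproduce how webcrawl.php works; it assumes a standard sitemaps.org ''urlset'' document, and the function name ''list_sitemap_urls'' and the example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_sitemap_urls(xml_text):
    """Return the page URLs found in <url><loc> entries of a sitemap."""
    root = ET.fromstring(xml_text)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        # Skip empty <loc> elements rather than failing on them.
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


# Hypothetical sitemap content, for illustration only.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://library.example.edu/</loc></url>
  <url><loc>http://library.example.edu/hours</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in list_sitemap_urls(EXAMPLE_SITEMAP):
        print(url)
```

In practice you would read the XML from your own sitemap file or URL instead of the embedded example. A sitemap index file (one that points to other sitemaps) uses different elements and is not handled by this sketch.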
 ===== Customizing the Web Search =====
  
Line 22: Line 23:
  
   * The current webcrawl.php tool works very much by brute force; we may want to build a more intelligent, flexible crawler at some point in the future.
 +
 +===== Related Video =====
 +
 +You can learn more about web indexing through the [[videos:sitemaps_and_web_indexing|Sitemaps and Web Indexing]] video.
 ---- struct data ----
 +properties.Page Owner : 
 ----
  
indexing/websites.1450124219.txt.gz · Last modified: 2015/12/14 20:16 by demiankatz