Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1
    • Component/s: Search
    • Labels:
      None

      Description

      The attached patch adds a new Solr core to VuFind which can be used for indexing your local website with the help of Aperture.

      How to Use:

      1.) Make sure you have Aperture installed and configured -- see http://vufind.org/wiki/aperture

      2.) Edit web/conf/webcrawl.ini and list the sitemap.xml files you wish to harvest. Currently, this module requires sitemap files in order to harvest content.

      3.) Customize import/xsl/VuFindSitemap.php if you want to create special rules for generating facet values based on URLs or metadata within pages.

      4.) Run import/webcrawl.php. This may take quite some time.

      5.) When crawling is done, go to http://vufind_server/vufind/Web/Results -- you can enter a search in the box here.
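
      As a rough illustration of steps 2 and 3, the sketch below shows how the page URLs can be read out of a sitemap.xml file and how a URL-based rule might assign a facet value. It is not the code in the patch: the function names, the rule format, and the example.edu URLs are all hypothetical, and it assumes the sitemap uses the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap.xml files (sitemaps.org protocol).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text):
    """Return the <loc> values from a sitemap document, in document order."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def facet_for_url(url, rules, default="General"):
    """Hypothetical facet rule: the first matching URL fragment picks the label."""
    for fragment, label in rules:
        if fragment in url:
            return label
    return default


# Made-up sample data for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.edu/library/hours.html</loc></url>
  <url><loc>http://example.edu/archives/about.html</loc></url>
</urlset>"""

RULES = [("/archives/", "Archives"), ("/library/", "Library")]

if __name__ == "__main__":
    for url in sitemap_urls(SAMPLE):
        print(url, "->", facet_for_url(url, RULES))
```

      In the actual module, rules of this kind belong in import/xsl/VuFindSitemap.php; the Python version only shows the shape of the logic.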

      General notes:

      1.) This patch was generated against the trunk, r4284. It should be compatible with VuFind 1.1 or 1.2 with minor modifications -- most notably the need to add a protected update() wrapper method to web/sys/Solr.php (to allow access to the private _update from child classes).

      2.) Crawling is currently done by brute force -- we pull down all pages and index them, then delete pages that were indexed prior to the start of the process (to eliminate obsolete/missing pages). This works fine for small or medium websites, but it probably won't scale indefinitely -- we may eventually want to build a more intelligent crawler tool.

      3.) The search form at the Web/Results URL is obviously very crude -- you are expected to customize this page to make it more useful. You might instead wish to create some custom code that integrates the web search option into VuFind's main search box (this has been done at Villanova -- see https://library.villanova.edu/Find for an example).
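
      The brute-force strategy described in note 2 can be sketched as follows. This is a minimal model and not the patch's implementation: the InMemoryIndex class stands in for the Solr core, the counter-based clock replaces real timestamps so the example is deterministic, and every name here is hypothetical. A real setup would presumably perform the cleanup with a delete query on a timestamp field, which is an assumption on my part.

```python
import itertools


class InMemoryIndex:
    """Stand-in for the Solr core: maps URL -> content plus index time."""

    def __init__(self):
        self.docs = {}

    def save(self, url, content, indexed_at):
        # Re-indexing a URL overwrites the old document and its timestamp.
        self.docs[url] = {"content": content, "indexed_at": indexed_at}

    def delete_older_than(self, cutoff):
        # Remove every document last indexed before the cutoff.
        stale = [u for u, d in self.docs.items() if d["indexed_at"] < cutoff]
        for url in stale:
            del self.docs[url]
        return stale


def crawl(index, pages, clock):
    """Index every page, then purge anything not touched during this run."""
    start = clock()
    for url, content in pages.items():
        index.save(url, content, clock())
    return index.delete_older_than(start)


if __name__ == "__main__":
    ticks = itertools.count()
    clock = lambda: next(ticks)  # deterministic replacement for a real clock
    index = InMemoryIndex()
    crawl(index, {"/a": "first", "/b": "second"}, clock)
    # "/b" has disappeared from the site by the second crawl.
    print(crawl(index, {"/a": "first, revised"}, clock))
```

      Because every page is fetched and re-indexed on each run, the cost grows with the size of the site, which is the scaling concern raised in note 2.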

      Attachments

      1. blueprint-searchbox.patch (2 kB, Demian Katz)
      2. result.tpl (0.8 kB, Demian Katz)
      3. TRUNK_tika_xslt_25-09-12.patch (14 kB, Ronan McHugh)
      4. WebResults.php (4 kB, Nathan Tallman)
      5. WebResults.tpl (0.7 kB, Nathan Tallman)
      6. websearch.patch (79 kB, Demian Katz)
      7. websearch-solr34.patch (80 kB, Demian Katz)
      8. websearch-solr34b.patch (83 kB, Demian Katz)
      9. website-for-vufind-1-2.patch (80 kB, Demian Katz)

        Activity

        Nathan Tallman added a comment -
        Service to provide website recommendations when searching catalog. (web/sys/Recommend/WebResults.php)
        Nathan Tallman added a comment -
        Blueprint template for website recommendations when searching catalog. (web/interface/themes/blueprint/Search/Recommend/WebResults.tpl)
        Nathan Tallman added a comment -
        When implementing WebResults.php and WebResults.tpl, you need to add "default_side_recommend[] = WebResults" to web/conf/searches.ini. All code was written for VuFind 1.3.
        Demian Katz added a comment -
        The attached blueprint-searchbox.patch can be used to integrate the web search into the main search type drop-down in the Blueprint theme.
        Demian Katz added a comment - edited
        Most of the functionality from this ticket has now been ported to VuFind 2. The only thing I have left out is the "web search in main search options" patch, which is a hack that needs to be addressed in a more flexible way (VUFIND-107 covers some related territory).

        Here are the key commits:

        Basic indexing/search functionality - https://github.com/vufind-org/vufind/commit/bf82a02b39c83af971a4372527bfc06f767a0e61

        Web crawler utility - https://github.com/vufind-org/vufind/commit/933146f6dfc180b5f2abba852a1b4b5d058facc7

        WebResults recommendation module - https://github.com/vufind-org/vufind/commit/20fa198499efa29cae69925ff971931c911584b5

        There is still room to refine the default schema and configuration; I may do more work on this in the near future. However, the functionality of this ticket is now implemented to a point where I feel comfortable marking this as resolved.

          People

          • Assignee:
            Demian Katz
            Reporter:
            Demian Katz
          • Votes:
            3
            Watchers:
            5

            Dates

            • Created:
              Updated:
              Resolved: