VuFindSitemap
extends VuFind
XSLT support class -- all methods of this class must be public and static; they will be automatically made available to your XSL stylesheet for use with the php:function() function.
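As a rough illustration of the php:function() mechanism described above, the sketch below registers PHP functions with an XSLTProcessor and calls a public static method from a stylesheet. DemoXsltHelper, its upper() method, and demoTransform() are hypothetical names invented for this example; they are not part of this class. The sketch assumes the PHP dom and xsl extensions are loaded.

```php
<?php
// Hypothetical helper class: public static methods are callable from XSL.
final class DemoXsltHelper
{
    public static function upper(string $text): string
    {
        return strtoupper($text);
    }
}

// Hypothetical driver: transforms $xml with a stylesheet that calls the helper.
function demoTransform(string $xml): string
{
    $xsl = <<<'XSL'
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:php="http://php.net/xsl">
  <xsl:output method="text"/>
  <xsl:template match="/record">
    <xsl:value-of select="php:function('DemoXsltHelper::upper', string(title))"/>
  </xsl:template>
</xsl:stylesheet>
XSL;

    $source = new DOMDocument();
    $source->loadXML($xml);

    $style = new DOMDocument();
    $style->loadXML($xsl);

    $processor = new XSLTProcessor();
    // Without this call, php:function() is not available to the stylesheet.
    $processor->registerPHPFunctions();
    $processor->importStylesheet($style);

    return (string)$processor->transformToXml($source);
}
```

Calling demoTransform('<record><title>abc</title></record>') should yield the title in upper case. The static-method callback is written as 'ClassName::method' inside the php:function() call.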
Table of Contents
- ISO8601_FORMAT = 'Y-m-d\\TH:i:s\\Z'
- ISO8601 date format string
- $serviceLocator : ServiceLocatorInterface
- Service locator
- arrayToSolrXml() : string
- Convert an associative array of fields into a Solr document.
- explode() : DOMDocument
- Proxy the explode PHP function for use in XSL transformation.
- extractBestDateOrRange() : string
- Try to find the best single year or date range in a set of DOM elements.
- extractEarliestYear() : string
- Try to find a four-digit year in a set of DOM elements.
- getApertureCommand() : string
- Generic method for building Aperture Command
- getChangeTracker() : ChangeTrackerServiceInterface
- Get the change tracker service object.
- getConfig() : Config
- Get a configuration file.
- getDocument() : string
- Harvest the contents of a document file (PDF, Word, etc.) using Aperture.
- getFirstIndexed() : string
- Get the date/time of the first time this record was indexed.
- getLastIndexed() : string
- Get the date/time of the most recent time this record was indexed.
- getParser() : string
- Read parser method from fulltext.ini
- getTikaCommand() : array<string|int, mixed>
- Generic method for building Tika command
- harvestTextFile() : string
- Harvest the contents of a text file for inclusion in the output.
- harvestWithAperture() : string
- Harvest the contents of a document file (PDF, Word, etc.) using Aperture.
- harvestWithParser() : string
- Call parsing method based on parser setting in fulltext.ini
- harvestWithTika() : string
- Harvest the contents of a document file (PDF, Word, etc.) using Tika.
- implode() : string
- Proxy the implode PHP function for use in XSL transformation.
- invertName() : string
- Invert "Firstname Lastname" authors into "Lastname, Firstname."
- invertNames() : DOMDocument
- Call invertName on all matching elements; return a DOMDocument with a name tag for each inverted name.
- isInvertedName() : bool
- Is the provided name inverted ("Last, First") or not ("First Last")?
- mapString() : string
- Map string using a config file from the translation_maps folder.
- removeOuterBrackets() : string
- Remove single square bracket characters if they are the start and/or end chars (matched or unmatched) and are the only square bracket chars in the string.
- removeTagAndReturnXMLasText() : string
- Remove a given tag from the provided nodes, then convert into XML and return as text. This is useful for populating the fullrecord field with the raw input XML while allowing certain elements (e.g., the full text field) to be removed.
- setServiceLocator() : void
- Set the service locator.
- solrMarcStyleCleanData() : string
- Port of logic from SolrMarc's DataUtil::cleanData method.
- stripAccents() : string
- Strip accents from a string.
- stripArticles() : string
- Strip articles from the front of the text (for creating sortable titles).
- stripBadChars() : string
- Strip illegal XML characters from a string.
- stripPunctuation() : string
- Strip punctuation from a string.
- titleSortLower() : string
- Perform text processing roughly equivalent to SolrMarc's titleSortLower feature to allow consistent indexing into the title_sort field.
- xmlAsText() : string
- Convert provided nodes into XML and return as text. This is useful for populating the fullrecord field with the raw input XML.
- getApertureFields() : array<string|int, mixed>
- Load metadata about an HTML document using Aperture.
- getDocumentFieldArray() : array<string|int, mixed>
- Support method for getDocument() -- retrieve associative array of field data.
- getHtmlFields() : array<string|int, mixed>
- Extract key metadata from HTML.
- getTikaFields() : array<string|int, mixed>
- Load metadata about an HTML document using Tika.
Constants
ISO8601_FORMAT
ISO8601 date format string
protected
string
ISO8601_FORMAT
= 'Y-m-d\\TH:i:s\\Z'
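A minimal sketch of how a format string like this behaves with PHP's date functions. DEMO_ISO8601_FORMAT and demoFormatIndexTime() are hypothetical names used only here; whether the class itself formats with gmdate() is an assumption. In a single-quoted PHP string, `\\T` and `\T` denote the same two characters, so the backslash reaches the formatter and keeps "T" and "Z" literal.

```php
<?php
// Same pattern as ISO8601_FORMAT; the backslashes stop "T" and "Z" from
// being interpreted as date format characters.
const DEMO_ISO8601_FORMAT = 'Y-m-d\TH:i:s\Z';

// Hypothetical helper: format a Unix timestamp as a UTC ISO 8601 string.
function demoFormatIndexTime(int $timestamp): string
{
    // gmdate() formats in UTC, which is what the trailing "Z" asserts.
    return gmdate(DEMO_ISO8601_FORMAT, $timestamp);
}

echo demoFormatIndexTime(0), PHP_EOL; // 1970-01-01T00:00:00Z
```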
Properties
$serviceLocator
Service locator
protected
static ServiceLocatorInterface
$serviceLocator
Methods
arrayToSolrXml()
Convert an associative array of fields into a Solr document.
public
static arrayToSolrXml(array<string|int, mixed> $fields) : string
Parameters
- $fields : array<string|int, mixed>
-
Field data
Return values
string
explode()
Proxy the explode PHP function for use in XSL transformation.
public
static explode(string $delimiter, string $string) : DOMDocument
Parameters
- $delimiter : string
-
Delimiter for splitting $string
- $string : string
-
String to split
Return values
DOMDocument
extractBestDateOrRange()
Try to find the best single year or date range in a set of DOM elements.
public
static extractBestDateOrRange(array<string|int, mixed> $input) : string
Best is defined as the first value to consist of only YYYY or YYYY-ZZZZ, with no other text. If no "best" match is found, the first value is used.
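The selection rule above can be sketched as follows. This is an illustration under stated assumptions, not the class's actual code: demoBestDateOrRange() is a hypothetical name, it takes plain strings rather than DOM elements, and trimming surrounding whitespace before matching is an assumption.

```php
<?php
/**
 * Hypothetical sketch of the "best date" rule.
 *
 * @param string[] $values Candidate text values, in document order.
 */
function demoBestDateOrRange(array $values): string
{
    foreach ($values as $value) {
        $candidate = trim($value);
        // "Best": the whole value is YYYY or YYYY-ZZZZ and nothing else.
        if (preg_match('/^\d{4}(-\d{4})?$/', $candidate) === 1) {
            return $candidate;
        }
    }
    // No "best" match: fall back to the first value, if there is one.
    $first = reset($values);
    return $first === false ? '' : trim($first);
}
```

For example, given 'c. 1850', '1851-1860' and '1900' in that order, the sketch returns '1851-1860', because it is the first value consisting only of a year range.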
Parameters
- $input : array<string|int, mixed>
-
DOM elements to search.
Return values
string
extractEarliestYear()
Try to find a four-digit year in a set of DOM elements.
public
static extractEarliestYear(array<string|int, mixed> $input) : string
Parameters
- $input : array<string|int, mixed>
-
DOM elements to search.
Return values
string
getApertureCommand()
Generic method for building Aperture Command
public
static getApertureCommand(string $input, string $output[, string $method = 'webcrawler' ]) : string
Parameters
- $input : string
-
name of input file | url
- $output : string
-
name of output file
- $method : string = 'webcrawler'
-
webcrawler | filecrawler
Return values
string —command to be executed
getChangeTracker()
Get the change tracker service object.
public
static getChangeTracker() : ChangeTrackerServiceInterface
Return values
ChangeTrackerServiceInterface
getConfig()
Get a configuration file.
public
static getConfig([string $config = 'config' ]) : Config
Parameters
- $config : string = 'config'
-
Configuration name
Return values
Config
getDocument()
Harvest the contents of a document file (PDF, Word, etc.) using Aperture.
public
static getDocument(string $url) : string
This method will only work if Aperture is properly configured in the web/conf/fulltext.ini file. Without proper configuration, this will simply return an empty string.
Parameters
- $url : string
-
URL of file to retrieve.
Return values
string —text contents of file.
getFirstIndexed()
Get the date/time of the first time this record was indexed.
public
static getFirstIndexed(string $core, string $id, string $date) : string
Parameters
- $core : string
-
Solr core holding this record.
- $id : string
-
Record ID within specified core.
- $date : string
-
Date record was last modified.
Return values
string —First index date/time.
getLastIndexed()
Get the date/time of the most recent time this record was indexed.
public
static getLastIndexed(string $core, string $id, string $date) : string
Parameters
- $core : string
-
Solr core holding this record.
- $id : string
-
Record ID within specified core.
- $date : string
-
Date record was last modified.
Return values
string —Latest index date/time.
getParser()
Read parser method from fulltext.ini
public
static getParser() : string
Return values
string —Name of parser to use (i.e. Aperture or Tika)
getTikaCommand()
Generic method for building Tika command
public
static getTikaCommand(string $input, string $output, string $arg) : array<string|int, mixed>
Parameters
- $input : string
-
url | fileresource
- $output : string
-
name of output file
- $arg : string
-
optional Tika arguments
Return values
array<string|int, mixed> —Parameters for proc_open command
harvestTextFile()
Harvest the contents of a text file for inclusion in the output.
public
static harvestTextFile(string $url) : string
Parameters
- $url : string
-
URL of file to retrieve.
Return values
string —file contents.
harvestWithAperture()
Harvest the contents of a document file (PDF, Word, etc.) using Aperture.
public
static harvestWithAperture(string $url[, string $method = 'webcrawler' ]) : string
This method will only work if Aperture is properly configured in the fulltext.ini file. Without proper configuration, this will simply return an empty string.
Parameters
- $url : string
-
URL of file to retrieve.
- $method : string = 'webcrawler'
-
webcrawler | filecrawler
Return values
string —text contents of file.
harvestWithParser()
Call parsing method based on parser setting in fulltext.ini
public
static harvestWithParser(string $url) : string
Parameters
- $url : string
-
URL to harvest
Return values
string —Text contents of URL
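One way to picture the dispatch described for harvestWithParser() is a lookup from the configured parser name to a harvesting callable. The sketch below is hypothetical: demoHarvestWithParser(), the $harvesters map, the case-insensitive lookup, and the empty-string result for an unknown parser are all assumptions, and the real method reads the parser name from fulltext.ini rather than taking it as an argument.

```php
<?php
/**
 * Hypothetical dispatch sketch.
 *
 * @param string     $url        URL to harvest
 * @param string     $parser     Parser name, e.g. "Aperture" or "Tika"
 * @param callable[] $harvesters Map of lower-case parser name => callable(string): string
 */
function demoHarvestWithParser(string $url, string $parser, array $harvesters): string
{
    $key = strtolower($parser);
    if (!isset($harvesters[$key])) {
        // Assumption: an unknown or missing parser setting yields no text.
        return '';
    }
    return (string)$harvesters[$key]($url);
}

// Usage with stand-in harvesters (no external tools are invoked here).
$harvesters = [
    'aperture' => function (string $url): string { return "aperture:$url"; },
    'tika' => function (string $url): string { return "tika:$url"; },
];
echo demoHarvestWithParser('http://example.org/a.pdf', 'Tika', $harvesters), PHP_EOL;
```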
harvestWithTika()
Harvest the contents of a document file (PDF, Word, etc.) using Tika.
public
static harvestWithTika(string $url[, string $arg = '--text' ]) : string
This method will only work if Tika is properly configured in the fulltext.ini file. Without proper configuration, this will simply return an empty string.
Parameters
- $url : string
-
URL of file to retrieve.
- $arg : string = '--text'
-
optional argument(s) for Tika
Return values
string —text contents of file.
implode()
Proxy the implode PHP function for use in XSL transformation.
public
static implode(string $glue, array<string|int, mixed> $pieces) : string
Parameters
- $glue : string
-
Glue string
- $pieces : array<string|int, mixed>
-
DOM elements to join together.
Return values
string
invertName()
Invert "Firstname Lastname" authors into "Lastname, Firstname."
public
static invertName(string $rawName) : string
Parameters
- $rawName : string
-
Raw name
Return values
string
invertNames()
Call invertName on all matching elements; return a DOMDocument with a name tag for each inverted name.
public
static invertNames(array<string|int, mixed> $input) : DOMDocument
Parameters
- $input : array<string|int, mixed>
-
DOM elements to adjust
Return values
DOMDocument
isInvertedName()
Is the provided name inverted ("Last, First") or not ("First Last")?
public
static isInvertedName(string $name) : bool
Parameters
- $name : string
-
Name to check
Return values
bool
mapString()
Map string using a config file from the translation_maps folder.
public
static mapString(string $in, string $filename) : string
Parameters
- $in : string
-
string to map.
- $filename : string
-
filename of map file
Return values
string —mapped text.
removeOuterBrackets()
Remove single square bracket characters if they are the start and/or end chars (matched or unmatched) and are the only square bracket chars in the string.
public
static removeOuterBrackets(string $str) : string
Ported from SolrMarc's DataUtil class.
Parameters
- $str : string
-
Text string with possible enclosing brackets
Return values
string —Processed string with the brackets removed.
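The bracket rule described above can be sketched directly from its wording. demoRemoveOuterBrackets() is a hypothetical name, and this is a reading of the description rather than a copy of the ported SolrMarc logic; edge cases may differ from the real method.

```php
<?php
// Hypothetical sketch: strip a leading "[" and/or trailing "]" only when
// those are the only square brackets in the string.
function demoRemoveOuterBrackets(string $str): string
{
    $result = trim($str);
    if ($result === '') {
        return $result;
    }

    $startsOpen = $result[0] === '[';
    $endsClose = substr($result, -1) === ']';
    $opens = substr_count($result, '[');
    $closes = substr_count($result, ']');

    if ($startsOpen && $endsClose && $opens === 1 && $closes === 1) {
        // Matched pair enclosing the whole string.
        $result = (string)substr($result, 1, -1);
    } elseif ($startsOpen && $opens === 1 && $closes === 0) {
        // Unmatched opening bracket at the start.
        $result = (string)substr($result, 1);
    } elseif ($endsClose && $closes === 1 && $opens === 0) {
        // Unmatched closing bracket at the end.
        $result = (string)substr($result, 0, -1);
    }
    return trim($result);
}
```

Under this sketch, '[1990]', '[1990' and '1990]' all become '1990', while '[a] and [b]' is left unchanged because it contains more than one pair of brackets.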
removeTagAndReturnXMLasText()
Remove a given tag from the provided nodes, then convert into XML and return as text. This is useful for populating the fullrecord field with the raw input XML while allowing certain elements (e.g., the full text field) to be removed.
public
static removeTagAndReturnXMLasText(array<string|int, mixed> $in, string $tag) : string
Parameters
- $in : array<string|int, mixed>
-
array of DOMElement objects.
- $tag : string
-
name of tag to remove
Return values
string —XML as string
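A minimal sketch of the remove-then-serialize idea using PHP's DOM extension. demoRemoveTagAsText() is a hypothetical name. Working on a deep copy and omitting the XML declaration are assumptions of this sketch; the real method may handle both differently.

```php
<?php
/**
 * Hypothetical sketch: drop every <$tag> descendant, then serialize.
 *
 * @param DOMElement[] $nodes Elements to serialize
 * @param string       $tag   Name of tag to remove
 */
function demoRemoveTagAsText(array $nodes, string $tag): string
{
    $xml = '';
    foreach ($nodes as $node) {
        // Work on a deep copy so the caller's tree is left untouched.
        $copy = $node->cloneNode(true);
        // Snapshot the live node list before removing anything from it.
        $matches = iterator_to_array($copy->getElementsByTagName($tag));
        foreach ($matches as $match) {
            $match->parentNode->removeChild($match);
        }
        // saveXML($node) serializes just that node, without a declaration.
        $xml .= $node->ownerDocument->saveXML($copy);
    }
    return $xml;
}
```

Passing a node list with a tag name that does not occur in it reduces this to plain serialization, which is the behavior described for xmlAsText().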
setServiceLocator()
Set the service locator.
public
static setServiceLocator(ServiceLocatorInterface $serviceLocator) : void
Parameters
- $serviceLocator : ServiceLocatorInterface
-
Locator to register
Return values
void
solrMarcStyleCleanData()
Port of logic from SolrMarc's DataUtil::cleanData method.
public
static solrMarcStyleCleanData(string $str) : string
Parameters
- $str : string
-
String to process.
Return values
string —Processed string.
stripAccents()
Strip accents from a string.
public
static stripAccents(string $str) : string
Parameters
- $str : string
-
String to process.
Return values
string —Processed string.
stripArticles()
Strip articles from the front of the text (for creating sortable titles).
public
static stripArticles(string $in) : string
Parameters
- $in : string
-
title to process.
Return values
string —article-stripped text.
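A rough sketch of leading-article stripping for sortable titles. demoStripArticles() is a hypothetical name, and the English article list, the case-insensitive match, and the preservation of the remaining text's case are assumptions; the real method's article list and normalization may differ.

```php
<?php
/**
 * Hypothetical sketch: remove one leading article followed by a space.
 *
 * @param string   $title    Title to process
 * @param string[] $articles Articles to strip (assumed list)
 */
function demoStripArticles(string $title, array $articles = ['a', 'an', 'the']): string
{
    $trimmed = ltrim($title);
    foreach ($articles as $article) {
        // Require a trailing space so "Theory" is not mistaken for "The".
        $prefix = $article . ' ';
        if (strncasecmp($trimmed, $prefix, strlen($prefix)) === 0) {
            return ltrim(substr($trimmed, strlen($prefix)));
        }
    }
    return $trimmed;
}
```

With the assumed list, 'The Hobbit' becomes 'Hobbit' and 'An Apple' becomes 'Apple', while 'Theory of Games' is returned unchanged.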
stripBadChars()
Strip illegal XML characters from a string.
public
static stripBadChars(string $in) : string
Parameters
- $in : string
-
String to process
Return values
string
stripPunctuation()
Strip punctuation from a string.
public
static stripPunctuation(string $str) : string
Parameters
- $str : string
-
String to process.
Return values
string —Processed string.
titleSortLower()
Perform text processing roughly equivalent to SolrMarc's titleSortLower feature to allow consistent indexing into the title_sort field.
public
static titleSortLower(string $str) : string
Parameters
- $str : string
-
String to process.
Return values
string —Processed string.
xmlAsText()
Convert provided nodes into XML and return as text. This is useful for populating the fullrecord field with the raw input XML.
public
static xmlAsText(array<string|int, mixed> $in) : string
Parameters
- $in : array<string|int, mixed>
-
array of DOMElement objects.
Return values
string —XML as string
getApertureFields()
Load metadata about an HTML document using Aperture.
protected
static getApertureFields(string $htmlFile) : array<string|int, mixed>
Parameters
- $htmlFile : string
-
File on disk containing HTML.
Return values
array<string|int, mixed>
getDocumentFieldArray()
Support method for getDocument() -- retrieve associative array of field data.
protected
static getDocumentFieldArray(string $url) : array<string|int, mixed>
Parameters
- $url : string
-
URL of file to retrieve.
Return values
array<string|int, mixed>
getHtmlFields()
Extract key metadata from HTML.
protected
static getHtmlFields(string $html) : array<string|int, mixed>
NOTE: This method uses some non-standard meta tags; it is intended as an example that can be overridden/extended to support local practices.
Parameters
- $html : string
-
HTML content.
Return values
array<string|int, mixed>
getTikaFields()
Load metadata about an HTML document using Tika.
protected
static getTikaFields(string $htmlFile) : array<string|int, mixed>
Parameters
- $htmlFile : string
-
File on disk containing HTML.