Importing CSV Files

Note: This feature was added in VuFind 8.0.

Tabular data is commonly distributed as “comma-separated values” (CSV) files, and any spreadsheet-style data can be easily converted to the CSV format. While the CSV format is somewhat limited (particularly if you are dealing with multi-valued fields), it is commonly used, and may be a useful source of input for your VuFind index. Fortunately, a tool exists to load CSV files directly into VuFind's Solr instance.

Import Workflow

Loading CSV files can be done with a two-step process:

1. Create configuration

Create a configuration file in the “import” subdirectory of your local settings directory. You can use any filename that you like; it usually makes sense to have a separate configuration file for each CSV file you load, since content and formatting are likely to be unique to each file. An example that documents all of the available settings can be found in $VUFIND_HOME/import/csv.ini. Make a local copy of this example in your $VUFIND_LOCAL_DIR/import directory, then configure the desired settings in the copy.
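The copy step described above can be sketched as a small shell function. The function name copy_csv_config and the filename my_data.ini are hypothetical, chosen only for illustration, and the sketch assumes that $VUFIND_HOME and $VUFIND_LOCAL_DIR are already set in your environment.

```shell
# Hypothetical helper (not part of VuFind): copy the stock csv.ini example
# into the local import directory under a name of your choosing.
# Usage: copy_csv_config my_data.ini
copy_csv_config() {
    target="$VUFIND_LOCAL_DIR/import/$1"

    # Refuse to overwrite a configuration that has already been customized.
    if [ -e "$target" ]; then
        echo "Refusing to overwrite existing file: $target" >&2
        return 1
    fi

    # Create the local import directory if it does not exist yet.
    mkdir -p "$VUFIND_LOCAL_DIR/import"
    cp "$VUFIND_HOME/import/csv.ini" "$target"
}
```

Calling copy_csv_config my_data.ini would then leave an editable copy at $VUFIND_LOCAL_DIR/import/my_data.ini.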

The configuration file gives you a lot of control over how the CSV file is read, including settings to handle header rows, different character encodings, mapping of columns to Solr fields, manipulation of data values with custom code, and more. See the comments in the csv.ini example for full details.

2. Load the data

Run the command line tool like this:

php $VUFIND_HOME/public/index.php import/import-csv /path/to/input_file.csv name_of_config.ini

In the above example, note that /path/to/input_file.csv is the full path to the input CSV file, whereas name_of_config.ini is simply the name of the configuration file you created without any path specified – VuFind will search for it automatically in your $VUFIND_LOCAL_DIR/import directory.
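Because the two arguments are resolved differently, a wrapper that checks both before invoking the tool may save some debugging time. The following is a sketch only: import_csv is a hypothetical function name, and the check against $VUFIND_LOCAL_DIR/import is an assumption based on the description above, so it may be stricter than VuFind's own lookup.

```shell
# Hypothetical wrapper (not part of VuFind) around the import-csv command.
# Usage: import_csv /path/to/input_file.csv name_of_config.ini [switches...]
import_csv() {
    if [ "$#" -lt 2 ]; then
        echo "Usage: import_csv CSV_FILE CONFIG_NAME [switches...]" >&2
        return 2
    fi
    csv_file="$1"
    config_name="$2"
    shift 2

    # The CSV argument is a full path, so it can be checked directly.
    if [ ! -f "$csv_file" ]; then
        echo "CSV file not found: $csv_file" >&2
        return 1
    fi

    # The configuration is given as a bare filename. Assumption: it lives
    # in the local import directory, as described in the text.
    if [ ! -f "$VUFIND_LOCAL_DIR/import/$config_name" ]; then
        echo "Config not found in $VUFIND_LOCAL_DIR/import: $config_name" >&2
        return 1
    fi

    # Any remaining arguments are passed through as switches.
    php "$VUFIND_HOME/public/index.php" import/import-csv "$@" "$csv_file" "$config_name"
}
```

With this in place, import_csv /path/to/input_file.csv name_of_config.ini runs the same command as above, but fails early with a clear message if either file is missing.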

The ingest tool also supports some optional switches, including a --test-only mode which will display the result of the ingest on the console instead of sending it to Solr (useful for testing/debugging), and an --index switch you can use to load data into non-standard Solr cores. For full details, you can run:

php $VUFIND_HOME/public/index.php import/import-csv --help
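
When testing a new configuration, it can be convenient to save the test-only output to a file for review before running a real import. The sketch below assumes that $VUFIND_HOME is set; preview_csv_import and the output filename are hypothetical names used for illustration, and capturing both standard output and standard error is a precaution rather than documented behavior.

```shell
# Hypothetical helper (not part of VuFind): run the import in test-only mode
# and save the console output to a file. Nothing is sent to Solr.
# Usage: preview_csv_import /path/to/input_file.csv name_of_config.ini preview.txt
preview_csv_import() {
    if [ "$#" -ne 3 ]; then
        echo "Usage: preview_csv_import CSV_FILE CONFIG_NAME OUTPUT_FILE" >&2
        return 2
    fi

    # Capture both output streams so that any warnings end up in the same file.
    php "$VUFIND_HOME/public/index.php" import/import-csv --test-only "$1" "$2" > "$3" 2>&1
}
```

Once the saved preview looks correct, the same CSV file and configuration can be loaded for real by running the command without the --test-only switch.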
indexing/csv.txt · Last modified: 2021/08/03 11:58 by demiankatz