[Apologies, as always, for any cross-post copies]
The traject <https://github.com/traject-project/traject/> maintainers are
happy to announce the release of traject version 2.0.0.
Traject is an ETL (extract/transform/load) system designed and optimized
for indexing MARC records into Solr. It is similar in functionality to
solrmarc <https://code.google.com/p/solrmarc/>, but with everything written
in Ruby instead of Java.
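For readers new to the ETL idea, here is a minimal plain-Ruby sketch of the extract/transform/load flow. It deliberately does not use traject's own API: the record hashes, the `map_record` and `to_tsv` helpers, and the field names are all hypothetical stand-ins chosen for illustration.

```ruby
# Hypothetical records: plain hashes standing in for parsed MARC records.
records = [
  { "001" => "ocm0001 ", "245a" => "Moby Dick /" },
  { "001" => "ocm0002",  "245a" => " Walden ; " }
]

# Extract + transform: map one source record to a flat output document,
# trimming whitespace and trailing ISBD-style punctuation from the title.
def map_record(record)
  {
    "id"    => record["001"].strip,
    "title" => record["245a"].strip.sub(/\s*[\/;:,]\z/, "")
  }
end

# Load: serialize the documents as tab-delimited lines, one per record.
def to_tsv(docs, fields)
  docs.map { |doc| fields.map { |f| doc[f] }.join("\t") }.join("\n")
end

docs = records.map { |r| map_record(r) }
puts to_tsv(docs, %w[id title])
```

In a real traject setup, the mapping rules live in a configuration file and the load step is handled by a writer (for example, one that sends documents to Solr).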
Traject 2.0 brings several notable changes:
- Support for MRI (“normal”) ruby, JRuby, and rbx
- New Solr JSON writer (for Solr versions >= 3.2), usable from MRI and
with about 20% better performance than previous indexing.
- New writers for producing tab-delimited/CSV files
(Note that while traject runs fine under MRI, you’ll get substantially
faster indexing using JRuby due to traject’s use of multiple threads.)
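To illustrate why threads matter here: MRI's global interpreter lock prevents Ruby code in different threads from running in parallel, while JRuby threads can run on multiple cores. The following is a rough sketch of a worker-pool mapping pattern in plain Ruby. It is not traject's actual implementation; `build_doc` and `threaded_map` are hypothetical names, and the per-record work is a trivial placeholder.

```ruby
require "thread"

# Hypothetical per-record work standing in for a real mapping step.
def build_doc(id)
  { "id" => id, "title" => "Record #{id}" }
end

# Map record ids with a fixed pool of worker threads pulling from a shared queue.
def threaded_map(ids, thread_count)
  queue = Queue.new
  ids.each { |id| queue << id }
  results = Queue.new

  workers = Array.new(thread_count) do
    Thread.new do
      loop do
        id = begin
          queue.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break           # no work left, so this worker exits its loop
        end
        results << build_doc(id)
      end
    end
  end
  workers.each(&:join)

  # Drain the results and restore a stable order, since threads finish in any order.
  docs = []
  docs << results.pop until results.empty?
  docs.sort_by { |doc| doc["id"] }
end

puts threaded_map((1..5).to_a, 3).map { |d| d["title"] }
```

The code runs the same way on either interpreter; only on JRuby can the workers actually execute the mapping step concurrently on separate cores.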
Traject is in production use indexing metadata for the library catalogs of
the University of Michigan, the HathiTrust, Johns Hopkins, and Brown
University. (Using Traject? Let us know!)
The traject README <https://github.com/traject-project/traject/> and docs
contain reference information, and we also provide a sample real-ish
configuration <https://github.com/traject-project/traject_sample> to
help get you started.
Brown University is using traject for a new search interface; the Brown
configuration is a great example of a real-life traject installation.
The University of Michigan and HathiTrust catalogs are also indexed with
traject; their shared configuration
<https://github.com/billdueber/ht_traject> provides another (potentially
overly) complex real-life set of configuration files.
Thanks to everyone who provided feedback for this release!
Feel free to contact me directly with questions, or add issues/pull
requests to the GitHub project <https://github.com/traject-project/traject/>.
Library Systems Programmer
University of Michigan Library