Hi, Alberto.

We haven't rolled it out yet, but at the University of Virginia we've had great success with Solr/Lucene. You can take a look at what we've done here:

http://blacklight.betech.virginia.edu/

Solr itself meets all the criteria you just listed, and Erik Hatcher wrote some very nice code for us to import our MARC records (it's open source, although I'm not sure whether it's in a publicly accessible repository right now). We currently have about 4 million MARC records, plus EAD, TEI, and GDMS XML files, plus some HTML files that Erik indexed as a proof of concept.

The front end is Ruby on Rails, and the mapping files for cross-walking the XML or MARC into Solr are very easy to configure, even for a non-programmer. In fact, we are planning to just hand these files to our cataloguers and let them hash things out themselves.

I'm currently working on a formal deployment plan, which we need before we can make this a production service. It will include workflows for syncing the data with our Sirsi ILS.

If this is interesting and you want more info, I or other folks at UVa would be happy to talk with you about it. Or if you need to roll your own system, I would highly recommend Solr. It makes Lucene very easy to work with and provides all kinds of added functionality.

Bess

On Jul 18, 2007, at 10:45 AM, Alberto Accomazzi wrote:

> Our project is looking to transition to a new search engine to handle
> our bibliographic databases (5.5M records of bibliographic article
> metadata + 0.6M fulltext articles). What we are looking for is
> something easily tweakable, which offers fielded searches,
> boolean/simple search logic, customizable relevance ranking, proximity,
> highlighting, synonym/stemming matching. Needs to run on a linux 64-bit
> box. The packages I am aware of are:
>
> 1. lucene/clucene/lucy
> 2. kinosearch
> 3. xapian
> 4. zebra
> 5. invenio
>
> Am I missing any from the list? Are any of these to be excluded based
> on our requirements? I'd like to hear experiences from people who are
> using or have used these packages.
>
> TIA
>
> -- Alberto
>
> ********************************************************************
> Dr. Alberto Accomazzi aaccomazzi(at)cfa harvard edu
> NASA Astrophysics Data System ads.harvard.edu
> Harvard-Smithsonian Center for Astrophysics www.cfa.harvard.edu
> 60 Garden St, MS 67, Cambridge, MA 02138, USA
> ********************************************************************

Elizabeth (Bess) Sadler
Head, Technical and Metadata Services
Digital Scholarship Services
Box 400129
Alderman Library
University of Virginia
Charlottesville, VA 22904
(434) 243-2305