On 11/29/06, Andrew Nagy wrote:
>
> So ... while we are on this topic. You wouldn't want to index marcxml
> records in lucene, you would use marc21, right? Why deal with the
> overhead of xml if it is not necessary. We have to format our data no
> matter what for to best fit our storage/search system.

This seems like six of one and a half dozen of the other to me. I don't think Lucene cares either way which you use. In my mind, it is just a matter of preference... do I want to use XML tools (sax, xom, rexml) or MARC specific tools (marc4j, pymarc, ruby-marc). All could be used to build Lucene indices.

On the other hand, what do I want to do with the data after it is indexed? Do I want to be able to display a whole record (versus just the little bit I might have stored in the Lucene index)? If so, I'd rather be working with XML. If I'm just pointing them back to my OPAC, though, I don't see much difference (other than personal preference) in the tool choice.

Kevin
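To make the "little bit in the index versus the whole record" point concrete, here is a minimal sketch in Python using only the standard library's XML parser. It does not call Lucene at all: the returned dict merely stands in for whatever document object the indexer expects. The sample record, the function names, and the field names (id, title, author, full_record) are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML ("MARC21 slim") records.
MARCXML_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# A tiny invented record, for illustration only.
SAMPLE_RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">example-001</controlfield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Doe, Jane.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title /</subfield>
    <subfield code="c">Jane Doe.</subfield>
  </datafield>
</record>"""


def subfield_text(record, tag, code):
    """Return the first matching subfield's text, or None if absent."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    node = record.find(path, MARCXML_NS)
    return node.text.strip() if node is not None and node.text else None


def to_index_document(marcxml):
    """Build a plain dict standing in for an index document.

    The short fields are what one would search on; "full_record" keeps
    the original XML so a whole record can be displayed later without
    going back to the OPAC. All field names here are hypothetical.
    """
    record = ET.fromstring(marcxml)
    control = record.find("marc:controlfield[@tag='001']", MARCXML_NS)
    return {
        "id": control.text if control is not None else None,
        "title": subfield_text(record, "245", "a"),
        "author": subfield_text(record, "100", "a"),
        "full_record": marcxml,
    }


doc = to_index_document(SAMPLE_RECORD)
print(doc["id"], "|", doc["title"], "|", doc["author"])
```

If the plan is only to link back to the OPAC, the full_record entry can be dropped, and the same few fields could just as well be pulled from binary MARC21 with one of the MARC-specific libraries.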