On Jan 17, 2007, at 3:26 PM, Andrew Nagy wrote:
> One thing I am hoping that can come out of the preconference is a
> standard XSLT doc. I sat down with my metadata librarian to
> develop our
> XSLT doc -- determining which fields are to be searchable, which
> fields should be left out to help speed up results, etc.
>
> It's pretty easy; I think you will be amazed how fast you can have a
> functioning system with very little effort.
You're quite right about that last statement.
I am, however, skeptical of a purely MARC -> XSLT -> Solr solution.
The MARC data I've seen requires some basic cleanup (removing dots at
the end of subjects, normalizing dates, etc.) in order to be useful as
facets. While XSLT is powerful, this kind of data manipulation is
better done (IMO) with scripting languages that allow for easy,
succinct tweaking. I'm sure XSLT could do everything you'd want done;
you can also drive screws in with a hammer :)
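To make that concrete, here's a minimal sketch of the kind of cleanup
I mean, in Python. The function names, the date patterns, and the
assumption that values arrive as plain strings are all mine, not
anything from an existing toolkit:

    import re

    def clean_subject(subject):
        """Strip a single trailing dot so that 'United States.' and
        'United States' collapse into one facet value."""
        return re.sub(r'\.$', '', subject.strip())

    def normalize_date(raw):
        """Pull a four-digit year out of free-text date strings such as
        'c1998.' or '[1972?]' so dates facet cleanly; return None when
        no unambiguous year is present."""
        match = re.search(r'(?<!\d)(1[5-9]\d{2}|20\d{2})(?!\d)', raw)
        return match.group(1) if match else None

    if __name__ == '__main__':
        print(clean_subject('Libraries -- Automation.'))  # Libraries -- Automation
        print(normalize_date('c1998.'))                    # 1998
        print(normalize_date('[197-?]'))                   # None

That sort of per-field tweak is exactly the thing you end up adjusting
over and over once you see the first round of facets.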
That being said - if you've got XSLT chops, and can easily go from
MARC XML to Solr's XML [1], you'll be in great shape at the
preconference for quickly getting your data into Solr and seeing what
needs to be cleaned up. Seeing the raw data in a faceted way is
actually very helpful for knowing where to go next with cleanup, and
showing catalogers where inconsistencies live in the data.
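For reference, a document posted to Solr's XML update handler [1]
looks roughly like the following; the field names here are just
placeholders for whatever your XSLT or script emits, not anything
prescribed by Solr:

    <add>
      <doc>
        <field name="id">ocm00012345</field>
        <field name="title">An introduction to metadata</field>
        <field name="subject">Metadata</field>
        <field name="subject">Cataloging -- Data processing</field>
        <field name="date">1998</field>
      </doc>
    </add>

Repeating a field (like subject above) is how multi-valued fields go
in, which is what makes the faceting interesting.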
Erik
[1] http://wiki.apache.org/solr/UpdateXmlMessages