I really think we need to look at ways to manage dynamic record-level data like circ status separately from the bibliographic metadata. To display that info, we can do a real-time lookup; but to use it in the faceted search interface we need a smarter solution. If we can figure out how to populate a facet (say "status_available", so I can limit my search to stuff I can get) based on frequent updates of circ transactions (item x was just lent, item y was just returned), then we don't have to be constantly extracting and reindexing the bib record, which hasn't changed.
I think this can be done by manipulating the facet's bitset directly, but this is based on my (probably imperfect) understanding of an earlier version of Solr's faceting code and needs to be confirmed. If this works, the same approach could be used to handle other dynamic data like user-provided tags. If we don't go this way, then we're stuck with a requirement to refetch and reindex the bib record every time the dynamic metadata changes, which seems to me like something we want to avoid if we're going to have those changes reflected in real time.
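To make the bitset idea concrete, here is a minimal sketch in plain Python. It is not Solr's faceting code: the class BitsetFacet and the helpers to_bits and to_doc_ids are hypothetical names made up for illustration, and a Python integer stands in for the real bitset. It only shows that flipping one bit per circ transaction is enough to keep a "status_available" facet current without touching the bib record.

```python
class BitsetFacet:
    """Hypothetical facet stored as one bit per document id.

    A Python int stands in for the bitset; this is NOT Solr's API.
    """

    def __init__(self):
        self.bits = 0

    def set(self, doc_id, value):
        # Flip a single bit; nothing else about the document is touched.
        if value:
            self.bits |= 1 << doc_id
        else:
            self.bits &= ~(1 << doc_id)

    def filter(self, result_bits):
        # Limit a search result (also a bitset) to documents in this facet.
        return self.bits & result_bits

    def count(self, result_bits):
        # Facet count: how many documents in the result are in this facet.
        return bin(self.bits & result_bits).count("1")


def to_bits(doc_ids):
    """Build a bitset from an iterable of document ids."""
    bits = 0
    for doc_id in doc_ids:
        bits |= 1 << doc_id
    return bits


def to_doc_ids(bits):
    """List the document ids whose bit is set, in ascending order."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


status_available = BitsetFacet()
for doc_id in (0, 1, 2, 3):
    status_available.set(doc_id, True)

status_available.set(2, False)        # document 2 was just lent
results = to_bits([1, 2, 3, 5])       # documents matching some search
available_hits = to_doc_ids(status_available.filter(results))
```

After the loan, available_hits contains only the matching documents whose bit is still set. The same structure would serve for user-provided tags, with one bitset per tag.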
A requirement for this to work would be the ability to map from system ids to Lucene document ids (so that the bitset can be updated appropriately), but that can be done with a simple Lucene query. Of course, circ transactions involve holdings items while the opac wants to think in terms of bib items, so we need to think out the necessary structures.
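The holdings-versus-bib question could be handled along these lines. This is only a sketch under stated assumptions: AvailabilityTracker and its two lookup tables are hypothetical, the bib_to_doc dictionary stands in for the Lucene query that resolves a system id to a document id, and a Python set stands in for the facet's bitset. The rule assumed here is that a bib record counts as available while at least one of its holdings items is on the shelf.

```python
from collections import defaultdict


class AvailabilityTracker:
    """Hypothetical structure: rolls item-level circ events up to bib level."""

    def __init__(self, item_to_bib, bib_to_doc):
        self.item_to_bib = item_to_bib           # holdings item id -> bib id
        self.bib_to_doc = bib_to_doc             # bib id -> index document id
                                                 # (stand-in for a Lucene query)
        self.on_shelf = defaultdict(set)         # bib id -> items currently in
        self.available_docs = set()              # stand-in for the facet bitset

    def apply(self, event, item_id):
        """Apply one circ transaction; event is 'returned' or 'lent'."""
        bib_id = self.item_to_bib[item_id]
        items = self.on_shelf[bib_id]
        if event == "returned":
            items.add(item_id)
        elif event == "lent":
            items.discard(item_id)
        else:
            raise ValueError("unknown circ event: %r" % (event,))

        # The bib is available while any one of its items is on the shelf.
        doc_id = self.bib_to_doc[bib_id]
        if items:
            self.available_docs.add(doc_id)
        else:
            self.available_docs.discard(doc_id)


tracker = AvailabilityTracker(
    item_to_bib={"i1": "b1", "i2": "b1", "i3": "b2"},
    bib_to_doc={"b1": 10, "b2": 11},
)
for item in ("i1", "i2", "i3"):
    tracker.apply("returned", item)

tracker.apply("lent", "i1")   # b1 still has i2, so document 10 stays available
tracker.apply("lent", "i2")   # last copy gone, document 10 drops out
```

One caveat for a real implementation: Lucene's internal document ids are not stable across segment merges, so the id mapping would have to be refreshed whenever the index is reorganized.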
How we get those updates from the circ system is of course a problem for each ILS.
Peter
-----Original Message-----
From: Code for Libraries On Behalf Of Bess Sadler
Sent: Wednesday, January 17, 2007 2:00 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] Getting data from Voyager into XML?
On Jan 17, 2007, at 3:26 PM, Andrew Nagy wrote:
> One thing I am hoping that can come out of the preconference is a
> standard XSLT doc. I sat down with my metadata librarian to develop
> our XSLT doc -- determining what fields are to be searchable, what
> fields should be left out to help speed up results, etc.
>
> It's pretty easy, I think you will be amazed how fast you can have a
> functioning system with very little effort.
>
> Andrew
As long as we're on the subject, does anyone want to share strategies for syncing circulation data? It sounds like we're all talking about the parallel systems à la NCSU's Endeca system, which I think is a great idea. It's the circ data that keeps nagging at me, though. Is there an elegant way to use your fancy new faceted browser to search against circ data w/out re-dumping the whole thing every night?
Bess