On Aug 6, 2010, at 10:53 AM, Mark A. Matienzo wrote:

> On Fri, Aug 6, 2010 at 1:09 PM, Bess Sadler <[log in to unmask]> wrote:
>> On Aug 6, 2010, at 9:10 AM, Jonathan Rochkind wrote:
>> 
>>>> We indexed each EAD guide into separate lucene documents for each EAD section, then collapsed them under the main EAD title in the search results,
>>> 
>>> Curious how you implemented that: Did you use the Solr "field collapsing" patch that's not yet part of a standard distro?
>> 
>> Yes, exactly.
> 
> Bess - would you be willing to share code or brief notes about how to
> set this up?

Gladly. I will write it as a separate message though, for ease of future reference. 

> 
>> Yeah, that's a good point. We were trying to self-contain the whole thing for ease of deployment, but I'm not sure that's a good approach. It's better if your EAD is in a real repository and Blacklight just presents it.
> 
> +1. Potential options could include using an XML database like eXist,
> or using our approach at Yale (where EAD finding aids are stored as
> datastreams in Fedora objects). I've been eager to look at rethinking
> our approach, especially given the availability of the Hydra codebase.

Absolutely. Also, this is one example I can think of where Fedora disseminators make perfect sense. Fedora can serve as your repository, and then each guide can be accessed as

http://your.repository.edu/fedora/get/YOUR_EAD_IDENTIFIER

and each section can be grabbed via 

http://your.repository.edu/fedora/get/YOUR_EAD_IDENTIFIER/bioghist (or whatever naming scheme makes sense to those with stronger opinions about EAD than I do) 
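
To make the URL scheme above concrete, here is a minimal Python sketch of how a client might build those two kinds of URLs. The function name `guide_url` and its parameters are hypothetical, and the path layout simply mirrors the examples in this message; a real Fedora dissemination URL may need additional path segments depending on how the disseminators are defined.

```python
from urllib.parse import quote


def guide_url(base, ead_id, section=None):
    """Build a URL for a whole guide, or for one section of it.

    Assumption: the layout is <base>/fedora/get/<ead_id>[/<section>],
    copied from the example URLs in this message.
    """
    # Keep ':' unescaped because repository identifiers often contain one.
    url = f"{base.rstrip('/')}/fedora/get/{quote(ead_id, safe=':')}"
    if section:
        url += "/" + quote(section, safe="")
    return url


if __name__ == "__main__":
    base = "http://your.repository.edu"
    print(guide_url(base, "YOUR_EAD_IDENTIFIER"))
    print(guide_url(base, "YOUR_EAD_IDENTIFIER", "bioghist"))
```

The section name is just a parameter here, so whatever naming scheme people settle on would not change the client code.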

What I'd love to see is each item represented and described independently in the repository, and then a full XML serialization of the EAD would just be constructed on the fly, bringing in at serialization time any objects that belong in a given section of the document.
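
As a rough illustration of that on-the-fly assembly, the Python sketch below stitches independently stored section fragments into one document using only the standard library. Everything here is an assumption for illustration: the function name `build_ead`, the idea that each section arrives as a small XML string, and the skeleton itself, which omits the `eadheader` and so is not a valid EAD instance.

```python
import xml.etree.ElementTree as ET


def build_ead(title, fragments):
    """Assemble a skeletal EAD document from separately stored sections.

    Assumption: `fragments` is an ordered list of XML strings, one per
    section (e.g. a <bioghist> or <scopecontent> element), as they might
    be fetched from the repository at serialization time.
    """
    ead = ET.Element("ead")
    archdesc = ET.SubElement(ead, "archdesc", {"level": "collection"})
    did = ET.SubElement(archdesc, "did")
    ET.SubElement(did, "unittitle").text = title
    for fragment in fragments:
        # Each section is parsed on its own, so it can be edited and
        # versioned independently of the full document.
        archdesc.append(ET.fromstring(fragment))
    return ET.tostring(ead, encoding="unicode")


if __name__ == "__main__":
    sections = [
        "<bioghist><p>Biographical note goes here.</p></bioghist>",
        "<scopecontent><p>Scope note goes here.</p></scopecontent>",
    ]
    print(build_ead("Example Papers", sections))
```

Because the full document is only ever a derived product, updating one section means updating one small object rather than re-editing the whole finding aid.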

Institutionally, the biggest problem with EAD is version control and workflow for keeping the documents up to date. I think splitting things up into separate objects and only constructing the full EAD document as needed is a good potential solution to this.

Bess