We're looking at an infrastructure based on MarkLogic running on Amazon
EC2, so the scale of data to be indexed shouldn't actually be that big
an issue. Also, as I said to Jonathan, I only see myself indexing a
handful of highly relevant resources, so we're talking millions, rather
than hundreds of millions, of records.
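For what it's worth, here's a minimal sketch of the harvest-and-cache step I have in mind, assuming a vendor exposes a standard SRU (Search/Retrieve via URL) endpoint returning Dublin Core. The endpoint URL and the canned response below are hypothetical, just to show the shape of the request and the parsing:

```python
# Sketch: build an SRU 1.2 searchRetrieve URL and parse a (canned) response.
# The base URL and sample record are made up for illustration.
import urllib.parse
import xml.etree.ElementTree as ET

SRU_NS = "http://www.loc.gov/zing/srw/"
DC_NS = "http://purl.org/dc/elements/1.1/"

def sru_query_url(base, query, start=1, maximum=50):
    """Construct a standard SRU 1.2 searchRetrieve GET URL."""
    params = {
        "version": "1.2",
        "operation": "searchRetrieve",
        "query": query,
        "startRecord": start,
        "maximumRecords": maximum,
        "recordSchema": "dc",
    }
    return base + "?" + urllib.parse.urlencode(params)

def extract_titles(sru_xml):
    """Pull Dublin Core titles out of an SRU response body."""
    root = ET.fromstring(sru_xml)
    return [el.text for el in root.iter("{%s}title" % DC_NS)]

# Canned response standing in for a live endpoint:
sample = """<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <numberOfRecords>1</numberOfRecords>
  <records><record><recordData>
    <dc xmlns="http://purl.org/dc/elements/1.1/">
      <title>Decorative Arts Quarterly</title>
    </dc>
  </recordData></record></records>
</searchRetrieveResponse>"""

url = sru_query_url("https://vendor.example.org/sru", 'dc.title = "design"')
total = int(ET.fromstring(sample).find("{%s}numberOfRecords" % SRU_NS).text)
titles = extract_titles(sample)
```

In practice you'd page through startRecord until numberOfRecords is exhausted and write each batch into the local index, but the request/parse loop above is the whole of the protocol side.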
On 6/30/2010 4:22 PM, Walker, David wrote:
> You might also need to factor in an extra server or three (in the cloud or otherwise) into that equation, given that we're talking 100s of millions of records that will need to be indexed.
>
>
>> companies like iii and Ex Libris are the only ones with
>> enough clout to negotiate access
>>
> I don't think III is doing any kind of aggregated indexing, hence their decision to try to leverage APIs instead. I could be wrong.
>
> --Dave
>
> ==================
> David Walker
> Library Web Services Manager
> California State University
> http://xerxes.calstate.edu
> ________________________________________
> From: Code for Libraries [[log in to unmask]] On Behalf Of Jonathan Rochkind [[log in to unmask]]
> Sent: Wednesday, June 30, 2010 1:15 PM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] DIY aggregate index
>
> Cory Rockliff wrote:
>
>> Do libraries opt for these commercial 'pre-indexed' services simply
>> because they're a good value proposition compared to all the work of
>> indexing multiple resources from multiple vendors into one local index,
>> or is it that companies like iii and Ex Libris are the only ones with
>> enough clout to negotiate access to otherwise-unavailable database
>> vendors' content?
>>
>>
> A little bit of both, I think. A library probably _could_ negotiate
> access to that content... but it would be a heck of a lot of work. When
> the staff time for those negotiations is factored in, it becomes a good
> value proposition, regardless of how much the licensing would cost you.
> And yeah, then there's the staff time to actually ingest and normalize
> and troubleshoot data-flows for all that stuff on a regular basis -- I've
> heard stories of libraries that tried to do that in the early 90s, and it
> was nightmarish.
>
> So, actually, I guess I've arrived at convincing myself it's mostly
> "good value proposition", in that a library probably can't afford to do
> that on their own, with or without licensing issues.
>
> But I'd really love to see you try anyway, maybe I'm wrong. :)
>
>
>> Can I assume that if a database vendor has exposed their content to me
>> as a subscriber, whether via z39.50 or a web service or whatever, that
>> I'm free to cache and index all that metadata locally if I so choose? Is
>> this something to be negotiated on a vendor-by-vendor basis, or is it an
>> impossibility?
>>
>>
> I doubt you can assume that. I don't think it's an impossibility.
>
> Jonathan
> ---
> [This E-mail scanned for viruses by Declude Virus]
>
--
Cory Rockliff
Technical Services Librarian
Bard Graduate Center: Decorative Arts, Design History, Material Culture
18 West 86th Street
New York, NY 10024
T: (212) 501-3037
[log in to unmask]