I know that there are institutions that have negotiated contracts for just
the content, sans interface.  But those that I know of have TONS of money
and are using a third-party interface that ingests the data for them.  I'm
not sure what the terms of those contracts were or how they get the data,
but it can be done.



On Wed, Jun 30, 2010 at 5:07 PM, Cory Rockliff <[log in to unmask]> wrote:

> We're looking at an infrastructure based on Marklogic running on Amazon
> EC2, so the scale of data to be indexed shouldn't actually be that big of an
> issue. Also, as I said to Jonathan, I only see myself indexing a handful of
> highly-relevant resources, so we're talking millions, rather than 100s of
> millions, of records.
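>
> Just to make the ingest step concrete, here's the kind of thing I have in
> mind -- a minimal sketch of pushing one record into MarkLogic over its
> REST endpoint (host, port, URI, and credentials are all hypothetical;
> assumes a MarkLogic version that ships the REST API, and Python's
> third-party "requests" library):
>
>     import requests
>     from requests.auth import HTTPDigestAuth
>
>     record_xml = "<record><title>Example title</title></record>"
>
>     # PUT the document at a URI of our choosing in the target database
>     resp = requests.put(
>         "http://localhost:8000/v1/documents",
>         params={"uri": "/vendor-a/record-1.xml"},
>         data=record_xml,
>         headers={"Content-Type": "application/xml"},
>         auth=HTTPDigestAuth("admin", "admin"),
>     )
>     resp.raise_for_status()
>
> Multiply that by a few million records and a handful of vendor feeds and
> that's the whole pipeline.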
>
>
> On 6/30/2010 4:22 PM, Walker, David wrote:
>
>> You might also need to factor in an extra server or three (in the cloud or
>> otherwise) into that equation, given that we're talking 100s of millions of
>> records that will need to be indexed.
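>>
>> Back-of-envelope on why (my numbers are assumptions, not vendor figures):
>>
>>     records = 300_000_000        # "100s of millions"
>>     kb_per_record = 1            # rough average for citation metadata
>>     raw_gb = records * kb_per_record / 1_000_000
>>     print(raw_gb)                # ~300 GB raw, before index overhead
>>
>> And that's before replication or the index structures themselves, which
>> is where the extra servers come in.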
>>
>>
>>
>>> companies like III and Ex Libris are the only ones with
>>> enough clout to negotiate access
>>>
>>>
>> I don't think III is doing any kind of aggregated indexing, hence their
>> decision to try and leverage APIs.  I could be wrong.
>>
>> --Dave
>>
>> ==================
>> David Walker
>> Library Web Services Manager
>> California State University
>> http://xerxes.calstate.edu
>> ________________________________________
>> From: Code for Libraries [[log in to unmask]] On Behalf Of Jonathan
>> Rochkind [[log in to unmask]]
>> Sent: Wednesday, June 30, 2010 1:15 PM
>> To: [log in to unmask]
>> Subject: Re: [CODE4LIB] DIY aggregate index
>>
>> Cory Rockliff wrote:
>>
>>
>>> Do libraries opt for these commercial 'pre-indexed' services simply
>>> because they're a good value proposition compared to all the work of
>>> indexing multiple resources from multiple vendors into one local index,
>>> or is it that companies like III and Ex Libris are the only ones with
>>> enough clout to negotiate access to otherwise-unavailable database
>>> vendors' content?
>>>
>>>
>>>
>> A little bit of both, I think. A library probably _could_ negotiate
>> access to that content... but it would be a heck of a lot of work. Once
>> the staff time for negotiations is factored in, the commercial service
>> becomes a good value proposition, regardless of how much the licensing
>> would cost you.  And then there's the staff time to actually ingest,
>> normalize, and troubleshoot data-flows for all that stuff on a regular
>> basis -- I've heard stories of libraries that tried to do that in the
>> early 90s and it was nightmarish.
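>>
>> To make "ingest and normalize" concrete: every vendor feed needs its own
>> mapping into your common schema. A hypothetical sketch (field names
>> invented; every real feed differs, which is exactly the ongoing burden):
>>
>>     def normalize(vendor, raw):
>>         """Map one vendor's raw record (a dict) to a common schema."""
>>         if vendor == "vendor_a":
>>             return {"title": raw["TI"],
>>                     "author": raw["AU"],
>>                     "year": raw["PY"]}
>>         if vendor == "vendor_b":
>>             return {"title": raw["title"],
>>                     "author": raw["creator"],
>>                     "year": raw["date"][:4]}
>>         raise ValueError("no mapping for vendor %r" % vendor)
>>
>> Every silent schema change on the vendor side breaks one of these
>> branches -- that's the "troubleshoot data-flows" part.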
>>
>> So, actually, I guess I've arrived at convincing myself it's mostly a
>> "good value proposition," in that a library probably can't afford to do
>> that on their own, with or without licensing issues.
>>
>> But I'd really love to see you try anyway -- maybe I'm wrong. :)
>>
>>
>>
>>> Can I assume that if a database vendor has exposed their content to me
>>> as a subscriber, whether via Z39.50 or a web service or whatever, that
>>> I'm free to cache and index all that metadata locally if I so choose? Is
>>> this something to be negotiated on a vendor-by-vendor basis, or is it an
>>> impossibility?
>>>
>>>
>>>
>> I doubt you can assume that.  I don't think it's an impossibility.
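>>
>> Mechanically, the fetching is the easy part. A sketch with the PyZ3950
>> library (host, database, and query are placeholders, and store_locally
>> is a hypothetical helper; whether you may *keep* what comes back is the
>> contract question):
>>
>>     from PyZ3950 import zoom
>>
>>     conn = zoom.Connection("z3950.example.org", 210)
>>     conn.databaseName = "default"
>>     conn.preferredRecordSyntax = "USMARC"
>>
>>     results = conn.search(zoom.Query("CCL", 'ti="ceramics"'))
>>     for rec in results:
>>         store_locally(rec.data)   # the caching step -- the part to negotiate
>>     conn.close()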
>>
>> Jonathan
>
>
> --
> Cory Rockliff
> Technical Services Librarian
> Bard Graduate Center: Decorative Arts, Design History, Material Culture
> 18 West 86th Street
> New York, NY 10024
> T: (212) 501-3037
> [log in to unmask]
>