Yeah, that's a good point, Eric.

I am, however, worried that I can't do what I want to do without
exceeding 500 queries a day, and my institution is not going to be
willing to pay for it. So I'm interested in exploring other
opportunities. (Does Umlaut really not exceed 500 queries a day, for
instance?)

I am also interested in publicly shared, open-source algorithms
for workset grouping that we can all collectively work on to improve
the state of our collective knowledge. I am unhappy that 'our'
collective institution (OCLC) keeps the products of its research (such
as the workset algorithm currently being used, but there are other
significant examples many of us know of) as trade secrets, and am
interested in a research project that would not do so.

If 'our' collective institution, OCLC, would share the results of its
research as open-source algorithms, and would provide the services I
need at more affordable costs, then of course neither of those would be
necessary. One option is certainly spending time on trying to lobby OCLC
to behave differently. Another option is creating an alternative. Both
are, to me, legitimate options.

Jonathan

Eric Hellman wrote:
> Jonathan,
>
> It's worth noting that OCLC *is* the "we" you are talking about.
>
> OCLC member libraries contribute resources to do exactly what you
> suggest, and to do it in a way that is sustainable for the long term.
> Worldcat is created and maintained by libraries and by librarians.
> I'm the last to suggest that OCLC is the best possible instantiation
> of libraries-working-together, but we do try.
>
>
> Eric
>
>
>
> At 3:01 PM -0400 5/9/07, Jonathan Rochkind wrote:
>> 2) More interesting---OCLC's _initial_ work set grouping algorithm is
>> public. However, we know they've done a lot of additional work to
>> fine-tune the work set grouping algorithms.
>> (http://www.frbr.org/2007/01/16/midwinter-implementers).  Some of these
>> algorithms probably take advantage of all the cool data OCLC has that we
>> don't, okay.
>>
>> But how about we start working to re-create this algorithm? "Re-create"
>> isn't a good word, because we aren't going to violate any NDA's, we're
>> going to develop/invent our own algorithm, but this one is going to be
>> open source, not a trade secret like OCLC's.
>>
>> So we develop an algorithm on our own, and we run that algorithm on our
>> own data. Our own local catalog. Union catalogs. Conglomerations of
>> different catalogs that we do ourselves. Even reproductions of the OCLC
>> corpus (or significant subsets thereof) that we manage to assemble in
>> ways that don't violate copyright or license agreements.
>>
>> And then we've got our own workset grouping service. Which is really all
>> xISBN is.  What is OCLC providing that is so special? Well, if what I've
>> just outlined above is so much work that we _can't_ pull it off, then I
>> guess we've got to pay OCLC, and if we are willing to do so (rather than go
>> without the service), then I guess OCLC has correctly pegged their
>> market price.
>>
>> But our field is not a healthy field if all research is being done by
>> OCLC and other vendors. We need research from other places, we need
>> research that produces public domain results, not proprietary trade
>> secrets.
>>
>
> --
>
> Eric Hellman, Director
> OCLC Openly Informatics Division
> [log in to unmask]
> 2 Broad St., Suite 208, Bloomfield, NJ 07003
> tel 1-973-509-7800 fax 1-734-468-6216
> http://openly.oclc.org/1cate/      1 Click Access To Everything
>

--
Jonathan Rochkind
Sr. Programmer/Analyst
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu