>
> That's why I'd love to know whether the xISBN database uses a common
> identifier for each set of ISBNs, and whether (and I know 'pretty
> please' is a poor justification for changing an API) it might be exposed
> for this reason.
>
Hopefully the OCLC people can answer that. It might be in the work Andy
suggested yesterday. One idea I had yesterday was that if you don't
care that much about the id internally, you could use an auto-increment.
To clarify, we'll assume that any isbn in a set will return the same set
in xISBN.
I.e., asking for the isbns related to a returns a, b and c. Asking for b or c
should return a, b and c.
So we can do as Andy suggested and start building our table by taking the
set of all current isbns, normalized a bit I'd imagine.
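What "normalized a bit" means is up to you. As a rough sketch only, assuming it is enough to strip separators and uppercase the check character (the names normalize_isbn and unique_isbns are made up for illustration):

```python
def normalize_isbn(raw):
    """Keep only digits and the 'X' check character, uppercased.

    Assumption: this is all the normalization needed. It does not
    validate check digits or convert between ISBN-10 and ISBN-13.
    """
    return "".join(ch for ch in raw.upper() if ch.isdigit() or ch == "X")


def unique_isbns(raw_isbns):
    """Return the normalized ISBNs in first-seen order, without duplicates."""
    seen = set()
    result = []
    for raw in raw_isbns:
        isbn = normalize_isbn(raw)
        if isbn and isbn not in seen:
            seen.add(isbn)
            result.append(isbn)
    return result
```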
In a computationally-expensive method:
Start with the first isbn (x) and get the set of isbns from xISBN that is
related (A). Iterate over every member of A, testing for the following:
has the member been assigned to a group already? If it has, stop the loop and
assign x to the same group. If none in A have been assigned a group,
start a new group and add x.
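A rough Python sketch of the loop above, purely as illustration. fetch_related is a hypothetical callable standing in for whatever xISBN lookup you use (it takes one isbn and returns the related ones), and the counter plays the role of the auto-increment id:

```python
from itertools import count


def assign_groups(isbns, fetch_related):
    """Map each isbn in our own list to a group id.

    fetch_related is an assumed stand-in for the xISBN request; it is
    not a real xISBN client function.
    """
    group_of = {}        # isbn -> group id
    next_id = count(1)   # stands in for an auto-increment column

    for x in isbns:
        if x in group_of:
            continue     # duplicate entry in our list, already handled
        group = None
        for member in fetch_related(x):   # this is the set A
            if member in group_of:
                group = group_of[member]  # found an assigned member, stop
                break
        if group is None:
            group = next(next_id)         # nobody in A has a group yet
        group_of[x] = group
    return group_of
```

This makes one lookup per isbn in the list, which is where the cost comes from.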
You'll have to do this every once in a while to make sure you're getting
all the new books.
Hopefully this makes up for the advice I gave yesterday ;). I'm
sure you can come up with a better algorithm though;
something about the backward lookup every time makes me think that
there's a better way.
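One possible variant, not something tested against the real service: if the assumption holds that any isbn in a set returns the same set, you could assign the whole returned set to the group in one go and skip the lookup for every member seen later. Again, fetch_related is a hypothetical stand-in for the xISBN request:

```python
from itertools import count


def assign_groups_by_set(isbns, fetch_related):
    """Assign a group to every member of a returned set at once.

    Relies on the assumption that any member of a set returns the same
    set; fetch_related is an assumed stand-in, not a real client call.
    """
    group_of = {}
    next_id = count(1)

    for x in isbns:
        if x in group_of:
            continue                      # placed by an earlier set, no lookup
        related = list(fetch_related(x))
        if x not in related:
            related.append(x)
        # Safeguard: reuse a group if some member already has one.
        group = next((group_of[m] for m in related if m in group_of), None)
        if group is None:
            group = next(next_id)
        for member in related:
            group_of.setdefault(member, group)
    return group_of
```

Note that the result also contains isbns you don't hold yourself, so you may want to filter it against your own list before writing it to the table.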
P.S. Andy's right, normalization is a good, good thing. The only reason I
suggested looking at the costs was that I thought it would be a lot easier
than trying to come up with a method to generate unique ids for a "group".
Since my grasp of FRBR/xISBN is a little shaky, I'll avoid any specific
terminology.
(Like I said in my original email, having an identifier for groups is a
definite advantage.)