Matching on author, title, and publication year won't get you many false
positives, but might get you lots of false negatives.
It's certainly true that there is no good "naive" approach to matching
without identifiers and getting a good balance of minimal false
positives and false negatives. There are tricky ways to approach it that I
haven't really tried yet, but you can sometimes get closer to "good enough"
than you think with just author/title or author/title/year.
Depends on the source of your data too. If you have an AACR2/NAF
controlled heading for an author, instead of just a free-text author
entry field, that certainly makes it easier.
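
As a rough sketch of what plain author/title/year matching could look like
(not something I've tested against real data), here is a minimal Python
example. The record layout (dicts with "author", "title", and "year" keys)
and the function names are made up for illustration, and it assumes the
author string puts the surname first, as in "Surname, Forenames".

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def match_key(author, title, year=None):
    """Build a comparison key from author surname, title, and optional year."""
    # Assumption: the surname is everything before the first comma.
    surname = normalize(author.split(",")[0])
    key = (surname, normalize(title))
    if year is not None:
        key += (str(year).strip(),)
    return key


def same_work(a, b, use_year=True):
    """Return True when two hypothetical citation dicts share a match key."""
    year_a = a.get("year") if use_year else None
    year_b = b.get("year") if use_year else None
    return match_key(a["author"], a["title"], year_a) == match_key(
        b["author"], b["title"], year_b
    )


rec1 = {"author": "Melville, Herman", "title": "Moby-Dick; or, The Whale", "year": 1851}
rec2 = {"author": "MELVILLE, H.", "title": "Moby Dick, or the whale.", "year": "1851"}
rec3 = {"author": "Melville, Herman", "title": "Moby Dick, or The Whale", "year": 1892}

print(same_work(rec1, rec2))                  # same surname, title, and year
print(same_work(rec1, rec3))                  # year differs
print(same_work(rec1, rec3, use_year=False))  # year ignored
```

Including the year keeps a later edition from matching (fewer false
positives) but also rejects records where the year is missing or recorded
differently (more false negatives). Dropping it flips that tradeoff. With a
controlled heading you could compare the whole normalized heading instead
of just the surname.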
Jonathan
Kyle Banerjee wrote:
>> So, the purpose of this would be to discover where a given item represented
>> by the OpenURL was held. A secondary purpose would be as a source of
>> bibliographic citation information. This could be quite a useful discovery
>> tool, especially for materials that are not widely held.
>>
>>
>
> Still trying to wrap my mind around your use case. First of all, are you
> thinking about journals as well as other materials? Or do you just want to
> find matches based on title, author, and other OpenURL elements?
>
> Be aware that for any search that doesn't involve a known identifier, you're
> going to run into major issues with false matches and duplicate entries.
> Also, quality/completeness of data in records for obscure items is often
> poor.
>
> kyle
>
>