On 02/24/2012 09:25 AM, Kyle Banerjee wrote:
>> We use OAI-PMH, and while we often see (usually general and sometimes
>> contradictory) statements about what we can/can't do with the contents of a
>> repository (or a specific record), it feels like there isn't a nice simple
>> mechanism for a repository to say "don't harvest this bit".
>>
>
> I would argue there is -- the whole point of OAI-PMH is to make stuff
> available for harvesting. If someone goes to the trouble of making things
> available via a protocol that exists only to make things harvestable and
> then doesn't want it harvested, you can dismiss them as being totally
> mental.
The M in PMH still stands for Metadata, right? So opening an OAI-PMH
server implicitly says you're willing to share metadata. I can certainly
sympathize with sites wanting to do that but not necessarily wanting to
offer anything more than "normal" end-user access to full text.

That said, in a world with unfriendly bots, the repository should still be
making informed choices about controlling full-text crawlers (robots.txt,
meta tags, HTTP cache directives, etc.).
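
For what it's worth, here is a rough sketch of the robots.txt option, checked
with Python's standard urllib.robotparser. The paths (/oai, /fulltext/) and
the "ExampleHarvester" agent name are made up for illustration; a real
repository would have its own layout. Keep in mind robots.txt is only
advisory, so it helps with well-behaved crawlers and not the unfriendly ones.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a repository. The paths are assumptions:
# /oai stands in for the OAI-PMH endpoint, /fulltext/ for the full-text files.
ROBOTS_TXT = """\
User-agent: *
Disallow: /fulltext/
Allow: /oai
"""


def may_fetch(url_path, agent="ExampleHarvester"):
    """Return True if the sample robots.txt permits `agent` to fetch `url_path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url_path)


if __name__ == "__main__":
    # Metadata endpoint stays open; full text is off limits to crawlers.
    print("OAI endpoint allowed:", may_fetch("/oai"))
    print("Full text allowed:", may_fetch("/fulltext/12345.pdf"))
```

The metadata stays harvestable while polite crawlers are asked to leave the
full text alone, which is roughly the split described above.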
--
Thomas Dowling