On Mon, Feb 27, 2012 at 8:31 AM, Diane Hillmann wrote:

> On Mon, Feb 27, 2012 at 5:25 AM, Owen Stephens wrote:
>
> > This issue is certainly not unique to VT - we've come across this as
> > part of our project. While the OAI-PMH record may point at the PDF, it
> > can also point to an intermediary page. This seems to be standard
> > practice in some instances - I think because there is a desire, or even
> > requirement, that a user should see the intermediary page (which may
> > contain rights information etc.) before viewing the full-text item.
> > There may also be an issue where multiple files exist for the same
> > item - maybe several data files and a pdf of the thesis attached to the
> > same metadata record - as the metadata via OAI-PMH may not describe
> > each asset.
>
> This has been an issue since the early days of OAI-PMH, and many large
> providers provide such intermediate pages (arxiv.org, for instance). The
> other issue driving providers towards intermediate pages is that it
> allows them to continue to derive statistics from usage of their
> materials, which direct access URIs and multiple web caches don't. For
> providers dependent on external funding, this is a biggie.

Why do you place direct access URIs and multiple web caches into the same
category? I follow your argument re: usage statistics for web caches, but as
long as the item remains hosted in the repository, direct access URIs should
still be counted (provided proper cache-control headers are sent). Perhaps
it would require server-side statistics rather than client-based GA.

Also, it seems to me that, except for Google, full-text indexing engines
don't necessarily want to become providers of cached copies (certainly the
discovery systems currently provided commercially don't, AFAIK).

- Godmar