Do you think? I reckon it is just a few lines of code in a custom source parser... Only need to:

Check rft.id contains an http uri (regexp)
Define a fetchID based on this URI (possibly + date/other metadata)
Get a URL or null from a lookup service
Insert URL or rft_id value into rft.856

Simple!
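
The four steps above can be sketched roughly as follows. This is only an illustration of the logic in Python, not SFX parser code (a real source parser would be written against the SFX API, which is not shown here). The dictionary-based context object, the "rft.date" key, the "URI|date" fetch ID format and the in-memory LOOKUP table standing in for the lookup service are all assumptions made for the example.

```python
import re
from typing import Optional

# Hypothetical stand-in for the lookup service: fetch ID -> replacement URL.
LOOKUP = {
    "http://www.bbc.co.uk|2009-08-24": "http://archive.bbc.co.uk",
}

# Step 1: a simple test for an http(s) URI in rft.id.
HTTP_URI = re.compile(r"^https?://\S+$", re.IGNORECASE)


def make_fetch_id(uri: str, date: Optional[str] = None) -> str:
    """Step 2: build a lookup key from the URI, plus the date if one is given."""
    return f"{uri}|{date}" if date else uri


def lookup_url(fetch_id: str) -> Optional[str]:
    """Step 3: return a replacement URL, or None if the service has no entry."""
    return LOOKUP.get(fetch_id)


def parse_source(ctx: dict) -> dict:
    """Step 4: put the looked-up URL, or the original rft.id value, into rft.856."""
    rft_id = ctx.get("rft.id", "")
    if not HTTP_URI.match(rft_id):
        return ctx  # not a web resource, leave the request untouched
    fetch_id = make_fetch_id(rft_id, ctx.get("rft.date"))
    result = dict(ctx)  # copy so the caller's context object is not modified
    result["rft.856"] = lookup_url(fetch_id) or rft_id
    return result
```

Including the date in the fetch ID is what would let two references to the same URI resolve to different places, as discussed further down the thread; without a date the sketch simply falls back to the URI itself.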

Owen

Owen Stephens
TELSTAR Project Manager
Library and Learning Resources Centre
The Open University
Walton Hall
Milton Keynes, MK7 6AA

T: +44 (0) 1908 858701
F: +44 (0) 1908 653571
E: [log in to unmask]


> -----Original Message-----
> From: Code for Libraries [mailto:[log in to unmask]] On
> Behalf Of Jonathan Rochkind
> Sent: 15 September 2009 16:30
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] Implementing OpenURL for simple web resources
>
> Wait, are you really going to try to do this with _SFX_ too? I missed
> that part. Oh boy. Seriously, I think you are in for a world of
> painful hacky kludge.
>
> Rosalyn Metz wrote:
> > Owen,
> >
> > The reason I suggest a source parser rather than a target parser is
> > that handling the openurl based on the source rather than the target
> > should shave a bit of time off.  Attached is a slide i created (back
> > in the day when it was my job to create such slides...no i don't sit
> > around in my hole creating slides because i'm bored...although.....)
> > that shows the process an OpenURL goes through.
> >
> > So the source parser in this example would come into play before the
> > OpenURL metadata hits the SFX KB.  It would bypass the bottom half of
> > the slide completely and reduce any weird formatting that SFX might
> > try to do to the metadata with a value like website (if you tell sfx
> > you're looking for an article but you're really looking for a book it
> > sometimes ignores metadata unrelated to an article even though you
> > might actually need it).  if you never let it get to that point, then
> > you don't need to worry about that "feature" coming into play.
> >
> > Source parsers aren't used as frequently as they once were, but they
> > used to be a way to retrieve more metadata from databases that didn't
> > create useful openurls (not that many vendors create useful openurls
> > now...).  but if you go a hackish route you could use a source parser
> > like a redirect rather than using it to fetch more metadata.
> >
> > If none of this makes sense let me know and i can try to describe it
> > better off list so as not to bore people into oblivion.
> >
> > Rosalyn
> >
> >
> >
> >
> > On Tue, Sep 15, 2009 at 9:52 AM, O.Stephens <[log in to unmask]> wrote:
> >
> >> Thanks Rosalyn,
> >>
> >> As you say we could push a custom value into rft_genre. I'm a bit
> >> torn on this, as I guess I'm trying to do something that isn't
> >> 'hacky' - or at least not from the OpenURL end of it. It might be
> >> that this is just wishful thinking, and that I'm just trying to fool
> >> myself into thinking I'm 'sticking to the standard' when the
> >> likelihood of what I'm doing being transferable to other scenarios
> >> is zero (although Eric's comments make me hope not).
> >>
> >> Yes, we are using SFX. What I'm proposing on the SFX end as the path
> >> of least resistance is writing a source parser for our learning
> >> environment which can do a 'fetch' for an alternative URL, or use the
> >> primary URL, and put it in an SFX internal field rft_856. We can then
> >> use the existing Target Parser 856_URL which displays the contents of
> >> rft_856 in the menu. Combined with some logic which forces this as
> >> the only option under certain circumstances we can then push the user
> >> directly to the resulting URL.
> >>
> >> Owen
> >>
> >> Owen Stephens
> >> TELSTAR Project Manager
> >> Library and Learning Resources Centre
> >> The Open University
> >> Walton Hall
> >> Milton Keynes, MK7 6AA
> >>
> >> T: +44 (0) 1908 858701
> >> F: +44 (0) 1908 653571
> >> E: [log in to unmask]
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: Code for Libraries [mailto:[log in to unmask]] On Behalf
> >>> Of Rosalyn Metz
> >>> Sent: 15 September 2009 14:42
> >>> To: [log in to unmask]
> >>> Subject: Re: [CODE4LIB] Implementing OpenURL for simple web
> >>> resources
> >>>
> >>> you could force a timestamp if people don't include a date.
> >>>
> >>> and I like the idea of going to the Internet Archive of a website,
> >>> because then you're not having to get into the business of handling
> >>> www.bbc.co.uk differently than cnn.com and someblog.org.
> >>>
> >>> i also like the idea of using a redirect.  you could theoretically
> >>> write a source parser (i'm assuming you're using SFX based on what
> >>> you said about bX) that says if my rfr_id = mylocalid and the item
> >>> is a website (however you choose to identify the website...which if
> >>> you're writing your own source parser you could put website in the
> >>> rft_genre even though it's not technically a metadata format but you
> >>> just want your source parser to forward the url on anyway, so the
> >>> link resolver isn't actually going to do anything with it) bypass
> >>> everything and just direct to the internet archive of the website.
> >>>
> >>> all of this is of course kind of hackish...but really isn't the
> >>> whole thing hackish?  there were a few source parsers that would be
> >>> good models for writing something like this.
> >>> but i have no idea if they still exist because i haven't looked at
> >>> the back end of sfx in about a year.
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, Sep 15, 2009 at 5:12 AM, O.Stephens <[log in to unmask]> wrote:
> >>>
> >>>> I agree with this Rosalyn. The issue that Nate brought up was that
> >>>> the content at http://www.bbc.co.uk could change over time, and old
> >>>> content might be moved to another URI - http://archive.bbc.co.uk or
> >>>> something. So if course A references http://www.bbc.co.uk on
> >>>> 24/08/09, if the content that was on http://www.bbc.co.uk on
> >>>> 24/08/09 moves to http://archive.bbc.co.uk we can use the mechanism
> >>>> I propose to trap the links to http://www.bbc.co.uk and redirect to
> >>>> http://archive.bbc.co.uk. However, if at a later date course B
> >>>> references http://www.bbc.co.uk we have no way of knowing whether
> >>>> they mean the stuff that is currently on http://www.bbc.co.uk or the
> >>>> stuff that used to be on http://www.bbc.co.uk and is now on
> >>>> http://archive.bbc.co.uk - and we have a redirect that is being
> >>>> applied across the board.
> >>>
> >>>> Thinking about it, references are required to include a date of
> >>>> access when citing websites, so this is probably the best piece of
> >>>> information to use to know where to resolve to (and we can put this
> >>>> in the DC metadata). Whether this will just get too confusing is a
> >>>> good question - I'll have a think about this.
> >>>
> >>>> Owen
> >>>>
> >>>> PS using the date we could even consider resolving to the Internet
> >>>> Archive copy of a website if it was available - this might be
> >>>> useful I guess...
> >>>
> >>>> Owen Stephens
> >>>> TELSTAR Project Manager
> >>>> Library and Learning Resources Centre
> >>>> The Open University
> >>>> Walton Hall
> >>>> Milton Keynes, MK7 6AA
> >>>>
> >>>> T: +44 (0) 1908 858701
> >>>> F: +44 (0) 1908 653571
> >>>> E: [log in to unmask]
> >>>>
> >>>>
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Code for Libraries [mailto:[log in to unmask]] On Behalf
> >>>>> Of Rosalyn Metz
> >>>>> Sent: 14 September 2009 21:52
> >>>>> To: [log in to unmask]
> >>>>> Subject: Re: [CODE4LIB] Implementing OpenURL for simple web resources
> >>>
> >>>>> oops...just re-read original post s/professor/article
> >>>>>
> >>>>> also your link resolver should be creating a context object with
> >>>>> each request.  this context object is what makes the openurl
> >>>>> unique.  so if you want uniqueness for stats purposes i would
> >>>>> imagine the link resolver is already doing that (and just another
> >>>>> reason to use an rfr_id that you create).
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Mon, Sep 14, 2009 at 4:45 PM, Rosalyn Metz <[log in to unmask]> wrote:
> >>>>>
> >>>>>> Owen,
> >>>>>>
> >>>>>> rft_id isn't really meant to be a unique identifier (although it
> >>>>>> can be in situations like a pmid or doi).  are you looking for it
> >>>>>> to be? if so why?
> >>>>>>
> >>>>>> if professor A is pointing to http://www.bbc.co.uk and professor B
> >>>>>> is pointing to http://www.bbc.co.uk why do they have to have
> >>>>>> unique OpenURLs?
> >>>>>>
> >>>>>> Rosalyn
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Mon, Sep 14, 2009 at 4:41 PM, Eric Hellman <[log in to unmask]> wrote:
> >>>>>
> >>>>>>> Nate's point is what I was thinking about in this comment in my
> >>>>>>> original reply:
> >>>>>>> If you don't add DC metadata, which seems like a good idea, you'll
> >>>>>>> definitely want to include something that will help you to persist
> >>>>>>> your replacement record. For example, a label or description for
> >>>>>>> the link.
> >>>>>>>
> >>>>>>> I should also point out a solution that could work for some people
> >>>>>>> but not you - put rewrite rules in the gateways serving your
> >>>>>>> network. A bit dangerous and kludgy, but we've seen kludgier
> >>>>>>> things.
> >>>>>>>
> >>>>>>> On Sep 14, 2009, at 4:24 PM, O.Stephens wrote:
> >>>>>>>
> >>>>>>>> Nate has a point here - what if we end up with a commonly used URI
> >>>>>>>> pointing at a variety of different things over time, and so is
> >>>>>>>> used to indicate different content each time? However the problem
> >>>>>>>> with a 'short URL' solution (tr.im, purl etc), or indeed any
> >>>>>>>> locally assigned identifier that acts as a key, is that as
> >>>>>>>> described in the blog post you need prior knowledge of the short
> >>>>>>>> URL/identifier to use it. The only 'identifier' our authors know
> >>>>>>>> for a website is its URL - and it seems contrary for us to ask
> >>>>>>>> them to use something else. I'll need to think about Nate's
> >>>>>>>> point - is this common or an edge case? Is there any other
> >>>>>>>> approach we could take?
> >>>>>
> >>>>>>> Eric Hellman
> >>>>>>> President, Gluejar, Inc.
> >>>>>>> 41 Watchung Plaza, #132
> >>>>>>> Montclair, NJ 07042
> >>>>>>> USA
> >>>>>>>
> >>>>>>> [log in to unmask]
> >>>>>>> http://go-to-hellman.blogspot.com/
> >>>>>>>
> >>>>>>>
> >>>> The Open University is incorporated by Royal Charter (RC 000391),
> >>>> an exempt charity in England & Wales and a charity registered in
> >>>> Scotland (SC 038302).
> >>>
> >>
> >>
>


The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302).