Good call. I ended up rewriting the bookmarklet to look at NYT's <meta> 
tags rather than the body text, since the tags are more uniform and 
usually contain the info we need. They are used fairly consistently 
across the archives for the years I quickly tested (2003-present).

The script now looks for a published title in the meta tags and, if it 
doesn't find one, falls back to the headline used in the article. I also 
changed the date search to a fuzzy match by month and year rather than 
by exact date, since NYT's pubdate can be inconsistent with Lexis's.
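
For anyone who doesn't want to open the gist, here is a rough sketch of 
the lookup order described above. It is not the code from the gist. The 
meta tag names (hdl_p, hdl, pdate) and the YYYYMMDD date format are 
assumptions for illustration only, so check the page source before 
relying on them.

```javascript
// Hypothetical meta tag names; verify against the actual page source.
var PRINT_TITLE_KEYS = ['hdl_p']; // assumed: print-edition headline
var WEB_TITLE_KEYS = ['hdl'];     // assumed: headline used on the web article
var DATE_KEY = 'pdate';           // assumed: publication date as YYYYMMDD

// Collect <meta name="..." content="..."> pairs into a plain object.
function readMeta(doc) {
  var out = {};
  var tags = doc.getElementsByTagName('meta');
  for (var i = 0; i < tags.length; i++) {
    var name = tags[i].getAttribute('name');
    if (name) {
      out[name] = tags[i].getAttribute('content') || '';
    }
  }
  return out;
}

// Return the first non-empty value among the given keys, or null.
function firstValue(meta, keys) {
  for (var i = 0; i < keys.length; i++) {
    var value = meta[keys[i]];
    if (value) {
      return value;
    }
  }
  return null;
}

// Prefer the published (print) title, then fall back to the web headline.
function pickTitle(meta) {
  return firstValue(meta, PRINT_TITLE_KEYS) || firstValue(meta, WEB_TITLE_KEYS);
}

// Reduce a YYYYMMDD string to month and year for the fuzzy date match.
function monthAndYear(pdate) {
  var m = /^(\d{4})(\d{2})\d{2}$/.exec(pdate || '');
  return m ? { year: Number(m[1]), month: Number(m[2]) } : null;
}
```

In a bookmarklet this would be called as pickTitle(readMeta(document)) 
and monthAndYear(readMeta(document)[DATE_KEY]). Dropping the day is what 
makes the date match fuzzy: the search is limited to the month and year 
instead of a single day.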

This is highly unscientific and not 100% accurate, so the bookmarklet 
is definitely a bit of a hack, but it should work *most* of the time 
for articles that made it into the print version.

https://gist.github.com/944809#file_nyt_lexis_bookmarklet_meta_tags.js

--
Erin White
Web Applications Developer, VCU Libraries
804-827-3552 | [log in to unmask] | http://library.vcu.edu/




From:    Bob Duncan <[log in to unmask]>
To:      [log in to unmask]
Date:    04/29/2011 10:42 AM
Subject: Re: [CODE4LIB] NY Times Bookmarklet
Sent by: Code for Libraries <[log in to unmask]>



>Date:    Wed, 27 Apr 2011 09:10:20 -0400
>From:    "Van Mil, James (vanmiljf)" <[log in to unmask]>
>Subject: NY Times Bookmarklet
>. . .
>However, every article at the web version of the NY Times that was 
>also published in the print version includes a reference to the 
>article from the print edition, including date, page number, and 
>print version title (information which is all still accessible in 
>the page source when the paywall blocks access).


I wish this were true, but unfortunately, it's not.  Not every 
reference to the print version includes the print version 
headline.  In fact, it appears that including the print headline is a 
fairly recent addition to the Times Website.  (Very unscientific 
searching suggests it started within the last few weeks.)  I wonder 
if it might make more sense to grab the author's name and pass that 
with the print pub date to PQ/LexisNexis instead -- most articles 
seem to include a byline.  Or grab the beginning sentence and pass 
that.  (You'd have to get rid of any anchor elements.)  It also 
appears that not every article that's published in print includes a 
reference to the print version in the Web version, but most seem to.
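
A rough sketch of the byline and first-sentence idea, in case it helps. 
All function names here are hypothetical, and it assumes the byline is 
a plain "By ..." string and that the first paragraph's HTML is already 
in hand. How the result is passed to PQ/LexisNexis is left out.

```javascript
// Remove anchor tags but keep their link text.
function stripAnchors(html) {
  return html.replace(/<\/?a\b[^>]*>/gi, '');
}

// Drop a leading "By " from a byline such as "By JANE DOE".
function authorFromByline(byline) {
  return byline.replace(/^\s*by\s+/i, '').trim();
}

// Take text up to the first sentence-ending punctuation mark.
// Naive: an abbreviation such as "Mr." will cut the sentence short.
function firstSentence(text) {
  var trimmed = text.trim();
  var m = /^[\s\S]*?[.!?](?=\s|$)/.exec(trimmed);
  return m ? m[0] : trimmed;
}
```

For example, firstSentence(stripAnchors(paragraphHtml)) would give a 
search phrase with the links removed, where paragraphHtml is whatever 
the bookmarklet pulled from the first paragraph of the article.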

Bob Duncan


~!~!~!~!~!~!~!~!~!~!~!~!~
Robert E. Duncan
Systems Librarian
Lafayette College
Easton, PA  18042
[log in to unmask]
http://library.lafayette.edu/