Good call. I ended up rewriting the bookmarklet to look at NYT's <meta> tags rather than the body text, since they are more uniform and usually contain the info we need. Their use is fairly consistent across the archives for the years I quickly tested (2003-present).

The script now looks for a published title in the meta tags and, if it doesn't find one, falls back to the headline used in the article. I changed the date search to a fuzzy match by month and year rather than by exact date, since the pubdate can be inconsistent with Lexis's.

This is highly unscientific and not 100% accurate, so the bookmarklet's definitely a bit of a hack, but it should work *most* of the time for articles that made it into the print version.

https://gist.github.com/944809#file_nyt_lexis_bookmarklet_meta_tags.js

--
Erin White
Web Applications Developer, VCU Libraries
804-827-3552 | [log in to unmask] | http://library.vcu.edu/

From: Bob Duncan <[log in to unmask]>
To: [log in to unmask]
Date: 04/29/2011 10:42 AM
Subject: Re: [CODE4LIB] NY Times Bookmarklet
Sent by: Code for Libraries <[log in to unmask]>

>Date: Wed, 27 Apr 2011 09:10:20 -0400
>From: "Van Mil, James (vanmiljf)" <[log in to unmask]>
>Subject: NY Times Bookmarklet
>. . .
>However, every article at the web version of the NY Times that was
>also published in the print version includes a reference to the
>article from the print edition, including date, page number, and
>print version title (information which is all still accessible in
>the page source when the paywall blocks access).

I wish this were true, but unfortunately, it's not. Not every reference to the print version includes the print version headline. In fact, it appears that including the print headline is a fairly recent addition to the Times Website. (Very unscientific searching suggests it started within the last few weeks.)

I wonder if it might make more sense to grab the author's name and pass that with the print pub date to PQ/LexisNexis instead -- most articles seem to include a byline. Or grab the beginning sentence and pass that. (You'd have to get rid of any anchor elements.)

It also appears that not every article that's published in print includes a reference to the print version in the Web version, but most seem to.

Bob Duncan

~!~!~!~!~!~!~!~!~!~!~!~!~
Robert E. Duncan
Systems Librarian
Lafayette College
Easton, PA 18042
[log in to unmask]
http://library.lafayette.edu/
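
For readers following the thread, here is a rough sketch of the ideas discussed above: a title lookup that prefers meta tags and falls back to the article headline, a month-and-year window for the fuzzy date match, and a first-sentence extractor that drops anchor elements. It is not the code from the gist. The meta names (hdl_p, hdl), the YYYYMMDD date format, and all function names are assumptions made for illustration, and the sentence splitter is deliberately naive (it will break on abbreviations such as "Mr.").

```javascript
// Sketch only. Meta names are assumed, not confirmed against the gist or the site.
const TITLE_META_NAMES = ["hdl_p", "hdl"]; // assumed: print headline first, then web headline

// meta: plain object mapping meta name -> content. In a browser it could be
// built from document.querySelectorAll("meta[name]").
function pickTitle(meta, bodyHeadline) {
  for (const name of TITLE_META_NAMES) {
    const value = (meta[name] || "").trim();
    if (value) return value;
  }
  // No usable meta tag: fall back to the headline found in the article body.
  return (bodyHeadline || "").trim();
}

// Turn an assumed YYYYMMDD publication date into a whole-month window,
// so the search matches by month and year instead of by exact day.
function monthWindow(pdate) {
  const m = /^(\d{4})(\d{2})\d{2}$/.exec(pdate || "");
  if (!m) return null;
  const year = Number(m[1]);
  const month = Number(m[2]);
  if (month < 1 || month > 12) return null;
  // Day 0 of the following month is the last day of this month.
  const lastDay = new Date(Date.UTC(year, month, 0)).getUTCDate();
  const mm = String(month).padStart(2, "0");
  const dd = String(lastDay).padStart(2, "0");
  return { from: `${year}-${mm}-01`, to: `${year}-${mm}-${dd}` };
}

// Remove <a> tags but keep their link text, then return the first sentence.
function firstSentence(html) {
  const text = (html || "")
    .replace(/<a\b[^>]*>([\s\S]*?)<\/a>/gi, "$1")
    .trim();
  const m = /^[\s\S]*?[.!?](?=\s|$)/.exec(text);
  return m ? m[0] : text;
}
```

The functions take plain strings and objects rather than touching the page directly, so the same logic can be tried outside a browser before being wrapped into a bookmarklet.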