I also use the IMPORTXML function in Google Sheets, as well as the Right
Click > Scrape Similar option in Chrome (I think it's an extension I had to
add). For Google Sheets I also use the Wikidata & Wikipedia Tools extension,
which can be helpful for authority work: if a name you're working with
matches a Wikidata entry, you can query out the life dates. If you're
satisfied it's a correct match, you can query out more standard identifiers,
such as Library of Congress ID, VIAF ID, etc. I've used these tricks/tools in
cataloging (to avoid unnecessarily transcribing large sets of titles when
someone else already has). And I use them often for grabbing biographical
info (often from poorly structured sites) for PIC <http://pic.nypl.org>.
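
For anyone who wants to try the same kind of authority lookup outside of Sheets, here is a minimal Python sketch. It is not what the Sheets extension does internally; it swaps in a direct call to the public Wikidata SPARQL endpoint. The properties used are P569 (date of birth), P570 (date of death), P244 (Library of Congress authority ID) and P214 (VIAF ID). The function names and the User-Agent string are my own placeholders, and an exact match on the English label is only a starting point: you still need to confirm by eye that the match is correct.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service endpoint.
WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"


def build_query(name: str) -> str:
    """Build a SPARQL query for humans whose English label equals `name`."""
    # Escape backslashes and double quotes so the name is a valid SPARQL literal.
    safe = name.replace("\\", "\\\\").replace('"', '\\"')
    return f'''
SELECT ?item ?birth ?death ?lccn ?viaf WHERE {{
  ?item rdfs:label "{safe}"@en ;
        wdt:P31 wd:Q5 .
  OPTIONAL {{ ?item wdt:P569 ?birth . }}
  OPTIONAL {{ ?item wdt:P570 ?death . }}
  OPTIONAL {{ ?item wdt:P244 ?lccn . }}
  OPTIONAL {{ ?item wdt:P214 ?viaf . }}
}} LIMIT 10
'''.strip()


def flatten(results: dict) -> list:
    """Turn SPARQL JSON results into a list of plain {variable: value} dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


def lookup(name: str) -> list:
    """Run the query over the network and return candidate matches."""
    params = urllib.parse.urlencode({"query": build_query(name), "format": "json"})
    request = urllib.request.Request(
        WIKIDATA_SPARQL + "?" + params,
        # Placeholder User-Agent; replace with something that identifies you.
        headers={"User-Agent": "authority-lookup-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as response:
        return flatten(json.load(response))
```

Calling lookup() with a name returns zero or more candidate rows; several rows for one name means the label is ambiguous and needs checking before you copy any identifiers.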
David Lowe | The New York Public Library
Specialist II, Photography
Photographers' Identities Catalog <http://pic.nypl.org>
On Tue, Nov 28, 2017 at 2:08 PM, Brett wrote:
> I leveraged the IMPORTXML() and xpath features in Google Sheets to pull
> information from a large university website to help create a set of weeding
> lists for a branch campus. They needed extra details about what was in
> off-site storage and what was held at the central campus library.
> This was very much like Jason's FIFO API: the central reporting group had
> sent me a spreadsheet with horrible data that I would have had to sort out
> almost completely manually, but the call numbers were pristine. I used the
> call numbers as a key to query the catalog with limits for each campus I
> needed to check, and then it dumped all of the necessary content (holdings,
> dates, etc) into the spreadsheet.
> I've also used Feed43 as a way to modify certain RSS feeds and scrape
> websites to only display the content I want.
> Brett Williams
> On Tue, Nov 28, 2017 at 1:24 PM, Brad Coffield wrote:
> > I think there are likely a lot of possibilities out there and was hoping to
> > hear examples of web scraping for libraries. Your example might just
> > inspire me or another reader to do something similar. At the very least,
> > the ideas will be interesting!
> > Brad
> > --
> > Brad Coffield, MLIS
> > Assistant Information and Web Services Librarian
> > Saint Francis University
> > 814-472-3315
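
The call-number-keyed lookup Brett describes above can be sketched in Python roughly as follows. In Sheets the core of it is a formula of the form =IMPORTXML(url, xpath_query); this sketch swaps in the standard library instead. Everything specific here is an assumption made for illustration: the catalog URL, the "callnumber" and "location" parameter names, and the tr class="holding" markup are hypothetical, and a real catalog will differ. The standard-library parser also only handles well-formed markup, so messy real-world HTML would likely need a more forgiving parser.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Hypothetical catalog search endpoint and parameter names (assumptions).
CATALOG_SEARCH = "https://catalog.example.edu/search"


def search_url(call_number: str, campus: str) -> str:
    """Build a search URL keyed on call number and limited to one campus."""
    params = {"callnumber": call_number, "location": campus}
    return CATALOG_SEARCH + "?" + urllib.parse.urlencode(params)


def extract_holdings(markup: str) -> list:
    """Pull the cell text out of each holdings row (assumed markup)."""
    root = ET.fromstring(markup)
    rows = []
    # Limited XPath, similar in spirit to an IMPORTXML xpath_query.
    for row in root.findall(".//tr[@class='holding']"):
        rows.append([(cell.text or "").strip() for cell in row.findall("td")])
    return rows


def build_rows(call_numbers, campuses, fetch) -> list:
    """Query each call number at each campus and collect spreadsheet rows.

    `fetch` is any callable that takes a URL and returns the page markup,
    so the network step can be swapped out or faked.
    """
    rows = []
    for call_number in call_numbers:
        for campus in campuses:
            markup = fetch(search_url(call_number, campus))
            for holding in extract_holdings(markup):
                rows.append([call_number, campus, *holding])
    return rows
```

Passing the fetch step in as a parameter keeps the sketch usable without a live catalog: you can hand it saved pages first, check the rows, and only then point it at the real site.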