Brett, did you ask the folks at the Large University Library if they could
set something up for you? I don't have a good sense of how other
institutions deal with things like this.
In any case, I know I'd much rather talk about setting up an API or a
nightly dump or something rather than have my analytics (and bandwidth!)
blown by a screen scraper. I might say "no," but at least it would be an
informed "no" :-)
On Tue, Nov 28, 2017 at 2:08 PM, Brett <[log in to unmask]> wrote:
> I leveraged the IMPORTXML() and XPath features in Google Sheets to pull
> information from a large university website to help create a set of weeding
> lists for a branch campus. They needed extra details about what was in
> off-site storage and what was held at the central campus library.
>
> This was very much like Jason's FIFO API: the central reporting group had
> sent me a spreadsheet with horrible data that I would have had to sort out
> almost completely manually, but the call numbers were pristine. I used the
> call numbers as a key to query the catalog, with limits for each campus I
> needed to check, and the query dumped all of the necessary content
> (holdings, dates, etc.) into the spreadsheet.
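>
> A minimal sketch of the kind of formula this amounted to (the catalog URL
> and the XPath here are made up; the real ones depend on how the catalog
> marks up its holdings table), with the call number sitting in A2:
>
> =IMPORTXML("https://catalog.example.edu/search?campus=BRANCH&cn="&A2, "//table[@id='holdings']//tr/td")
>
> IMPORTXML() puts each matching node in its own cell, so the XPath needs to
> be tight enough that each call number brings back only the fields you care
> about.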
>
> I've also used Feed43 to modify certain RSS feeds and scrape websites so
> that they display only the content I want.
>
> Brett Williams
>
>
> On Tue, Nov 28, 2017 at 1:24 PM, Brad Coffield <[log in to unmask]> wrote:
>
> > I think there are likely a lot of possibilities out there, and I was hoping
> > to hear examples of web scraping for libraries. Your example might just
> > inspire me or another reader to do something similar. At the very least,
> > the ideas will be interesting!
> >
> > Brad
> >
> >
> > --
> > Brad Coffield, MLIS
> > Assistant Information and Web Services Librarian
> > Saint Francis University
> > 814-472-3315
> > [log in to unmask]
> >
>
--
Bill Dueber
Library Systems Programmer
University of Michigan Library