LISTSERV 16.5 - CODE4LIB Archives

Thanks everyone! This was very interesting (and at various points quite
amusing). You know how sometimes you don't know you need something until
you learn about the possibilities? Well, I still don't need web scraping
lol but I'll definitely be saving this thread for the future because thanks
to you all I now know better what's possible and in what situations it
might be the best tool. I hope it was helpful to others as well.

Brad

(hope others continue to share their stories too..)

On Wed, Nov 29, 2017 at 11:27 AM, Charles Ed Hill <[log in to unmask]>
wrote:

> I scrape data from a large book retailer we sometimes order from to make
> our acquisitions workflow a bit easier. In particular, I ask people to make
> wishlists and scrape those (I can do individual items too, but the large
> retailer doesn't like that, even though we're trying to purchase from
> them), check those titles against our holdings, then line them up with some
> info in a spreadsheet for people. It is NOT the way I would suggest going
> about things if you can help it, as pages change frequently, large
> retailers block IP addresses, large retailers time out a lot, and so on,
> but it will do when you're caught between a rock and a hard place.
>
> Not for libraries, but I also recently had to put together a scraper when
> looking for daycares in Massachusetts as the page listing daycares was
> helpful but really, really clunky. Saved me quite a bit of sanity.
>
> On Wed, Nov 29, 2017 at 8:50 AM, Ross Singer <[log in to unmask]>
> wrote:
>
> > Due to the absence of APIs, we have to scrape III WebBridge and EBSCO
> > LinkSource link resolver results to determine electronic holdings for
> > things.
> >
> > Neither of them makes it particularly easy, since they don't provide many
> > semantic clues in the markup as to what you're looking at and there are all
> > kinds of other conditions you have to account for (e.g. direct linking to
> > certain sources, etc.).
> >
> > It's generally one of those things I avoid at all costs since the pages you
> > want/need to scrape are the most likely to be the most frustrating to work
> > with.
> >
> > -Ross.
> >
> > On Tue, Nov 28, 2017 at 1:26 PM Brad Coffield <[log in to unmask]>
> > wrote:
> >
> > > I think there are likely a lot of possibilities out there and was hoping to
> > > hear examples of web scraping for libraries. Your example might just
> > > inspire me or another reader to do something similar. At the very least,
> > > the ideas will be interesting!
> > >
> > > Brad
> > >
> > >
> > > --
> > > Brad Coffield, MLIS
> > > Assistant Information and Web Services Librarian
> > > Saint Francis University
> > > 814-472-3315
> > > [log in to unmask]
> > >
> >
>



-- 
Brad Coffield, MLIS
Assistant Information and Web Services Librarian
Saint Francis University
814-472-3315
[log in to unmask]