I don’t know how well this applies to your specific use of screen-scraping, but for libraries’ broader use of crawlers to build archives, the Section 108 Study Group Recommendations are a good source of guidance (though not law). They propose specific copyright exceptions for libraries in regard to collecting and archiving “publicly accessible online content”. Their recommendations are clear and sensible, and they run from pages 80 to 87 of the report.
http://www.section108.gov/docs/Sec108StudyGroupReport.pdf
Tracy Seneca
California Digital Library
________________________________________
From: Code for Libraries [[log in to unmask]] on behalf of Nate Hill [[log in to unmask]]
Sent: Sunday, October 02, 2011 7:23 PM
To: [log in to unmask]
Subject: [CODE4LIB] screen scraping
A question: what are the 'rules' around screen scraping?
If one site doesn't offer an RSS feed and you want to grab (for example)
their weekly top ten list with a script and then redisplay it on another
site, is that bad form? Or even illegal?
Thanks-
Nate
--
Nate Hill
[log in to unmask]
http://www.natehill.net