Interesting project! But not what I had in mind. I’m looking to archive the actual pages, so I can refer to them (and possibly extract information from them).
On 13 January 2017 at 15:25:43, Schmitz Fuhrig, Lynda ([log in to unmask]) wrote:
Check out https://webrecorder.io/
Lynda Schmitz Fuhrig
Electronic Records Archivist
Digital Services Division
Smithsonian Institution Archives
Capital Gallery Building
600 Maryland Ave SW
Washington, DC 20024-2520
siarchives.si.edu <http://siarchives.si.edu/> | @SmithsonianArch
<https://twitter.com/smithsonianarch> | Facebook
<https://www.facebook.com/SmithsonianInstitutionArchives> | e-newsletter
On 1/13/17, 2:43 AM, "Code for Libraries on behalf of Alex Armstrong"
<[log in to unmask] on behalf of [log in to unmask]> wrote:
>Has anyone had to archive selected pages from a login-protected site? How
>did you do it?
>I've used the CLI tool httrack in the past for archiving sites. But in
>this case, accessing the pages requires logging in. There's some vague
>documentation about how to do this with httrack, but I haven't cracked it
>yet. (The instructions are better for the Windows version of the
>application, but I only have ready access to a Mac.)
>Before I go on a wild goose chase, any help would be much appreciated.
>Web Developer & Digital Strategist, AMICAL Consortium
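
For anyone finding this thread later: one possible way to approach the original question without httrack is to log in with a cookie-aware HTTP session and save the selected pages yourself. The following is only a rough sketch using the Python standard library, not something anyone in the thread has tested against a real site. It assumes the site uses a plain HTML form login that sets a session cookie. All function names, and the URLs and form field names in the usage note, are hypothetical placeholders.

```python
import http.cookiejar
import urllib.parse
import urllib.request
from pathlib import Path


def make_session():
    """Return an opener that remembers cookies across requests."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def log_in(opener, login_url, fields):
    """POST the login form; any session cookie is kept by the opener."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    with opener.open(login_url, data=data) as response:
        return response.status


def filename_for(url):
    """Map a page URL to a flat, filesystem-safe file name."""
    parts = urllib.parse.urlsplit(url)
    path = parts.path.strip("/") or "index"
    name = (parts.netloc + "_" + path).replace("/", "_")
    return name if name.endswith(".html") else name + ".html"


def archive_pages(opener, urls, out_dir):
    """Fetch each URL with the logged-in session and save the raw HTML."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        with opener.open(url) as response:
            target = out / filename_for(url)
            target.write_bytes(response.read())
            saved.append(target)
    return saved
```

Usage would look roughly like this: call `opener = make_session()`, then `log_in(opener, "https://example.org/login", {"username": "...", "password": "..."})`, then `archive_pages(opener, [list of page URLs], "archive")`. The form field names depend on the site's login form, so check its HTML first. This saves only the HTML of each page, not images or stylesheets, and it will likely not work for logins that rely on JavaScript or hidden form tokens.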