
On Tue, Nov 5, 2013 at 12:07 PM, William Denton <[log in to unmask]> wrote:

>
> (Question:  Why does HTTPS complicate screen-scraping?  Every decent tool
> and library supports HTTPS, doesn't it?)
>

Birkin asked me this same question, and I realized I should clarify what I
meant.  I was mostly referring to existing screen scrapers/existing web
sites.  If you redirect every request from http to https, this will
probably break things.  I think the Open Library example that Karen
mentioned is a good case study.
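
To make the redirect point concrete, here is a minimal sketch, not taken
from any real scraper.  It assumes a hypothetical site that answers every
plain-HTTP request with a 301 to HTTPS, and an older-style client that
issues a single GET and never looks at the status code.  The host
example.org, the path /some/page, and the names RedirectToHttps,
naive_scrape, and demo are all made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectToHttps(BaseHTTPRequestHandler):
    """Hypothetical site that redirects every plain-HTTP request to HTTPS."""

    def do_GET(self):
        self.send_response(301)
        # example.org is a placeholder target, not a real deployment.
        self.send_header("Location", "https://example.org" + self.path)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def naive_scrape(host, port, path):
    """One GET over plain HTTP; http.client does not follow redirects."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location"), resp.read()
    finally:
        conn.close()


def demo():
    """Run the redirecting server locally and scrape it once."""
    server = HTTPServer(("127.0.0.1", 0), RedirectToHttps)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return naive_scrape("127.0.0.1", server.server_port, "/some/page")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, location, body = demo()
    print(status, location, len(body))
```

A scraper written this way gets a 301 and an empty body where it used to
get the page, so whatever parsing comes next finds nothing.  Clients that
follow redirects and can speak HTTPS would likely be fine, which is why the
breakage depends on the specific app rather than on the library.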

And it's one thing for a library or tool to support HTTPS, and quite
another for a specific app to be expecting it.  If you follow the thread
around that OL change, it appears there are issues with Java (as one
example) arbitrarily consuming HTTPS (from what I understand, you need to
have the cert locally?), but I don't know enough about it to say for
certain.  I think there would also be potential issues around mashups
(AJAX, for example), but seeing as code4lib.org doesn't support CORS,
that's not really a current issue.  It does apply more generally to your
question about library websites at large, though.

Anyway, I agree with you that the option for both should be there.  I'm
just not convinced that HTTPS-all-the-time is necessary for all web use
cases.

-Ross.