I don't think I do anything sophisticated like X-Forwarded-For. I just have a ProxyPass directive in the Apache configuration telling it to reverse proxy a directory to Google:
ProxyPass /googlebooks http://books.google.com/books
But what if Google did something with an X-Forwarded-For header? It still cannot see where the actual user is located. Behind a NAT, addresses in the 10.0.0.0 range are usually used. In fact it is irrelevant which IP addresses are used behind the NAT: since they are not exposed to the outside world, all that matters is that they are unique within the network behind the NAT.
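To make the point about NAT addresses concrete, here is a minimal Python sketch using the standard `ipaddress` module. The helper name `is_private_address` and the sample addresses are made up for illustration; this only shows that an address such as 10.0.0.17 is flagged as private, so it tells an outside server nothing about where the user really is.

```python
import ipaddress


def is_private_address(addr: str) -> bool:
    """Return True if addr lies in a private range such as 10.0.0.0/8.

    Hypothetical helper, not part of any setup described in this thread.
    """
    return ipaddress.ip_address(addr).is_private


if __name__ == "__main__":
    # Sample addresses chosen for illustration only.
    for addr in ("10.0.0.17", "192.168.1.5", "8.8.8.8"):
        print(addr, is_private_address(addr))
```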
Anyway, since we only hit Google Books from the server when a user asks for the display of a full record, I hardly expect that this will trigger Google's detectors. I suspect that the few thousand PCs on the university campus hitting Google cause the problem, to which Google Books in particular reacts. (I can still search Google when Google Books rejects access from my IP address.)
I'll keep you informed.
Drs. P.J.C. van Boheemen
Hoofd Applicatieontwikkeling en beheer - Bibliotheek Wageningen UR
Head of Application Development and Management - Wageningen University and Research Library
tel. +31 317 48 25 17 http://library.wur.nl
From: Code for Libraries on behalf of Jonathan Rochkind
Sent: Tue 18-3-2008 18:48
To: [log in to unmask]
Subject: Re: [CODE4LIB] Restricted access fo free covers from Google :)
Nice. X-Forwarded-For would also allow Google to deliver availability
information suitable for the actual location of the end user, if their
software chooses to pay attention to it. That is the objection to
server-side API requests voiced to me by a Google person. (By proxying
everything through the server, you are essentially doing what I wanted
to do in the first place but Google told me they would not allow. Ironic
if you have more luck with that than with the actual client-side AJAXy
requests that Google said they required!)
Thanks for alerting us to X-Forwarded-For, that's a good idea.
Joe Hourcle wrote:
> On Tue, 18 Mar 2008, Jonathan Rochkind wrote:
>> Wait, now ALL of your clients' calls are coming from one single IP?
>> Surely that will trigger Google's detectors, if the NAT did. Keep us
>> updated though.
> I don't know what Peter's exact implementation is, but they might relax
> the limits when they see an 'X-Forwarded-For' header, or something else
> to suggest it's coming through a proxy. It used to be pretty common when
> writing rate limiting code to use X-Forwarded-For in place of HTTP_ADDR
> so you didn't accidentally ban groups behind proxies. (of course, I don't
> know if the X-Forwarded-For value is something that's not routable (in
> 10/8), or the NAT IP, so it might still look like 1 IP address behind a
> Also, by using a caching proxy (if the responses are cachable), the total
> number of requests going to Google might be reduced.
> I would assume they'd need to have some consideration for proxies, as I
> remember the days when AOL's proxy servers channeled all requests through
> fewer than a dozen unique IP addresses. (or at least, those were the only
> ones hitting my servers)
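The rate-limiting idea Joe describes could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the function name `client_key`, the plain dictionary of headers, and the rule "take the first globally routable address in X-Forwarded-For, otherwise fall back to the connecting address" are all made up here, and nothing is known about how Google actually does this. Note also that the header is set by the client side and can be forged.

```python
import ipaddress
from typing import Mapping


def client_key(remote_addr: str, headers: Mapping[str, str]) -> str:
    """Choose the address that requests are counted against.

    Hypothetical policy: use the first globally routable address listed
    in X-Forwarded-For; if there is none (header missing, unparsable, or
    only private addresses such as 10.x.x.x), use the connecting address.
    """
    forwarded = headers.get("X-Forwarded-For", "")
    for part in forwarded.split(","):
        candidate = part.strip()
        if not candidate:
            continue
        try:
            ip = ipaddress.ip_address(candidate)
        except ValueError:
            # Not a valid IP address; ignore this entry.
            continue
        if ip.is_global:
            return str(ip)
    return remote_addr


if __name__ == "__main__":
    # A private address in the header does not help: the key stays the
    # proxy or NAT address, as Joe suspects.
    print(client_key("8.8.8.8", {"X-Forwarded-For": "10.0.0.5"}))
    # A routable address in the header is used instead.
    print(client_key("8.8.8.8", {"X-Forwarded-For": "1.1.1.1, 10.0.0.5"}))
```

With a policy like this, thousands of users behind one proxy only get separate counters if the proxy forwards routable addresses, which matches the caveat in the quoted message.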
Digital Services Software Engineer
The Sheridan Libraries
Johns Hopkins University
rochkind (at) jhu.edu