I believe that Apache's ProxyPass _will_ send an X-Forwarded-For header
for you. But you're right that the "forwarded for" IP address will in
your case be an internal-only IP that doesn't mean anything to Google,
if it's there at all. Still, who knows what Google's 'traffic defender'
routines do; maybe the mere presence of an X-Forwarded-For header tells
them not to limit you, even though the forwarded-for IP itself is
meaningless.
Who knows, and Google probably won't say (because they don't want to
give any extra info to people maliciously trying to get around it).
Please do keep us updated on whether this new solution works and prevents
the traffic-limiting defense that you were getting before. If it does, then
the question would be why, but X-Forwarded-For (which I _think_
ProxyPass will send) may indeed be the answer.
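To make the speculation concrete, here is a rough Python sketch of how a rate limiter _might_ choose which address to count requests against. This is purely an illustration: I have no idea whether Google does anything like it, and the function name rate_limit_key and the sample addresses are made up for the example. It uses the first X-Forwarded-For entry only when that entry is a publicly routable address, and otherwise falls back to the connecting IP, which is why a 10.x.x.x value would not help.

```python
import ipaddress
from typing import Optional


def rate_limit_key(remote_addr: str, forwarded_for: Optional[str]) -> str:
    """Pick the address a hypothetical rate limiter would count against."""
    if forwarded_for:
        # The header may carry a comma-separated chain of addresses; the
        # first entry is what the first proxy reported for the client.
        candidate = forwarded_for.split(",")[0].strip()
        try:
            ip = ipaddress.ip_address(candidate)
        except ValueError:
            ip = None  # malformed header value: ignore it
        # A private address such as 10.x.x.x says nothing about where the
        # client really is, so only publicly routable values are used.
        if ip is not None and ip.is_global:
            return str(ip)
    # No usable header: fall back to the address of the connecting host.
    return remote_addr


# An internal NAT address in the header is ignored.
print(rate_limit_key("192.0.2.10", "10.1.2.3"))
# A public address in the header replaces the proxy's own address.
print(rate_limit_key("192.0.2.10", "8.8.8.8, 10.0.0.1"))
```

With a scheme like this, every user behind the proxy would still be counted under the proxy's own address whenever the header only carries internal addresses.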
Jonathan
Boheemen, Peter van wrote:
> I don't think I do anything sophisticated like X-Forwarded-For. I just have a ProxyPass directive in the Apache configuration telling it to reverse proxy a directory to Google:
>
> ProxyPass /googlebooks http://books.google.com/books
>
> But what if Google did something with an X-Forwarded-For header? It cannot see where the actual user is located. Behind a NAT, 10.0.0.0 addresses are usually used. In fact, it is arbitrary which IP addresses are used behind the NAT: since they are not exposed to the outside world, all that matters is that they are unique within the network behind the NAT.
>
> Anyway, since we only hit Google Books from the server when a user asks to display a full record, I hardly expect that to trip Google's triggers. I suspect that the few thousand PCs within the university campus hitting Google cause the problem, which Google Books in particular reacts to. (I can still search Google when Google Books rejects access from my IP address.)
> I'll keep you informed.
>
> Peter
>
>
> Drs. P.J.C. van Boheemen
> Hoofd Applicatieontwikkeling en beheer - Bibliotheek Wageningen UR
> Head of Application Development and Management - Wageningen University and Research Library
> tel. +31 317 48 25 17 http://library.wur.nl
>
> ________________________________
>
> From: Code for Libraries on behalf of Jonathan Rochkind
> Sent: Tue 18-3-2008 18:48
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] Restricted access fo free covers from Google :)
>
>
>
> Nice. X-Forwarded-For would also allow Google to deliver availability
> information suitable for the actual location of the end user, if their
> software chooses to pay attention to it. Not knowing the end user's
> location is the objection to server-side API requests voiced to me by a
> Google person. (By proxying everything through the server, you are
> essentially doing what I wanted to do in the first place but Google told
> me they would not allow. Ironic if you have more luck with that than
> with the actual client-side AJAXy requests that Google said they required!)
>
> Thanks for alerting us to X-Forwarded-For, that's a good idea.
>
> Jonathan
>
> Joe Hourcle wrote:
>
>> On Tue, 18 Mar 2008, Jonathan Rochkind wrote:
>>
>>
>>> Wait, now ALL of your clients' calls are coming from one single IP?
>>> Surely that will trigger Google's detectors, if the NAT did. Keep us
>>> updated though.
>>>
>> I don't know what Peter's exact implementation is, but they might relax
>> the limits when they see an 'X-Forwarded-For' header, or something else
>> to suggest it's coming through a proxy. It used to be pretty common when
>> writing rate limiting code to use X-Forwarded-For in place of REMOTE_ADDR
>> so you didn't accidentally ban groups behind proxies. (Of course, I don't
>> know if the X-Forwarded-For value is something that's not routable (in
>> 10/8), or the NAT IP, so it might still look like 1 IP address behind a
>> proxy.)
>>
>> Also, by using a caching proxy (if the responses are cacheable), the total
>> number of requests going to Google might be reduced.
>>
>> I would assume they'd need to have some consideration for proxies, as I
>> remember the days when AOL's proxy servers channeled all requests through
>> fewer than a dozen unique IP addresses. (or at least, those were the only
>> ones hitting my servers)
>>
>> -Joe
>>
>>
>
> --
> Jonathan Rochkind
> Digital Services Software Engineer
> The Sheridan Libraries
> Johns Hopkins University
> 410.516.8886
> rochkind (at) jhu.edu
>
>
--
Jonathan Rochkind
Digital Services Software Engineer
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu