I'd be very surprised if Google _automatically_ took any notice of
anything in an HTTP header to relax its protection against what it
considers data harvesting, because any HTTP header can be set to
anything. If I wanted to suck Google dry of bib data, I could simply
pretend to be forwarding requests for "real" clients behind a NAT
barrier.
But they may well investigate such cases and configure their traffic
monitoring software for known legitimate proxies.
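
To make that concrete, here is a rough sketch in Python of how a rate
limiter might honour X-Forwarded-For only for proxies it has been told
about. This is purely illustrative and says nothing about what Google
actually does: TRUSTED_PROXIES, rate_limit_key and the addresses used
are all made up for the example, and headers is assumed to be a plain
dict keyed by the exact header name.

```python
import ipaddress

# Hypothetical allow-list of proxies vetted by hand (assumption, not real data).
TRUSTED_PROXIES = {ipaddress.ip_address("192.0.2.10")}


def rate_limit_key(remote_addr, headers):
    """Return the key a rate limiter should count this request against."""
    peer = ipaddress.ip_address(remote_addr)
    if peer not in TRUSTED_PROXIES:
        # Unknown sender: the header could say anything, so ignore it.
        return str(peer)
    forwarded = headers.get("X-Forwarded-For", "")
    # The right-most entry is the one appended by the trusted proxy itself.
    last_hop = forwarded.split(",")[-1].strip()
    try:
        client = ipaddress.ip_address(last_hop)
    except ValueError:
        # Header missing or malformed: fall back to the proxy's own address.
        return str(peer)
    if client.is_private:
        # An address such as 10.x.x.x is only unique behind this proxy,
        # so qualify it with the proxy's address.
        return f"{peer}/{client}"
    return str(client)
```

A request from an unlisted address is always counted against the
connecting IP, whatever headers it sends, so a spoofed X-Forwarded-For
buys a harvester nothing.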
Kent Fitch
On Wed, Mar 19, 2008 at 3:29 AM, Joe Hourcle wrote:
> On Tue, 18 Mar 2008, Jonathan Rochkind wrote:
>
> > Wait, now ALL of your clients' calls are coming from one single IP?
> > Surely that will trigger Google's detectors, if the NAT did. Keep us
> > updated though.
>
> I don't know what Peter's exact implementation is, but they might relax
> the limits when they see an 'X-Forwarded-For' header, or something else to
> suggest it's coming through a proxy. It used to be pretty common when
> writing rate limiting code to use X-Forwarded-For in place of REMOTE_ADDR so
> you didn't accidentally ban groups behind proxies. (Of course, I don't
> know whether the X-Forwarded-For value would be a non-routable address (in
> 10/8) or the NAT IP, so it might still look like 1 IP address behind a
> proxy.)
>
> Also, by using a caching proxy (if the responses are cacheable), the total
> number of requests going to Google might be reduced.
>
> I would assume they'd need to have some consideration for proxies, as I
> remember the days when AOL's proxy servers channeled all requests through
> fewer than a dozen unique IP addresses. (Or at least, those were the only
> ones hitting my servers.)
>
> -Joe
>