On May 13, 2014, at 10:16 PM, Stuart Yeates wrote:

> On 05/14/2014 01:39 PM, Joe Hourcle wrote:
>> On May 13, 2014, at 9:04 PM, Stuart Yeates wrote:
>> 
>>> We have been using google analytics since October 2008 and by and large we're pretty happy with it.
>>> 
>>> Recently I noticed that we're getting >100 hits a day from the "Pinterest/0.1 +http://pinterest.com/" bot which I understand is a reasonably reliable indicator of activity from that site. Much of this activity is pure-jpeg, so there is no HTML and no opportunity to execute javascript, so google analytics doesn't see it.
>>> 
>>> pinterest.com is absent from our referrer logs.
>>> 
>>> My main question is whether anyone has an easy tool to report on this kind of use of our collections?
>> 
>> Set your webserver logs to include user agent (I use 'combined' logs), then use:
>> 
>> 	grep Pinterest /path/to/access/logs
>> 
>> You could also use any analytic tools that work directly off of your log files.  It might not have all of the info that the javascript analytics tools pull (window size, extensions installed, etc.), but it'll work for anything, not just HTML files.
> 
> When I visit http://www.pinterest.com/search/pins/?q=nzetc I see a whole lot of our images, but absolutely zero traffic in my log files, because those images are cached by pinterest.

You could also go the opposite route, and deny Pinterest your images, so they can't cache them.

You could use either robots.txt rules or matching rules within Apache to deny their user agent outright.
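
A rough, untested sketch of both options, assuming their crawler honors robots.txt and that "Pinterest" (taken from the user agent string quoted above) is the token it matches on. In robots.txt:

	User-agent: Pinterest
	Disallow: /

And for Apache, assuming 2.4 with mod_setenvif loaded; the directory path and the variable name block_pinterest are placeholders I made up:

	# Flag any request whose User-Agent contains "Pinterest"
	BrowserMatchNoCase "Pinterest" block_pinterest
	<Directory "/path/to/images">
		<RequireAll>
			Require all granted
			Require not env block_pinterest
		</RequireAll>
	</Directory>

On Apache 2.2 the equivalent would use Order/Allow/Deny with "Deny from env=block_pinterest" in place of the Require lines.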

I have no idea whether they'd then link straight to your images (so that you could get useful stats), or just not allow them to be used on their site at all.


-Joe