
On May 13, 2014, at 9:04 PM, Stuart Yeates wrote:

> We have been using google analytics since October 2008 and by and large we're pretty happy with it.
> 
> Recently I noticed that we're getting >100 hits a day from the "Pinterest/0.1 +http://pinterest.com/" bot which I understand is a reasonably reliable indicator of activity from that site. Much of this activity is pure-jpeg, so there is no HTML and no opportunity to execute javascript, so google analytics doesn't see it.
> 
> pinterest.com is absent from our referrer logs.
> 
> My main question is whether anyone has an easy tool to report on this kind of use of our collections?

Set your webserver logs to include user agent (I use 'combined' logs), then use:

	grep Pinterest /path/to/access/logs

You could also use any analytics tool that works directly from your log files.  It won't have all of the info that the JavaScript analytics tools pull (window size, extensions installed, etc.), but it'll cover every request the server logs, not just HTML pages.
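
If you want a bit more than grep gives you, here is a minimal sketch in Python that tallies requests per URL for one user agent. It assumes Apache's standard 'combined' log format; the function name count_hits and the regex are my own, so adjust them if your LogFormat differs.

```python
import re
from collections import Counter

# Matches one line of Apache 'combined' format:
#   host ident user [time] "METHOD path PROTO" status bytes "referer" "agent"
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def count_hits(lines, agent_prefix="Pinterest"):
    """Count requests per path whose user agent starts with agent_prefix."""
    hits = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        # Lines that don't parse as 'combined' format are silently skipped.
        if m and m.group("agent").startswith(agent_prefix):
            hits[m.group("path")] += 1
    return hits
```

To use it, pass an open log file, e.g. count_hits(open("/path/to/access/logs")), and call .most_common(20) on the result for the most-requested images. Paths come back exactly as logged, so parens will usually show up percent-encoded (%28, %29).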


> My secondary question is whether any httpd gurus have recipes for redirecting by agent string from low quality images to high quality. So when AGENT =  "Pinterest/0.1 +http://pinterest.com/" and the URL matches a pattern redirect to a different pattern. For example:
> 
> http://nzetc.victoria.ac.nz/etexts/MakOldT/MakOldTP022a%28w100%29.jpg
> 
> to
> 
> http://nzetc.victoria.ac.nz/etexts/MakOldT/MakOldTP022a.jpg


Perfectly possible w/ Apache's mod_rewrite, but you didn't say what http server you're using.

If Apache, you'd do something like:

	RewriteEngine On
	# only apply to requests whose user agent starts with "Pinterest"
	RewriteCond %{HTTP_USER_AGENT} ^Pinterest
	RewriteRule (^/etexts/MakOldT/.*)\(.*\)\.jpg $1.jpg [L]

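You might need to adjust the regex to match your URLs ... I just assumed that the part in parens is what gets stripped from filenames in that directory.  Also note that [L] on its own is an internal rewrite: Pinterest gets the full-size image served at the original URL.  If you want a real HTTP redirect, use [R,L] instead.  The leading slash in the pattern is for server or virtual host config; in a .htaccess file the path is matched without it.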
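
Before putting the rule on a live server, you can sanity-check the regex against a few sample URLs. The sketch below does that in Python; it's only an approximation of what mod_rewrite does (the helper name rewrite is mine), relying on the fact that mod_rewrite matches against the percent-decoded URL path and that this particular pattern means the same thing in Python's re as in Apache's PCRE.

```python
import re
from urllib.parse import unquote, urlsplit

# Pattern copied from the RewriteRule; change it here if you change it there.
PATTERN = re.compile(r'(^/etexts/MakOldT/.*)\(.*\)\.jpg')

def rewrite(url):
    """Return the path the rule would rewrite to, or the decoded path if it doesn't match."""
    # mod_rewrite sees the decoded path, so %28 and %29 become literal parens.
    path = unquote(urlsplit(url).path)
    m = PATTERN.search(path)
    # On a match, the whole path is replaced by the substitution "$1.jpg".
    return m.expand(r'\1.jpg') if m else path
```

With the example from your message, rewrite("http://nzetc.victoria.ac.nz/etexts/MakOldT/MakOldTP022a%28w100%29.jpg") gives "/etexts/MakOldT/MakOldTP022a.jpg", and a URL with no parens comes back unchanged.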