I am trying to count downloads by page using Apache weblogs.
I have noticed that partial requests require special
treatment. If I use Chrome to fetch

    http://www.nber.org/papers/w12345.pdf

the log shows three lines. The first has return code 200 for
the entire document. The next shows return code 206 for the
first 32768 bytes and the last again shows return code 206
for the remaining bytes. Note that I make only one request from
the client computer; Chrome is doing the expansion. Firefox
does something similar. I don't know about other browsers.

If I count only log lines with return code 206 requesting the
initial bytes I will miss browsers that ask for the full
document in one request. If I add log lines with a return
code of 200, then I will double count requests like the one
described above. Can I just count all 200 responses, on the
assumption that a request for the full document is always
made? That seems unlikely. Can I use the change in the
referrer field? That seems unreliable. Is there a solution
that doesn't involve correlating actions across log lines?
That would be a lot more work.

For what it is worth, here are the 3 log lines described
above:

    dhcp-7-76.nber.org - - [05/Oct/2015:16:45:10 -0400] "GET
    /papers/w12345.pdf HTTP/1.1" 200 1952008 "-" "Mozilla/5.0
    (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko)
    Chrome/45.0.2454.101 Safari/537.36" "-"

    198.71.7.76 - - [05/Oct/2015:16:45:12 -0400] "GET
    /papers/w12345.pdf HTTP/1.1" 206 32768
    "http://www.nber.org/papers/w12345.pdf" "Mozilla/5.0
    (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko)
    Chrome/45.0.2454.101 Safari/537.36" "bytes 0-32767/1952008"

    198.71.7.76 - - [05/Oct/2015:16:45:12 -0400] "GET
    /papers/w12345.pdf HTTP/1.1" 206 1919240
    "http://www.nber.org/papers/w12345.pdf" "Mozilla/5.0
    (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko)
    Chrome/45.0.2454.101 Safari/537.36" "bytes
    32768-1952007/1952008"
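
To make the double counting concrete, here is a minimal Python sketch
that classifies each log line by itself, without correlating lines. It
assumes the layout seen in the three samples (combined log format plus
one trailing quoted field holding the Content-Range value); the names
LOG_RE, RANGE_RE and classify are made up for illustration, and the
sample lines are the ones above with the line wrapping removed.

```python
import re
from collections import Counter

# Assumption: combined log format plus one trailing quoted field that
# holds the Content-Range value, as in the three sample lines.
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"\S+ (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?:\d+|-) '
    r'"[^"]*" "[^"]*" "(?P<crange>[^"]*)"$'
)
RANGE_RE = re.compile(r'^bytes (\d+)-(\d+)/(?:\d+|\*)$')


def classify(line):
    """Return (path, kind) for one log line, or None if it does not parse.

    kind is "full" (200), "initial" (206 starting at byte 0),
    "continuation" (206 starting later), or "other".
    """
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    path, status = m.group("path"), m.group("status")
    kind = "other"
    if status == "200":
        kind = "full"
    elif status == "206":
        r = RANGE_RE.match(m.group("crange"))
        if r is not None:
            kind = "initial" if int(r.group(1)) == 0 else "continuation"
    return path, kind


# The three sample lines, unwrapped.
UA = ("Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) "
      "Chrome/45.0.2454.101 Safari/537.36")
URL = "http://www.nber.org/papers/w12345.pdf"
REQ = '"GET /papers/w12345.pdf HTTP/1.1"'
SAMPLE = [
    f'dhcp-7-76.nber.org - - [05/Oct/2015:16:45:10 -0400] {REQ} '
    f'200 1952008 "-" "{UA}" "-"',
    f'198.71.7.76 - - [05/Oct/2015:16:45:12 -0400] {REQ} '
    f'206 32768 "{URL}" "{UA}" "bytes 0-32767/1952008"',
    f'198.71.7.76 - - [05/Oct/2015:16:45:12 -0400] {REQ} '
    f'206 1919240 "{URL}" "{UA}" "bytes 32768-1952007/1952008"',
]

kinds = Counter(classify(line)[1] for line in SAMPLE)
only_200 = kinds["full"]             # rule: count 200 responses only
only_initial_206 = kinds["initial"]  # rule: count 206 from byte 0 only
both = only_200 + only_initial_206   # rule: count both kinds
print(only_200, only_initial_206, both)  # prints: 1 1 2
```

For this single click, either rule alone gives 1 and the combined rule
gives 2, which is the double count described above. The sketch only
shows how the per-line rules behave on these three lines; it says
nothing about what other browsers send.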

As an alternative to an answer, is there open source software capable
of creating COUNTER-compatible logs from Apache logs?

Thanks
Daniel Feenberg
NBER