Calculating bandwidth from a combined-format web server log

Given a combined-format web server access log, such as those generated by Apache, it can be useful to know the total amount of data transferred across all requests in that log. The task is simple: extract the field listing the number of bytes sent for each request, and add the values up. For something so simple, there is an odd lack of examples or pre-made scripts that do this. Or, at least, I couldn’t find any.

I wrote my solution, calculate-data-transfer.py, in Python:

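import re
import sys

# Path to the access log, given as the first command-line argument.
fileName = sys.argv[1]

# Capture the bytes-sent field: the second token after the quoted request.
# The first token is the status code; the size may be "-" instead of a number.
compiledExpression = re.compile(r'.*".*" [-0-9]* ([0-9]*)')

totalBytes = 0

with open(fileName) as fpFullLog:
  for line in fpFullLog:
    matches = compiledExpression.match(line)

    if matches is None:
      continue # skip lines that do not look like log entries

    size = matches.group(1)

    if len(size) > 0: # avoid zero-length matches (size logged as "-")
      totalBytes += int(size)

# Report the total in MiB (1 MiB = 2**20 bytes).
print("%.2f MiB" % (totalBytes / 2.0**20))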
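
To see what the regular expression actually captures, here is a small sketch that runs the same pattern against two sample lines. The first line is the combined-format example from the Apache documentation; the second is a made-up variant in which the size is logged as "-", which is what happens when no bytes are sent. The helper name captured_size is hypothetical and exists only for this illustration.

```python
import re

# Same pattern as in the script above.
pattern = re.compile(r'.*".*" [-0-9]* ([0-9]*)')

# Combined-format sample line (taken from the Apache documentation).
with_size = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
             '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
             '"http://www.example.com/start.html" '
             '"Mozilla/4.08 [en] (Win98; I ;Nav)"')

# Made-up line for illustration: the size field is "-" instead of a number.
without_size = ('127.0.0.1 - - [10/Oct/2000:13:55:40 -0700] '
                '"GET /apache_pb.gif HTTP/1.0" 304 - '
                '"-" "Mozilla/4.08 [en] (Win98; I ;Nav)"')


def captured_size(line):
    """Return the captured size string, or None if the line does not match."""
    m = pattern.match(line)
    return None if m is None else m.group(1)


print(repr(captured_size(with_size)))         # '2326'
print(repr(captured_size(without_size)))      # ''
print(repr(captured_size("not a log line")))  # None
```

The empty string in the second case is the zero-length match that the script skips before calling int().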

Using the script is simple:

% python calculate-data-transfer.py access.log

The script prints the total data transfer in MiB, that is, in units of 2^20 bytes rather than 10^6 bytes.
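
The difference between the two units is easy to check with a short sketch. The function names to_mib and to_mb and the byte total are assumptions chosen for illustration; they are not part of the script.

```python
def to_mib(num_bytes):
    """Convert bytes to mebibytes (1 MiB = 2**20 bytes)."""
    return num_bytes / 2.0**20


def to_mb(num_bytes):
    """Convert bytes to megabytes (1 MB = 10**6 bytes)."""
    return num_bytes / 10.0**6


total = 5 * 2**20  # 5242880 bytes, a made-up total

print("%.2f MiB" % to_mib(total))  # 5.00 MiB
print("%.2f MB" % to_mb(total))    # 5.24 MB
```

The same byte count reads about 5% higher in MB than in MiB, so it is worth knowing which unit a hosting provider uses when comparing against a bandwidth quota.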
