
[Omaha.pm] IIS server log analysis



From the Why Didn't I Post This Yesterday To Let Someone Else Do My Homework For Me Department

-----------------
PROBLEM
-----------------

Given a directory of .zip files:

ex050220.zip
ex050221.zip
ex050222.zip
ex050223.zip
ex050224.zip
ex050225.zip
ex050226.zip

Containing IIS server logs like this:

# Fields: date time c-ip cs-username s-ip s-port cs-method cs-uri-stem cs-uri-query sc-status sc-bytes cs(User-Agent) cs(Cookie) cs(Referer)
2005-02-20 00:00:00 68.60.191.239 - 198.64.145.249 443 GET /images/header/tnd_sg_07-over.gif - 304 163 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) ASPSESSIONIDSCSSSCRD=OAMAHBPAJHHBGEJEKBALFCOO https://ssl.omnihotels.com/Omni?prop=CHIDTN&pagedst=AvailReq&pagesrc=Hotels
...

Report the total number of bytes transferred per hour on port 80 and on port 443, like so:

Year to hour  Port 80   Port 443
------------- --------- -----------
2005-02-20-00 208867846 31587703
2005-02-20-01 193477261 25950887
2005-02-20-02 210614224 24952027
...

-----------------
SOLUTION
-----------------

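use strict;
use warnings;

for my $day (20 .. 26) {
   # Shooting for: ex050220.log
   my $file = sprintf("ex0502%02d", $day);
   # -q keeps unzip's chatter out of the report
   system("unzip", "-q", "$file.zip") == 0
      or die "unzip $file.zip failed: $?";
   readfile("$file.log");
   unlink("$file.log");
}

sub readfile {
   my ($file) = @_;
   my %stats;   # $stats{"date-hour"}{port} = total bytes
   open(my $in, '<', $file) or die "Can't open $file: $!";
   while (<$in>) {
      next if /^#/;          # skip "# Fields:" and other header lines
      my @l = split / /;
      next unless @l > 10;   # skip short or truncated lines
      # 0 = date, 1 = time, 5 = s-port, 10 = sc-bytes
      (my $hour = $l[1]) =~ s/:.*//;
      $stats{"$l[0]-$hour"}{$l[5]} += $l[10];
   }
   close $in;

   foreach (sort keys %stats) {
      # An hour with no traffic on a port prints 0 instead of a blank
      my ($p80, $p443) = ($stats{$_}{80} || 0, $stats{$_}{443} || 0);
      print "$_ $p80 $p443\n";
   }
}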
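
-----------------
VARIATION (SKETCH)
-----------------

The solution above hardcodes columns 5 and 10, which only holds while the "# Fields:" line matches the sample. A possible variation, offered as a rough sketch rather than a tested replacement, looks the columns up by name from that header and prints the report with the column headings from the problem statement. The names tally_lines and report are made up for this example, and the sample lines at the bottom are invented test data, not real log entries.

```perl
use strict;
use warnings;

# Hypothetical helper: add up sc-bytes per "date-hour" and port,
# finding the columns from the most recent "# Fields:" header line.
sub tally_lines {
   my ($lines, $stats) = @_;
   $stats ||= {};
   my @want = qw(date time s-port sc-bytes);
   my %col;   # field name => column index
   for my $line (@$lines) {
      chomp(my $l = $line);
      if ($l =~ /^#\s*Fields:\s*(.*)/) {
         my @names = split ' ', $1;
         %col = map { $names[$_] => $_ } 0 .. $#names;
         next;
      }
      next if $l =~ /^#/ or !%col;      # other directives, or no header yet
      my @idx = @col{@want};
      next if grep { !defined } @idx;   # header lacks a field we need
      my @f = split / /, $l;
      my ($date, $time, $port, $bytes) = @f[@idx];
      next if grep { !defined } $date, $time, $port, $bytes;
      next unless $bytes =~ /^\d+$/;    # sc-bytes must be a plain number
      my ($hour) = $time =~ /^(\d+):/ or next;
      $stats->{"$date-$hour"}{$port} += $bytes;
   }
   return $stats;
}

# Hypothetical helper: format the totals in the layout from the problem.
sub report {
   my ($stats) = @_;
   my $out = sprintf "%-13s %-9s %s\n", 'Year to hour', 'Port 80', 'Port 443';
   $out .= sprintf "%s %s %s\n", '-' x 13, '-' x 9, '-' x 11;
   for my $key (sort keys %$stats) {
      $out .= sprintf "%-13s %-9s %s\n", $key,
         $stats->{$key}{80} || 0, $stats->{$key}{443} || 0;
   }
   return $out;
}

# Invented sample data, with a shorter field list than the real logs
my @sample = (
   "# Fields: date time s-port sc-bytes\n",
   "2005-02-20 00:00:00 443 163\n",
   "2005-02-20 00:59:59 80 1000\n",
   "2005-02-20 01:00:00 80 250\n",
);
print report(tally_lines(\@sample));
```

Passing the same $stats hash back in as the second argument to tally_lines on each call should let one report cover the whole week instead of printing per file.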