[Omaha.pm] Another 10m ad-hoc report

To: "Perl Mongers of Omaha, Nebraska USA" <omaha-pm@pm.org>
Subject: [Omaha.pm] Another 10m ad-hoc report
From: "Jay Hannah" <jhannah@omnihotels.com>
Date: Fri, 14 Jul 2006 09:29:40 -0500

I love that it takes longer to explain what I'm doing and why than to
actually do it in Perl. :)

The Swiss army chainsaw of text processing, baby. :)

j


Project:

Given a file that looks like this:

2006-07-14 09:12:59|97036502|NYCBER|GNRSPE|1170245141
2006-07-14 09:12:59|97036503|CRPBFT|GNRSPE|1450000001
   CRPBFT|GNRSPE|1450000001|L||2007173547||DMC|2006-07-14 09:17:08.27300|0|0|PROCRPBFTACT-2007173547ITN-6COD-12PMFRD-20060714000000TOD-20060716000000AMT-0STA-A

1) Ignore all lines that don't start with "2006"
2) Ignore all lines that don't contain "GRMSTR"
3) In the remaining lines:
   Column 2 (counting from 0) is "prop".
   Column 4 (counting from 0) is "message_grp".
   Per prop, tell me the number of lines, and the number of unique
message_grp's.
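
A quick way to check the 0-based field positions is to split one sample
line and print each index. This is only a sketch using the first sample
line above. In it, the property code lands at index 2 and the last field
at index 4, which are the positions j.pl reads as $l[2] and $l[4].

```perl
use strict;
use warnings;

# First sample line from above, split on the pipe character.
my $line = '2006-07-14 09:12:59|97036502|NYCBER|GNRSPE|1170245141';
my @l = split /\|/, $line;

# Print each field next to its 0-based index.
print "$_: $l[$_]\n" for 0 .. $#l;
```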


Solution:

$ cat j.pl

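use strict;
use warnings;

my %count;
while (<>) {
   next unless /^2006/;      # only lines that start with a 2006 timestamp
   next unless /GRMSTR/;     # only GRMSTR records
   chomp;
   my @l = split /\|/;       # $l[2] is prop, $l[4] is message_grp
   $count{$l[2]}{keys}{$l[4]} = 1;   # hash keys collapse duplicates
   $count{$l[2]}{lines}++;
}

foreach my $prop (sort keys %count) {
   my $lines = $count{$prop}{lines};
   my $keys  = scalar(keys %{$count{$prop}{keys}});
   print "$prop sent $lines GRMSTR records containing $keys unique message_grp's\n";
}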


Result:

$ cat libqumv.log | perl j.pl
ATLCNN sent 37 GRMSTR records containing 37 unique message_grp's
AUSCTR sent 28 GRMSTR records containing 28 unique message_grp's
...etc...
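
In the output above the two counts happen to match for every prop. To
see them differ, here is a small self-contained sketch of the same
counting logic run over a handful of made-up rows. The props AAAAAA and
BBBBBB, the message_grp values, and the summarize() sub are all
hypothetical and not from the real log; the sub just wraps the j.pl loop
so it can take a list instead of a file.

```perl
use strict;
use warnings;

# Made-up rows in the same pipe-delimited layout (not real log data).
my @sample = (
    '2006-07-14 09:12:59|1|AAAAAA|GRMSTR|100',
    '2006-07-14 09:13:00|2|AAAAAA|GRMSTR|100',   # repeats message_grp 100
    '2006-07-14 09:13:01|3|BBBBBB|GRMSTR|200',
    '2006-07-14 09:13:02|4|BBBBBB|GNRSPE|300',   # not GRMSTR, skipped
    '   BBBBBB|GRMSTR|200|L',                    # no leading 2006, skipped
);

# Same counting logic as j.pl, wrapped in a sub so it can run on a list.
sub summarize {
    my %count;
    for (@_) {
        next unless /^2006/;
        next unless /GRMSTR/;
        my @l = split /\|/;
        $count{$l[2]}{keys}{$l[4]} = 1;   # one hash key per distinct message_grp
        $count{$l[2]}{lines}++;
    }
    return \%count;
}

my $count = summarize(@sample);
for my $prop (sort keys %$count) {
    my $lines = $count->{$prop}{lines};
    my $keys  = scalar keys %{ $count->{$prop}{keys} };
    print "$prop: $lines lines, $keys unique\n";
}
```

With these rows AAAAAA has two lines but only one unique message_grp,
because both lines carry the same value in column 4.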

Follow-Ups:
- Re: [Omaha.pm] Another 10m ad-hoc report
  - From: Andy Lester <andy@petdance.com>
