Perl and Arrays?

 
Jim Mallett
Honored Contributor

Perl and Arrays?

I'm still stuck on "Hello World" as far as Perl goes; I haven't had any time to read. I was hoping somebody could give me some ideas on this one.
I have an FTP log that I need to do some calculations on. If I wrote this in VB I would read each line for a particular session into an array and then compare the variables. I want to stop using VB though.

The log looks like this (although there are 1000s of records):
01:04:57 10.10.1.12 [3476]USER ICR_USER 331
01:04:57 10.10.1.12 [3476]PASS - 230
01:04:59 10.10.1.12 [3476]sent /2003/01/181/00002568.TIF 226
01:04:59 10.10.1.12 [3476]QUIT - 226

I would want an output file that says something like:
Session : Time (Seconds)
3476 : 2 Seconds

I don't care whether the example is related to this problem. Does anybody have a Perl example of reading lines into arrays and comparing the variables? Can that be done? Not every record will have 4 lines, as some retrievals fail.

Thanks....
Jim
Hindsight is 20/20
4 REPLIES
Jim Mallett
Honored Contributor

Re: Perl and Arrays?

Why do I always forget to change the category? This should be under languages rather than databases.
Hindsight is 20/20
Joerg Hinz
Occasional Advisor

Re: Perl and Arrays?

Hi Jim,

The solution isn't that simple.

You have to write several routines and you need two hashes (see man perldsc).

One hash (the "data hash") into which you read the logfile, keyed by PID (3476 in your example), and one hash (the "status hash") in which you record that a session is finished (look for QUIT - 226 messages).

While you read the logfile, create a new entry in the data hash if the PID is new; if there's already an entry, append to it.

If you find an end tag, create an entry in the status hash.

Then you've got two options:
1) Check the status hash every n lines (e.g. every 1000 lines) for finished sessions.
2) Call the finished-session routine immediately when you find an end-of-session tag.

This routine looks in the status hash for finished sessions and then processes them in the data hash. There you can calculate the time. After that, remove the session from both the data hash and the status hash, so its data does not get mixed up with later sessions that might get the same PID again.

That's it.
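
A minimal sketch of the two-hash approach described above, using option 2 (process a session as soon as its QUIT line is seen). The names to_seconds and session_times are made up for this illustration, the log format is assumed to match the sample in the question, and sessions that never send QUIT are simply never reported.

```perl
use strict;
use warnings;

# Convert "HH:MM:SS" to seconds since midnight.
sub to_seconds {
    my ($h, $m, $s) = split /:/, shift;
    return ($h * 60 + $m) * 60 + $s;
}

# Takes log lines, returns a list of [ $pid, $seconds ] pairs,
# one per session that ended with QUIT.
sub session_times {
    my @lines = @_;
    my (%data, %status, @result);
    for my $line (@lines) {
        my ($stamp, $pid, $cmd) =
            $line =~ /^(\d\d:\d\d:\d\d)\s+\S+\s+\[(\d+)\]\s*(\S+)/
            or next;                         # skip lines that do not parse
        push @{ $data{$pid} }, $stamp;       # data hash: all stamps per PID
        next unless uc($cmd) eq 'QUIT';
        $status{$pid} = 1;                   # status hash: session finished
        for my $done (keys %status) {        # process finished sessions now
            my $first = to_seconds($data{$done}[0]);
            my $last  = to_seconds($data{$done}[-1]);
            $last += 24 * 60 * 60 if $last < $first;   # crossed midnight
            push @result, [ $done, $last - $first ];
            delete $data{$done};             # the PID may be reused later
            delete $status{$done};
        }
    }
    return @result;
}

my @sample = (
    '01:04:57 10.10.1.12 [3476]USER ICR_USER 331',
    '01:04:57 10.10.1.12 [3476]PASS - 230',
    '01:04:59 10.10.1.12 [3476]sent /2003/01/181/00002568.TIF 226',
    '01:04:59 10.10.1.12 [3476]QUIT - 226',
);
printf "%s : %d Seconds\n", @$_ for session_times(@sample);
```

For option 1, the inner loop over the status hash would instead run every n lines and once more at end of file.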

The whole process is a little complicated. If your Perl experience is at the "hello world" level, you'll have to spend some time learning Perl better before you can write such a program.

Regards
Joerg
-- quote?
H.Merijn Brand (procura
Honored Contributor
Solution

Re: Perl and Arrays?

The info can easily be stored in a HoH (hash of hashes = hash of anonymous hashes).

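my %log;
while (<>) {
    # Fields: time stamp, client IP, session id, command, rest of the line
    my ($stamp, $ip, $sid, $key, $info) = m/^([\d:]+)\s+([\d.]+)\s+\[(\d+)\]\s*(\S+)\b\s*(.*)/
        or next;    # skip lines that do not look like log records
    $log{$sid}{$ip}{$key}{$stamp} = $info;
}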

Now all available sessions:

my @sid = sort keys %log;
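
Once the nested hash is filled, it can be walked level by level. The sketch below is only an illustration: report and the inline @lines data are hypothetical names, and the structure is assumed to be the $log{$sid}{$ip}{$key}{$stamp} layout shown above. Note that a plain sort compares strings, so the session ids are sorted with a numeric comparison here.

```perl
use strict;
use warnings;

# Return one "sid ip key stamp info" string per stored record,
# ordered by session id (numerically), then ip, key and stamp.
sub report {
    my ($log) = @_;
    my @out;
    for my $sid (sort { $a <=> $b } keys %$log) {
        for my $ip (sort keys %{ $log->{$sid} }) {
            for my $key (sort keys %{ $log->{$sid}{$ip} }) {
                for my $stamp (sort keys %{ $log->{$sid}{$ip}{$key} }) {
                    my $info = $log->{$sid}{$ip}{$key}{$stamp};
                    push @out, "$sid $ip $key $stamp $info";
                }
            }
        }
    }
    return @out;
}

my %log;
my @lines = (
    '01:04:57 10.10.1.12 [3476]USER ICR_USER 331',
    '01:04:57 10.10.1.12 [3476]PASS - 230',
    '01:04:59 10.10.1.12 [3476]QUIT - 226',
);
for (@lines) {
    my ($stamp, $ip, $sid, $key, $info) =
        m/^([\d:]+)\s+([\d.]+)\s+\[(\d+)\]\s*(\S+)\b\s*(.*)/ or next;
    $log{$sid}{$ip}{$key}{$stamp} = $info;
}
print "$_\n" for report(\%log);
```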

I don't know this kind of log, but maybe a session is uniquely bound to an IP, so the IP does not need its own hash level:

$log{$sid}{$key}{$stamp} = [ $ip, $info ];

or just log them in a hash of lists:

push @{$log{$sid}{$key}}, [ $stamp, $ip, $info];
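
With the hash-of-lists layout, a per-session duration can be derived afterwards from the earliest and latest stored stamp. This is a sketch under two assumptions that the thread does not confirm: a session never crosses midnight, and a session id is not reused within the log. The names secs and durations are invented for the example; minstr and maxstr come from the core List::Util module.

```perl
use strict;
use warnings;
use List::Util qw(minstr maxstr);

# Convert "HH:MM:SS" to seconds since midnight.
sub secs {
    my ($h, $m, $s) = split /:/, shift;
    return ($h * 60 + $m) * 60 + $s;
}

# Given a reference to the hash of lists, return sid => duration in seconds.
sub durations {
    my ($log) = @_;
    my %dur;
    for my $sid (keys %$log) {
        my @stamps;
        for my $records (values %{ $log->{$sid} }) {
            push @stamps, map { $_->[0] } @$records;   # element 0 is the stamp
        }
        # Zero-padded HH:MM:SS strings sort correctly as plain strings.
        $dur{$sid} = secs(maxstr(@stamps)) - secs(minstr(@stamps));
    }
    return %dur;
}

my %log;
for (
    '01:04:57 10.10.1.12 [3476]USER ICR_USER 331',
    '01:04:57 10.10.1.12 [3476]PASS - 230',
    '01:04:59 10.10.1.12 [3476]sent /2003/01/181/00002568.TIF 226',
    '01:04:59 10.10.1.12 [3476]QUIT - 226',
) {
    my ($stamp, $ip, $sid, $key, $info) =
        m/^([\d:]+)\s+([\d.]+)\s+\[(\d+)\]\s*(\S+)\b\s*(.*)/ or next;
    push @{ $log{$sid}{$key} }, [ $stamp, $ip, $info ];
}
my %dur = durations(\%log);
print "$_ : $dur{$_} second(s)\n" for sort { $a <=> $b } keys %dur;
```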

Now, to not store anything but only calculate the times:

my %log;    # session id => time stamp of the first line seen for it
while (<>) {
    my ($stamp, $ip, $sid, $key, $info) = m/^([\d:]+)\s+([\d.]+)\s+\[(\d+)\]\s*(\S+)\b\s*(.*)/
        or next;    # skip lines that do not look like log records
    if ($key =~ /^quit/i) {
        exists $log{$sid} or die "$sid: quit without anything to quit from";
        my ($h, $m, $s) = map { $_ + 0 } split m/:/, $stamp;
        my $end = (60 * $h + $m) * 60 + $s;
        ($h, $m, $s) = map { $_ + 0 } split m/:/, $log{$sid};
        my $start = (60 * $h + $m) * 60 + $s;
        $start > $end and $end += 60 * 60 * 24;    # session crossed midnight
        print "Session: $sid, ", $end - $start, " second(s)\n";
        delete $log{$sid}; # Allow reuse of sid
        next;              # do not re-register the sid with the QUIT stamp
    }
    $log{$sid} ||= $stamp;
}

The above is also safe for interleaved sessions, such as:

01:04:57 10.10.1.12 [3476]USER ICR_USER 331
01:04:57 10.10.1.12 [3476]PASS - 230
01:04:58 10.10.1.12 [3476]sent /2003/01/181/00002568.TIF 226
01:04:58 10.10.1.13 [3477]USER ICR_USER 331
01:04:59 10.10.1.12 [3476]QUIT - 226

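And if you only want to check how long a sid lasts (the example above assumes that a session ends with QUIT and that multiple sessions can interleave, as shown in the modified log snippet), you can measure from the first to the last line of each consecutive run of the same sid: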

my ($psid, $start, $stop) = (-2);

# Print the duration of the run that just ended, then reset the stamps.
sub show
{
    my ($h, $m, $s) = map { $_ + 0 } split m/:/, $start;
    $start = (60 * $h + $m) * 60 + $s;
    ($h, $m, $s) = map { $_ + 0 } split m/:/, $stop;
    $stop = (60 * $h + $m) * 60 + $s;
    $start > $stop and $stop += 60 * 60 * 24;    # run crossed midnight
    print "Session: $psid, ", $stop - $start, " second(s)\n";
    $start = $stop = 0;
} # show

while (<>) {
    my ($stamp, $ip, $sid, $key, $info) = m/^([\d:]+)\s+([\d.]+)\s+\[(\d+)\]\s*(\S+)\b\s*(.*)/
        or next;            # skip lines that do not look like log records
    if ($sid == $psid) {    # same session: only move the end stamp
        $stop = $stamp;
        next;
    }
    $start and show;        # sid changed: report the previous run
    $psid = $sid;           # remember which session we are in now
    $start ||= $stamp;
    $stop = $stamp;
}
$start and show;

Enjoy, have FUN! H.Merijn
Jim Mallett
Honored Contributor

Re: Perl and Arrays?

Thanks Merijn!

I had a little trouble with the output on that 3rd piece of code but the 2nd gave me exactly what I needed anyway. Now I have some performance information I can head into work with on Monday! And I have a good head start to my reading which I'm about to pick back up on now.

Good to have you back!

Jim
Hindsight is 20/20