Re: perl, sed, awk.. date format translation.

 
Arunvijai_4
Honored Contributor


Tim,
I ran some performance tests with the data you attached.

Environment: HP-UX 11.11 (dual 800 MHz), shell: /sbin/sh, Perl: v5.8.0
The "time" command was used to measure execution time.

Perl
====
real 2.9
user 1.6
sys 1.3

Cut
===
real 4.7
user 1.5
sys 3.4

Sed
===
real 0.9
user 0.3
sys 0.6

AWK
===
real 1.0
user 0.4
sys 0.7

"A ship in the harbor is safe, but that is not what ships are built for"
Tim D Fulford
Honored Contributor

Re: perl, sed, awk.. date format translation.

JRF... I thought you had bowed out of the forums!!! Nice to see you back, I hope all is well

Tim
-
James R. Ferguson
Acclaimed Contributor

Re: perl, sed, awk.. date format translation.

Hi Tim:

Thank you for the kind words! Just a sabbatical.

Warmest Regards!

...JRF...
Tim D Fulford
Honored Contributor

Re: perl, sed, awk.. date format translation.

My results

model...
9000/800/L1000-36
uname -a...
HP-UX terminal B.11.00 U 9000/800 551706517 unlimited-user license
processors...
Class      I  H/W Path  Driver     S/W State  H/W Type   Description
===================================================================
processor  0  160       processor  CLAIMED    PROCESSOR  Processor
Let's do some tests...
Perl1....

real 0.14
user 0.01
sys 0.01

Perl2....

real 0.19
user 0.00
sys 0.00

Sed...

real 0.02
user 0.01
sys 0.01

Cut...

real 13.24
user 2.85
sys 7.53

Awk...

real 0.04
user 0.03
sys 0.01
-
Tim D Fulford
Honored Contributor

Re: perl, sed, awk.. date format translation.

Oops, the scripts...

OK, the perl scripts are good, they leave the header intact..

The awk, sed and cut scripts unfortunately mangle the header... I can get round this, but the tests don't bother...

The scripts used are attached.
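
One way round the header problem with sed, as a minimal sketch: only touch lines that start with a 12-digit timestamp followed by a pipe. The layout (yyyymmddHHMM|... becoming yyyy/mm/dd|HH:MM|...) and the name convert_dates are assumptions for illustration, not taken from the attached scripts.

```shell
# Hypothetical helper: reformat the timestamp only on data lines.
# Assumed input layout:  yyyymmddHHMM|rest of line
# Assumed output layout: yyyy/mm/dd|HH:MM|rest of line
convert_dates() {
    # The address /^[0-9]\{12\}|/ skips any line (such as a header)
    # that does not begin with 12 digits and a pipe.
    sed '/^[0-9]\{12\}|/s#^\(....\)\(..\)\(..\)\(..\)\(..\)#\1/\2/\3|\4:\5#'
}

# Example: the header line passes through, the data line is converted.
printf '%s\n' 'DATE|VALUE' '200601151230|42' | convert_dates
```

The same kind of guard (a pattern in front of the action) should carry over to awk.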

Tim
-
Tim D Fulford
Honored Contributor

Re: perl, sed, awk.. date format translation.

Arunvijai.. (anyone)

How does my L1000 beat your rp3440 (I suspect)..

With the exception of cut, my system beats yours .... yet it is far slower????

10 points for the best explanation...

Tim
-
James R. Ferguson
Acclaimed Contributor

Re: perl, sed, awk.. date format translation.

Hi Tim:

If you are benchmarking you (ideally) need to run your timings in isolation without other processes competing for shared resources.

You need, too, to run multiple passes against the data to smooth out small-sample anomalies.

For instance, measuring a process that has to read a file will probably yield a longer time the *first* time when there are no cached buffers available.
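
A rough sketch of that advice in shell: one untimed warm-up pass to fill the buffer cache, then several timed passes. The function name bench is made up for illustration, and the sketch assumes a date command that understands +%s (whole seconds), which not every system provides; where it is missing, the loop could be wrapped in "time" instead.

```shell
# Hypothetical helper: bench PASSES COMMAND [ARGS...]
# Runs COMMAND once untimed (warm-up), then PASSES times, and reports
# the total elapsed wall-clock time in whole seconds.
bench() {
    passes=$1
    shift
    "$@" >/dev/null 2>&1            # warm-up pass, not counted
    start=$(date +%s)               # assumption: date supports +%s
    i=0
    while [ "$i" -lt "$passes" ]; do
        "$@" >/dev/null 2>&1
        i=$((i + 1))
    done
    end=$(date +%s)
    echo "$passes passes in $((end - start))s"
}
```

Something like "bench 100 sed -f convert.sed data.txt" (file names hypothetical) would then give a total over 100 passes rather than one noisy sample.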

I'm not so concerned that a particular piece of code runs faster or slower on your machine versus mine as I am that I can improve speed by tweaking the code in the first place. I'm "old-school": I still (when applicable) give consideration to *which* resource I want to use most --- I/O, memory or processor. Maintainability and readability of code, though, are usually more important than squeezing the last bit of performance out.

Regards!

...JRF...
Tim D Fulford
Honored Contributor

Re: perl, sed, awk.. date format translation.

JRF.. in an ideal world I'd agree... but I just thought that an L1000 with 1x 360MHz CPU should be slower than an rp3440 with 2x 800MHz CPUs... even if the rp3440 was doing a "fair" amount of work, I would still think a 30-fold decrease in performance seems a little too much...

Anyway, for those of you with nicer H/W etc. I've supplied the scripts and data... have fun..

Many thanks for all the replies. I've seen lots of new ways of cracking the same nut.. and I hope if someone else is looking at the thread they get some ideas. I'm rarely disappointed at the variety of ideas that get thrown up in the HP forum..

sed and cut are really quite forceful.. but it just goes to show it can be done within a program (as opposed to a code translator). awk & perl are very good, and relatively concise... I'm going to use perl with JRF's symbol modification for code simplicity. (The difference between 0.1 and 0.01s for the files I'm translating is irrelevant.)

Again many thanks for the replies and input... I'll leave this thread open for a few more days (ya never know)

Regards

Tim
-
Hein van den Heuvel
Honored Contributor

Re: perl, sed, awk.. date format translation.

Just to be different, some other perl approaches, all using unpack to dismantle the input.

All assume:

$old = "yyyymmddHHMM|.. other stuff here ...";

----------

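$old = "yyyymmddHHMM|.. other stuff here ...";

# Split into yyyy, mm, dd, HH, MM and the rest, then append the separator
# that follows each field (none is left for the last two pieces).
foreach (unpack("a4a2a2a2a2a*", $old)) {
    $new .= $_ . (qw(/ / | :))[$i++];
}
print "$new\n";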


---------- replacing foreach with a join ----


# join takes one fixed separator, so pair each field with its own via map
$new = join "", map { $_ . (qw(/ / | :))[$i++] } unpack("a4a2a2a2a2a*", $old);
print "$new\n";


---------- replacing array lookup by substr ----

# same idea, with substr picking each separator out of one string
$new = join "", map { $_ . substr("//|:", $i++, 1) } unpack("a4a2a2a2a2a*", $old);
print "$new\n";
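
For comparison, the fixed-width slicing that the unpack template does can be sketched in awk with substr. This is only an illustration: slice_dates is a made-up name, and the field widths and separators are simply copied from the template and separator list above.

```shell
# Hypothetical helper: slice every input line the way unpack("a4a2a2a2a2a*")
# does, then glue the pieces back together with / / | : as separators.
slice_dates() {
    awk '{
        d = substr($0, 1, 4) "/" substr($0, 5, 2) "/" substr($0, 7, 2)
        t = substr($0, 9, 2) ":" substr($0, 11, 2)
        print d "|" t substr($0, 13)    # the rest of the line keeps its leading pipe
    }'
}

# Example with the placeholder string used in the perl snippets.
printf '%s\n' 'yyyymmddHHMM|.. other stuff here ...' | slice_dates
```

Like the perl versions, this converts every line, header included.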



fwiw,
Hein.
Arunvijai_4
Honored Contributor

Re: perl, sed, awk.. date format translation.

Tim, back in the office.. The reasons are:
1) My rp3440 is used as a test machine, and many applications are running and taking quite a bit of memory and CPU..
2) To measure a nearly exact amount of time, we need to use products like NetIQ, where you can set customised counters and collect data..
3) Here is another set of data, from an rx2600 (IA64) on 11.23


AWK
===
real 1.08
user 0.28
sys 0.67

SED
===
real 0.89
user 0.20
sys 0.57

cut
===
real 4.73
user 1.00
sys 3.03

Perl
====
real 0.02
user 0.01
sys 0.01

You can see perl beats everything ..

-Arun
"A ship in the harbor is safe, but that is not what ships are built for"
Muthukumar_5
Honored Contributor

Re: perl, sed, awk.. date format translation.

Check the resources used when running perl / cut / awk / sed. Perl will use more CPU and memory. Use basic facilities such as a while loop + cut to avoid using more system resources.

If resources are not a concern, then use more advanced tools like perl / awk / sed.
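
For reference, one plausible shape of a "while loop + cut" script, as a sketch only: the attached scripts are not reproduced in this thread, so the structure below and the name convert_with_cut are assumptions. Note that it starts six cut processes for every input line, which would be consistent with the large sys times reported for cut earlier in the thread.

```shell
# Hypothetical helper: read lines from stdin and rebuild each one from
# fixed character ranges. Assumed input layout: yyyymmddHHMM|rest
convert_with_cut() {
    while IFS= read -r line; do
        # Each command substitution below forks a separate cut process.
        yyyy=$(printf '%s\n' "$line" | cut -c1-4)
        mm=$(printf '%s\n' "$line" | cut -c5-6)
        dd=$(printf '%s\n' "$line" | cut -c7-8)
        HH=$(printf '%s\n' "$line" | cut -c9-10)
        MM=$(printf '%s\n' "$line" | cut -c11-12)
        rest=$(printf '%s\n' "$line" | cut -c13-)   # starts with the pipe
        printf '%s/%s/%s|%s:%s%s\n' "$yyyy" "$mm" "$dd" "$HH" "$MM" "$rest"
    done
}

printf '%s\n' '200601151230|42' | convert_with_cut
```

Each individual cut is small, but the per-line process creation adds up on a large file.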

hth.
Easy to suggest when don't know about the problem!
Tim D Fulford
Honored Contributor

Re: perl, sed, awk.. date format translation.

Good sets of results Arun.. I wondered if your rp3440 server was running lots of stuff. It must have been very busy to have posted some of the times it did, and that probably goes to support Muthukumar's point about cut..

Muthukumar.. you may well be right about the resource utilisation... but it is analogous to going to work... you can cycle to work (cut) or drive (perl). If work is only a short distance away, cycling is great and sufficient... if it is a long way away you have little choice, no matter what the efficiency savings are.... until the overloaded traffic becomes so slow that it then warrants going back to the bike!!! The system I use is not heavily used by many people, so perl is great as it is the fastest.

Regards

Tim

-
Arunvijai_4
Honored Contributor

Re: perl, sed, awk.. date format translation.

Yes Tim, it's better to go with Perl, since it is able to run multi-threaded on HP-UX for large amounts of data.

Are you still going to keep this thread open ? ;-)

-Arun
"A ship in the harbor is safe, but that is not what ships are built for"
Tim D Fulford
Honored Contributor

Re: perl, sed, awk.. date format translation.

Many thanks... case closed & buckets of points all round..

Tim
-