Optimization required for this Perl script (taking a long time to execute) on a huge data file

SOLVED
kiran1977
Occasional Contributor

open INP, "< testupc.txt" or die "testupc.txt: $!";
while ($x=<INP>) {
chomp($x);
@y=split(/\|/,$x);
$value=@y[12];
$x=@y[5];
$y=@y[11];
$z=@y[10];
if ($value !=0)
{
@A=(@A,$x*$value*100);
}
if ($value==0)
{
@A=(@A,$y*$z*100);
}
}
close INP;
$i=0;
while ($i <@A)
{
print $A[$i], "\n";
$i++;
if ($A[$i]< 0){$a++}
if (($A[$i] >=0)&& ($A[$i] <10)) {$b++}
elsif (($A[$i] >=10)&&($A[$i] < 20)) {$c++}
elsif (($A[$i]>= 20)&&($A[$i]< 30)) {$d++}
elsif (($A[$i] >= 30)&&($A[$i]< 40)) {$e++}
elsif (($A[$i]>= 40)&&($A[$i]< 50)) {$f++}
elsif ($A[$i]>= 50) {$g++}
}
printf("Transaction amount frequency Less Than 0 = %d\n",$a);
printf("Transaction amount frequency between [0-9] = %d\n",$b);
printf("Transaction amount frequency between [10-19] = %d\n",$c);
printf("Transaction amount frequency between [20-29] = %d\n",$d);
printf("Transaction amount frequency between [30-39] = %d\n",$e);
printf("Transaction amount frequency between [40-49] = %d\n",$f);
printf("Transaction amount frequency between 50 and above = %d\n",$g);

6 REPLIES
Muthukumar_5
Honored Contributor

Re: optimization required for this perl script(taking long time to execute) for huge data file.

Can you post a few sample lines from testupc.txt?

I can suggest a few things:

a) Don't use elsif; use plain if statements instead.
b) Replace the four separate assignments

$value=@y[12];
$x=@y[5];
$y=@y[11];
$z=@y[10];

with a single array slice:

($value,$x,$y,$z)=@y[12,5,11,10];
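
To illustrate the slice suggestion, here is a minimal sketch. The sample record is made up (the real layout of testupc.txt has not been posted), so treat the field values as assumptions. It also shows a list slice applied directly to the result of split, which skips the intermediate @y array.

```perl
use strict;
use warnings;

# Hypothetical sample record: 13 pipe-separated fields (indices 0..12).
my $line = "a|b|c|d|e|2.5|g|h|i|j|3|4|0";
my @y = split /\|/, $line;

# An array slice pulls several elements out in one assignment.
my ($value, $x, $y, $z) = @y[12, 5, 11, 10];

# A list slice on the split result avoids the intermediate array altogether.
my ($x2, $z2, $y2, $value2) = (split /\|/, $line)[5, 10, 11, 12];

print "$value $x $y $z\n";    # prints "0 2.5 4 3"
```

Note that single elements are written $y[12]; the @y[12] form in the original script is a one-element slice, which works but is flagged when warnings are enabled.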

hth.
Easy to suggest when don't know about the problem!
Derek Whigham_1
Trusted Contributor

Re: optimization required for this perl script(taking long time to execute) for huge data file.

Replace:
> if ($value !=0)
> {
> @A=(@A,$x*$value*100);
> }
> if ($value==0)
> {
> @A=(@A,$y*$z*100);
> }

with

push @A, (($value==0)?$y*$z*100:$x*$value*100);

push appends to @A in place, whereas @A=(@A,...) copies the whole array for every line read.

Also use

@y=split(/\|/,$x,14); # split into at most 14 parts: fields 0..12 come out clean and the rest of the line stays unsplit in the last part

Again, some details about the file would be useful.
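
As a rough illustration of how the limit argument to split behaves, here is a small sketch on a made-up 16-field record (the field count is an assumption, since the real file layout is unknown). With a limit of N, the last of the N parts holds the unsplit remainder, so a limit of 14 is what leaves index 12 as a clean value.

```perl
use strict;
use warnings;

# Hypothetical record with 16 fields, each holding its own index.
my $line = join '|', 0 .. 15;

# With a limit of N, split returns at most N parts; the last part keeps
# the unsplit remainder of the line, separators included.
my @limit13 = split /\|/, $line, 13;
my @limit14 = split /\|/, $line, 14;

print scalar(@limit13), " parts, field 12 = '$limit13[12]'\n";   # 13 parts, '12|13|14|15'
print scalar(@limit14), " parts, field 12 = '$limit14[12]'\n";   # 14 parts, '12'

# push appends in place; @A = (@A, $v) rebuilds the whole list each time.
my @A;
push @A, $_ * 100 for 1 .. 3;
print "@A\n";                                                     # 100 200 300
```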
Divide and Conquer
Muthukumar_5
Honored Contributor
Solution

Re: optimization required for this perl script(taking long time to execute) for huge data file.

Try this script.

==============
#
open INP, "testupc.txt" or die "testupc.txt: $!";
while (<INP>) {
chomp;

($x,$z,$y,$value)=(split (/\|/))[5,10,11,12];
push @A, (($value==0)?$y*$z*100:$x*$value*100);

}

close INP;
my $i=0;
while ($i < @A)
{

$var=$A[$i];

if ($var< 0){$a++}
if (($var >=0)&& ($var <10)) {$b++}
if (($var >=10)&&($var < 20)) {$c++}
if (($var>= 20)&&($var< 30)) {$d++}
if (($var >= 30)&&($var< 40)) {$e++}
if (($var>= 40)&&($var< 50)) {$f++}
if ($var>= 50) {$g++}

print $A[$i++], "\n";

}
printf("Transaction amount frequency Less Than 0 = %d\n",$a);
printf("Transaction amount frequency between [0-9] = %d\n",$b);
printf("Transaction amount frequency between [10-19] = %d\n",$c);
printf("Transaction amount frequency between [20-29] = %d\n",$d);
printf("Transaction amount frequency between [30-39] = %d\n",$e);
printf("Transaction amount frequency between [40-49] = %d\n",$f);
printf("Transaction amount frequency between 50 and above = %d\n",$g);
#

PS: Post your input and the required output so that we can suggest a more suitable script.

hth.
Easy to suggest when don't know about the problem!
kiran1977
Occasional Contributor

Re: optimization required for this perl script(taking long time to execute) for huge data file.

This code works fine for smaller data, but for HUGE data I am getting an out-of-memory error. With a data size of 1439261668 it runs out of memory. Please check the code so that it works for huge data, and let me know about this issue ASAP.
Many thanks in advance,

Rodney Hills
Honored Contributor

Re: optimization required for this perl script(taking long time to execute) for huge data file.

You don't need to read the entire file into @A. I haven't tested it, but here is another version that processes the file as it reads it.

HTH

Rod Hills

open INP, "testupc.txt" or die "testupc.txt: $!";
while (<INP>) {
chomp;
($x,$z,$y,$value)=(split (/\|/))[5,10,11,12];
$var=($value==0)?$y*$z*100:$x*$value*100;
print $var,"\n";
if ($var < 0) { $cnt[0]++; }
else {
$inx=($var>=0)+($var>=10)+($var>=20)+($var>=30)+($var>=40)+($var>=50);
$cnt[$inx]++;
}
}
close INP;
for $rng (0..6) {
if ($rng < 0) { $msg="Less Than 0";}
elsif ($rng == 6) { $msg="between 50 and above"; }
else { $msg=sprintf("between [%2.2d-%2.2d]",10*$rng-10,10*$rng-1); }
printf("Transaction amount frequency %20s = %d\n",$msg,$cnt[$rng]);
}
There be dragons...
Rodney Hills
Honored Contributor

Re: optimization required for this perl script(taking long time to execute) for huge data file.

Correction to one line:
if ($rng < 0) { $msg="Less Than 0";}

should be
if ($rng == 0) { $msg="Less Than 0";}
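
For reference, here is a minimal sketch that combines the streaming approach with this correction. It is not the original poster's script: the helper names bucket and count_amounts are hypothetical, the sample records are made up, and an in-memory filehandle stands in for testupc.txt. The point is only that each line is counted and then discarded, so memory use does not grow with the file.

```perl
use strict;
use warnings;

# Map an amount to a bucket: 0 = negative, 1..5 = [0-9]..[40-49], 6 = 50 and above.
sub bucket {
    my ($var) = @_;
    return 0 if $var < 0;
    my $inx = ($var >= 0) + ($var >= 10) + ($var >= 20)
            + ($var >= 30) + ($var >= 40) + ($var >= 50);
    return $inx;
}

# Count amounts from a filehandle one line at a time; nothing is stored per line.
sub count_amounts {
    my ($fh) = @_;
    my @cnt = (0) x 7;
    while (my $line = <$fh>) {
        chomp $line;
        my ($x, $z, $y, $value) = (split /\|/, $line)[5, 10, 11, 12];
        my $var = ($value == 0) ? $y * $z * 100 : $x * $value * 100;
        $cnt[ bucket($var) ]++;
    }
    return @cnt;
}

my @labels = ('Less Than 0',
              (map { sprintf 'between [%d-%d]', 10 * $_ - 10, 10 * $_ - 1 } 1 .. 5),
              'between 50 and above');

# Made-up records for the demo; the real script would open testupc.txt instead.
my $sample = <<'END';
a|b|c|d|e|0.0625|g|h|i|j|0|0|1
a|b|c|d|e|0.25|g|h|i|j|0|0|1
a|b|c|d|e|0|g|h|i|j|0.5|1|0
a|b|c|d|e|-1|g|h|i|j|0|0|1
END
open my $fh, '<', \$sample or die "in-memory open: $!";
my @cnt = count_amounts($fh);
close $fh;

printf "Transaction amount frequency %s = %d\n", $labels[$_], $cnt[$_] for 0 .. 6;
```

With these four sample records the counts come out as one each for "Less Than 0", [0-9], [20-29] and "50 and above", and zero for the other ranges.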

Rod Hills
There be dragons...