
## optimization required for this perl script(taking long time to execute) for huge data file.

SOLVED
Occasional Contributor

open INP, "< testupc.txt " or die "testcpnnew.txt: $!";
while ($x=<INP>) {
  chomp($x);
  @y=split(/\|/,$x);
  $value=@y[12];
  $x=@y[5];
  $y=@y[11];
  $z=@y[10];
  if ($value !=0)
  {
    @A=(@A,$x*$value*100);
  }
  if ($value==0)
  {
    @A=(@A,$y*$z*100);
  }
}
close INP;
$i=0;
while ($i <@A)
{
  print $A[$i], "\n";
  $i++;
  if ($A[$i]< 0){$a++}
  if (($A[$i] >=0)&& ($A[$i] <10)) {$b++}
  elsif (($A[$i] >=10)&&($A[$i] < 20)) {$c++}
  elsif (($A[$i]>= 20)&&($A[$i]< 30)) {$d++}
  elsif (($A[$i] >= 30)&&($A[$i]< 40)) {$e++}
  elsif (($A[$i]>= 40)&&($A[$i]< 50)) {$f++}
  elsif ($A[$i]>= 50) {$g++}
}
printf("Transaction amount frequency Less Than 0 = %d\n",$a);
printf("Transaction amount frequency between [0-9] = %d\n",$b);
printf("Transaction amount frequency between [10-19] = %d\n",$c);
printf("Transaction amount frequency between [20-29] = %d\n",$d);
printf("Transaction amount frequency between [30-39] = %d\n",$e);
printf("Transaction amount frequency between [40-49] = %d\n",$f);
printf("Transaction amount frequency between 50 and above = %d\n",$g);

6 REPLIES
Honored Contributor

## Re: optimization required for this perl script(taking long time to execute) for huge data file.

Can you post a few sample lines from testupc.txt?

I suggest a few changes:

a) Don't use elsif. Use separate if statements instead.
b) Replace

$value=@y[12];
$x=@y[5];
$y=@y[11];
$z=@y[10];

with a single array slice:

($value,$x,$y,$z)=@y[12,5,11,10];

hth.
Easy to suggest when don't know about the problem!
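
Suggestion (b) can be checked with a small sketch. The 13-field record below is made up for illustration (the real testupc.txt layout was not posted), and each field simply holds its own index so the result is easy to read. Note that `@y[12]` is a one-element slice; under `use warnings` Perl suggests writing `$y[12]` for a single element.

```perl
use strict;
use warnings;

# Hypothetical record: 13 pipe-separated fields, each holding its own index.
my @y = split /\|/, '0|1|2|3|4|5|6|7|8|9|10|11|12';

# One array slice replaces the four separate assignments.
my ($value, $x, $y, $z) = @y[12, 5, 11, 10];

print "value=$value x=$x y=$y z=$z\n";   # value=12 x=5 y=11 z=10
```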
Trusted Contributor

## Re: optimization required for this perl script(taking long time to execute) for huge data file.

Replace:
> if ($value !=0)
> {
> @A=(@A,$x*$value*100);
> }
> if ($value==0)
> {
> @A=(@A,$y*$z*100);
> }

with
push @A, (($value==0)?$y*$z*100:$x*$value*100);

Also, limit the split:

@y=split(/\|/,$x,14); # index 12 is the last field you need; a limit of 14 keeps it clean and leaves the rest of the line unsplit in $y[13]

Again, some details about the file would be useful.
Divide and Conquer
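
How the LIMIT argument of `split` treats the last field can be seen in a minimal sketch. The 15-field record is hypothetical, since no sample of testupc.txt was posted; the point is only which limit leaves index 12 usable on its own.

```perl
use strict;
use warnings;

# Hypothetical record with 15 pipe-separated fields: f0 .. f14.
my $line = join '|', map { "f$_" } 0 .. 14;

# LIMIT 13: the 13th field (index 12) swallows the rest of the line.
my @limit13 = split /\|/, $line, 13;

# LIMIT 14: index 12 stays clean; the unsplit remainder lands in index 13.
my @limit14 = split /\|/, $line, 14;

print "limit 13: $limit13[12]\n";   # limit 13: f12|f13|f14
print "limit 14: $limit14[12]\n";   # limit 14: f12
```

If the file never has more than 13 fields per line, both limits give the same result.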
Honored Contributor
Solution

## Re: optimization required for this perl script(taking long time to execute) for huge data file.

Try this script.

==============
#
open INP, "testupc.txt " or die "testcpnnew.txt: $!";
while (<INP>) {
  chomp;

  ($x,$z,$y,$value)=(split (/\|/))[5,10,11,12];
  push @A, (($value==0)?$y*$z*100:$x*$value*100);

}

close INP;
my $i=0;
while ($i < @A)
{

  $var=$A[$i];

  if ($var< 0){$a++}
  if (($var >=0)&& ($var <10)) {$b++}
  if (($var >=10)&&($var < 20)) {$c++}
  if (($var>= 20)&&($var< 30)) {$d++}
  if (($var >= 30)&&($var< 40)) {$e++}
  if (($var>= 40)&&($var< 50)) {$f++}
  if ($var>= 50) {$g++}

  print $A[$i++], "\n";

}
printf("Transaction amount frequency Less Than 0 = %d\n",$a);
printf("Transaction amount frequency between [0-9] = %d\n",$b);
printf("Transaction amount frequency between [10-19] = %d\n",$c);
printf("Transaction amount frequency between [20-29] = %d\n",$d);
printf("Transaction amount frequency between [30-39] = %d\n",$e);
printf("Transaction amount frequency between [40-49] = %d\n",$f);
printf("Transaction amount frequency between 50 and above = %d\n",$g);
#

PS: Post a sample of your input and the required output so we can suggest a more suitable script.

hth.
Easy to suggest when don't know about the problem!
Occasional Contributor

## Re: optimization required for this perl script(taking long time to execute) for huge data file.

This code works fine for smaller data, but for huge data I am getting an out-of-memory error. With a data size of 1439261668 I run out of memory. Please check the code so that it works for huge data, and let me know about this issue ASAP.

Honored Contributor

## Re: optimization required for this perl script(taking long time to execute) for huge data file.

You don't need to read the entire file into @A. I haven't tested it, but here is another version that processes the file as it reads it.

HTH

Rod Hills

open INP, "testupc.txt " or die "testcpnnew.txt: $!";
while (<INP>) {
  chomp;
  ($x,$z,$y,$value)=(split (/\|/))[5,10,11,12];
  $var=($value==0)?$y*$z*100:$x*$value*100;
  print $var,"\n";
  if ($var < 0) { $cnt[0]++; }
  else {
    $inx=($var>=0)+($var>=10)+($var>=20)+($var>=30)+($var>=40)+($var>=50);
    $cnt[$inx]++;
  }
}
close INP;
for $rng (0..6) {
  if ($rng < 0) { $msg="Less Than 0";}
  elsif ($rng == 6) { $msg="between 50 and above"; }
  else { $msg=sprintf("between [%2.2d-%2.2d]",10*$rng-10,10*$rng-1); }
  printf("Transaction amount frequency %20s = %d\n",$msg,$cnt[$rng]);
}
There be dragons...
Honored Contributor

## Re: optimization required for this perl script(taking long time to execute) for huge data file.

Correction to one line:
if ($rng < 0) { $msg="Less Than 0";}

should be
if ($rng == 0) { $msg="Less Than 0";}

Rod Hills
There be dragons...
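
Pulling the thread's suggestions together, a minimal sketch of the streaming approach might look like the following. It holds only seven counters in memory, so memory use should not grow with file size. The field positions (5, 10, 11, 12) and the bucket boundaries come from the posts above; the sub names `bucket_index`, `amount_for_line` and `count_buckets` and the four sample records are hypothetical, and the in-memory sample stands in for testupc.txt, whose real layout was never posted. Unlike the scripts above, it does not print each amount.

```perl
use strict;
use warnings;

# Bucket index: 0 = below 0, 1 = [0,10), 2 = [10,20), ... 5 = [40,50), 6 = 50 and above.
sub bucket_index {
    my ($amount) = @_;
    my $inx = 0;
    for my $lower (0, 10, 20, 30, 40, 50) {
        $inx++ if $amount >= $lower;
    }
    return $inx;
}

# Amount for one pipe-delimited record, using the same formula as the thread.
sub amount_for_line {
    my ($line) = @_;
    my ($x, $z, $y, $value) = (split /\|/, $line)[5, 10, 11, 12];
    return $value == 0 ? $y * $z * 100 : $x * $value * 100;
}

# Read one record at a time from a filehandle; only the counters are kept.
sub count_buckets {
    my ($fh) = @_;
    my @cnt = (0) x 7;
    while (my $line = <$fh>) {
        chomp $line;
        $cnt[ bucket_index(amount_for_line($line)) ]++;
    }
    return @cnt;
}

# Made-up sample records (13 fields each), read through an in-memory filehandle.
my $sample = <<'END';
a|b|c|d|e|0.5|g|h|i|j|0|0|2
a|b|c|d|e|0|g|h|i|j|0.5|0.25|0
a|b|c|d|e|-1|g|h|i|j|0|0|0.25
a|b|c|d|e|0.125|g|h|i|j|0|0|0.5
END

open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my @cnt = count_buckets($fh);
close $fh;

my @labels = (
    'Less Than 0',     'between [0-9]',   'between [10-19]', 'between [20-29]',
    'between [30-39]', 'between [40-49]', 'between 50 and above',
);
printf "Transaction amount frequency %-20s = %d\n", $labels[$_], $cnt[$_] for 0 .. 6;
```

To run it against the real file, replace the in-memory open with something like `open my $fh, '<', 'testupc.txt' or die "testupc.txt: $!";`.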