Script help. Perl perhaps ?
06-25-2006 10:22 PM
I have written the script below to do a simple lookup, but now I need to run it on a much larger data file and it will take an age.
I suspect Perl is the way to go to make it faster, but I don't know any Perl :(
Could someone please help me translate this script, to save my server days of processing?
Thanks in advance.
The script reads each line of a data file, compares certain fields with the same fields in a master file, and on a match outputs the id value from the master file along with certain fields from the data line.
masterfile=/tmp/masterfile
datafile=/tmp/datafile
for b in `cat $datafile`
do
    compare=`echo $b | awk 'BEGIN{FS="|"}{data = $2$5$6;print data}END{}' `
    for a in `cat $masterfile`
    do
        mastercompare=`echo $a | awk 'BEGIN{FS="|"}{line = $2$5$6;print line}END{}'`
        if [ $mastercompare = $compare ]
        then
            id=`echo $a | awk 'BEGIN{FS="|"}{print $1}END{}'`
            output=`echo $b | awk 'BEGIN{FS="|";OFS="|"}{print $3,$7}END{}'`
            echo $id"|"$output >> luke.out
        fi
    done
done
06-25-2006 10:59 PM
My suggestion reads the masterfile and the datafile only once, and keeps the master ids in an array indexed by the compared fields (untested). This should be MUCH faster:
awk -F'|' 'BEGIN { OFS="|";
    while ((getline < "/tmp/masterfile") == 1) id[$2$5$6] = $1 }
{ line=$2$5$6; if(id[line]) print (id[line],$3,$7) }' /tmp/datafile >luke.out
Kind regards, Peter
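For anyone who wants to try the idea on a small case first, here is a minimal sketch of the same single-pass lookup. It is not Peter's exact script: it swaps the getline loop for the common NR == FNR two-file idiom, puts FS between the key parts so that different field triples cannot run together, and takes the file names as arguments. The function name lookup and the sample lines are made up for illustration.

```shell
#!/bin/sh
# lookup MASTER DATA (hypothetical helper): print "id|field3|field7" for
# every DATA line whose fields 2, 5 and 6 match a MASTER line.
lookup() {
    awk -F'|' -v OFS='|' '
        # First file (master): remember the id under the three-field key.
        NR == FNR { id[$2 FS $5 FS $6] = $1; next }
        # Second file (data): print only when the key is known.
        ($2 FS $5 FS $6) in id { print id[$2 FS $5 FS $6], $3, $7 }
    ' "$1" "$2"
}

# Hypothetical sample data, made up for illustration only.
tmp=$(mktemp -d)
printf '%s\n' 'M1|1234|x|x|56|A|x' 'M2|4321|x|x|65|B|x' > "$tmp/master"
printf '%s\n' 'd|1234|foo|x|56|A|bar' 'd|9999|nope|x|00|Z|nope' > "$tmp/data"
lookup "$tmp/master" "$tmp/data"    # prints: M1|foo|bar
rm -rf "$tmp"
```

The "in" test is used instead of if(id[line]) because merely referencing id[line] creates an empty array element for every unmatched key, which can add up on a very large data file. One caveat of the NR == FNR idiom: if the master file is empty, the data file is treated as the master.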
06-25-2006 11:01 PM
To be clean, close the masterfile after reading:
awk -F'|' 'BEGIN { OFS="|"
    while ((getline < "/tmp/masterfile") == 1) id[$2$5$6] = $1
    close ("/tmp/masterfile") }
{ line=$2$5$6; if(id[line]) print (id[line],$3,$7) }' /tmp/datafile >luke.out
Kind regards, Peter
06-25-2006 11:06 PM
--8<---
#!/usr/bin/perl
use strict;
use warnings;

my $masterfile = "/tmp/masterfile";
my $datafile   = "/tmp/datafile";

open my $out, ">", "luke.out" or die "luke.out: $!\n";

# Index the master file by fields 2, 5 and 6 (array indexes 1, 4, 5).
my %master;
open my $mst, "<", $masterfile or die "$masterfile: $!\n";
while (<$mst>) {
    chomp;    # keep the newline out of the last field
    my @mst = split /\|/, $_;
    $master{join "|", @mst[1,4,5]} = [ @mst[0,2,6] ];
}
close $mst;

# One hash lookup per data line.
open my $dta, "<", $datafile or die "$datafile: $!\n";
while (<$dta>) {
    chomp;    # otherwise a trailing field 7 would carry its own newline
    my @dta = split /\|/, $_;
    my $cmp = join "|", @dta[1,4,5];
    exists $master{$cmp} or next;
    my $id = $master{$cmp}[0];
    print $out "$id|$dta[2]|$dta[6]\n";
}
close $dta;
close $out;
-->8---
You could gain even more speed if you told us the format of the fields, so that the join "|" calls could be changed to pack.
Enjoy, Have FUN! H.Merijn
06-25-2006 11:15 PM
Peter, since we were not told what the data looks like: your script concatenates the key fields without a separator, so it would map both (12, 345, 6789) and (1, 23456, 789) to the same key.
Enjoy, Have FUN! H.Merijn
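The collision described above is easy to reproduce. A small sketch, using a hypothetical helper make_key that joins three values the way the awk key is built, with an optional separator as fourth argument:

```shell
#!/bin/sh
# make_key A B C [SEP] (hypothetical helper): concatenate the three parts
# in awk, optionally putting SEP between them.
make_key() {
    awk -v a="$1" -v b="$2" -v c="$3" -v sep="$4" 'BEGIN { print a sep b sep c }'
}

make_key 12 345 6789        # 123456789
make_key 1 23456 789        # 123456789  (same key, different fields)
make_key 12 345 6789 '|'    # 12|345|6789
make_key 1 23456 789 '|'    # 1|23456|789
```

With a separator that cannot occur inside a field, distinct triples always give distinct keys.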
06-25-2006 11:22 PM
I have implemented Peter's script and the difference in speed is astonishing!
FYI, the format of the data is this:
$2 is a four digit number
$5 is a two digit number
$6 is a single character
Thanks again
Luke
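With fixed widths like these, the plain concatenation $2$5$6 is always seven characters long and cannot collide, but only as long as every line really follows that layout. A minimal sketch of a check, assuming the same pipe-delimited files; the helper name check_format is made up, and "single character" is taken to mean any one character:

```shell
#!/bin/sh
# check_format FILE (hypothetical helper): print "lineno: line" for every
# line whose fields 2, 5 and 6 are not a four-digit number, a two-digit
# number and a single character. Exit status is 1 if any line is reported.
check_format() {
    awk -F'|' '
        $2 !~ /^[0-9][0-9][0-9][0-9]$/ || $5 !~ /^[0-9][0-9]$/ || length($6) != 1 {
            print FNR ": " $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

Usage would be something like: check_format /tmp/datafile || echo "unexpected field layout". The digit classes are spelled out rather than written as [0-9]{4} because not every awk supports interval expressions.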
06-25-2006 11:27 PM
--8<--- perl with pack
#!/usr/bin/perl
use strict;
use warnings;

my $masterfile = "/tmp/masterfile";
my $datafile   = "/tmp/datafile";

open my $out, ">", "luke.out" or die "luke.out: $!\n";

# Pack the key as two 16-bit integers plus one character: "ssA".
my %master;
open my $mst, "<", $masterfile or die "$masterfile: $!\n";
while (<$mst>) {
    chomp;
    my @mst = split /\|/, $_;
    $master{pack "ssA", @mst[1,4,5]} = $mst[0];
}
close $mst;

open my $dta, "<", $datafile or die "$datafile: $!\n";
while (<$dta>) {
    chomp;    # otherwise a trailing field 7 would carry its own newline
    my @dta = split /\|/, $_;
    my $cmp = pack "ssA", @dta[1,4,5];
    exists $master{$cmp} or next;
    my $id = $master{$cmp};
    print $out "$id|$dta[2]|$dta[6]\n";
}
close $dta;
close $out;
-->8---
Enjoy, Have FUN! H.Merijn
06-25-2006 11:57 PM
Procura is totally correct in his remark. To include this in my awk solution, simply add the field separator to the key:
awk -F'|' 'BEGIN { OFS="|"
    while ((getline < "/tmp/masterfile") == 1) id[$2$FS$5$FS$6] = $1
    close ("/tmp/masterfile") }
{ line=$2$FS$5$FS$6; if(id[line]) print (id[line],$3,$7) }' /tmp/datafile >luke.out
Kind regards, Peter
06-26-2006 03:28 AM
I do not normally complain about additional dollars :-).
But it works better here to use
id[$2FS$5FS$6] = $1
instead of
id[$2$FS$5$FS$6] = $1
Kind regards, Peter
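The reason the extra dollars matter: in awk, $ is the field operator, so $FS means "the field whose number is the value of FS". With FS set to "|" that value is 0 when taken as a number, so $FS is $0, the whole input line. A small sketch that prints both variants side by side (the helper name show_keys and the sample line are made up for illustration):

```shell
#!/bin/sh
# show_keys LINE (hypothetical helper): build the key from one
# pipe-delimited line, once with $FS and once with FS.
show_keys() {
    printf '%s\n' "$1" | awk -F'|' '{
        print "with $FS: " $2 $FS $5 $FS $6
        print "with FS:  " $2 FS $5 FS $6
    }'
}

show_keys 'M1|1234|x|x|56|A|x'
# with $FS: 1234M1|1234|x|x|56|A|x56M1|1234|x|x|56|A|xA
# with FS:  1234|56|A
```

Because the $FS key embeds the complete line twice, a master line and a data line would only match if the two lines were identical. Another way to get a separated key is the comma subscript id[$2,$5,$6], which awk joins with its built-in SUBSEP.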