<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Script help. Perl perhaps ? in Operating System - Linux</title>
    <link>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812272#M100101</link>
    <description>Peter, I like your awk solution, which is pretty close to what I do in perl, but yours would be safer if you included the separator in the key.&lt;BR /&gt;&lt;BR /&gt;As we were not told what the data looks like, your script would map both (12, 345, 6789) and (1, 23456, 789) to the same key.&lt;BR /&gt;&lt;BR /&gt;Enjoy, Have FUN! H.Merijn</description>
    <pubDate>Mon, 26 Jun 2006 06:15:36 GMT</pubDate>
    <dc:creator>H.Merijn Brand (procura</dc:creator>
    <dc:date>2006-06-26T06:15:36Z</dc:date>
    <item>
      <title>Script help. Perl perhaps ?</title>
      <link>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812268#M100097</link>
      <description>&lt;!--!*#--&gt;Hi.&lt;BR /&gt;I have written the script below to do a simple lookup but now I need to do it on a much larger datafile and it will take an age.&lt;BR /&gt;I suspect perl is the way to go to make it faster, but I don't know any perl :(&lt;BR /&gt;&lt;BR /&gt;Could someone help me to translate this script please to save my server days of processing?&lt;BR /&gt;Thanks in advance.&lt;BR /&gt;&lt;BR /&gt;The script looks up each line of a data file, compares certain fields with fields from a master file, and outputs an id value from the master file along with certain fields from the data file.&lt;BR /&gt;&lt;BR /&gt;masterfile=/tmp/masterfile&lt;BR /&gt;datafile=/tmp/datafile&lt;BR /&gt;&lt;BR /&gt;for b in `cat $datafile`&lt;BR /&gt;do&lt;BR /&gt;&lt;BR /&gt; compare=`echo $b | awk 'BEGIN{FS="|"}{data = $2$5$6;print data}END{}' `&lt;BR /&gt;&lt;BR /&gt;  for a in `cat $masterfile`&lt;BR /&gt;  do&lt;BR /&gt;    mastercompare=`echo $a | awk 'BEGIN{FS="|"}{line = $2$5$6;print line}END{}'`&lt;BR /&gt;        if [ $mastercompare = $compare ]&lt;BR /&gt;        then&lt;BR /&gt;                id=`echo $a | awk 'BEGIN{FS="|"}{print $1}END{}'`&lt;BR /&gt;                output=`echo $b | awk 'BEGIN{FS="|";OFS="|"}{print $3,$7}END{}'`&lt;BR /&gt;                echo $id"|"$output &amp;gt;&amp;gt; luke.out&lt;BR /&gt;        fi&lt;BR /&gt;  done&lt;BR /&gt;&lt;BR /&gt;done</description>
      <pubDate>Mon, 26 Jun 2006 05:22:15 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812268#M100097</guid>
      <dc:creator>Luke Morgan</dc:creator>
      <dc:date>2006-06-26T05:22:15Z</dc:date>
    </item>
    <item>
      <title>Re: Script help. Perl perhaps ?</title>
      <link>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812269#M100098</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;My suggestion reads the masterfile and the datafile only once and stores the master ids in an array (untested). This should be MUCH faster:&lt;BR /&gt;&lt;BR /&gt;awk -F'|' 'BEGIN { OFS="|";&lt;BR /&gt;while ((getline &amp;lt; "/tmp/masterfile") == 1) id[$2$5$6] = $1 }&lt;BR /&gt;{ line=$2$5$6; if(id[line]) print (id[line],$3,$7) }' /tmp/datafile &amp;gt;luke.out&lt;BR /&gt;&lt;BR /&gt;Kind regards, Peter</description>
      <pubDate>Mon, 26 Jun 2006 05:59:32 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812269#M100098</guid>
      <dc:creator>Peter Nikitka</dc:creator>
      <dc:date>2006-06-26T05:59:32Z</dc:date>
    </item>
    <item>
      <title>Re: Script help. Perl perhaps ?</title>
      <link>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812270#M100099</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;To be clean, close the masterfile after reading:&lt;BR /&gt;&lt;BR /&gt;awk -F'|' 'BEGIN { OFS="|"&lt;BR /&gt;while ((getline &amp;lt; "/tmp/masterfile") == 1) id[$2$5$6] = $1&lt;BR /&gt;close ("/tmp/masterfile") }&lt;BR /&gt;{ line=$2$5$6; if(id[line]) print (id[line],$3,$7) }' /tmp/datafile &amp;gt;luke.out&lt;BR /&gt;&lt;BR /&gt;Kind regards, Peter</description>
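      <content:encoded><![CDATA[
A self-contained sketch of the same single-pass lookup, for trying the idea on throwaway data. The sample records, the temporary directory and the function name lookup_demo are made up for illustration; only the field positions ($2, $5, $6 as key, $1 as id, $3 and $7 as output) come from the thread. It makes three assumed choices that differ from the one-liner: both files are passed as arguments and NR == FNR recognises the master file, the key fields are joined with FS, and membership is tested with "in", so a master id of 0 or an empty id is still printed where a truthiness test such as if (id[line]) would skip it.

```shell
#!/bin/sh
# Sketch only: sample data and names are hypothetical.
lookup_demo() {
    dir=$(mktemp -d) || return 1
    # master layout: id|key1|x|x|key2|key3|x
    printf '%s\n' '0|1234|a|b|56|X|c' '7|9999|a|b|11|Y|c' > "$dir/master"
    # data layout:   x|key1|out1|x|key2|key3|out2
    printf '%s\n' 'r1|1234|foo|b|56|X|bar' 'r2|0000|nope|b|00|Z|nope' > "$dir/data"

    awk -F'|' -v OFS='|' '
        # NR == FNR holds only while the first file (the master) is read
        NR == FNR { id[$2 FS $5 FS $6] = $1; next }
        # build the same key for every data line
        { key = $2 FS $5 FS $6 }
        # membership test: true even when the stored id is 0 or empty
        (key in id) { print id[key], $3, $7 }
    ' "$dir/master" "$dir/data"

    rm -rf "$dir"
}

lookup_demo   # prints: 0|foo|bar
```

The NR == FNR idiom assumes the master file is not empty; with an empty master file the data file would be read as if it were the master.
      ]]></content:encoded>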
      <pubDate>Mon, 26 Jun 2006 06:01:45 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812270#M100099</guid>
      <dc:creator>Peter Nikitka</dc:creator>
      <dc:date>2006-06-26T06:01:45Z</dc:date>
    </item>
    <item>
      <title>Re: Script help. Perl perhaps ?</title>
      <link>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812271#M100100</link>
      <description>Yes, in this case perl would be much faster.&lt;BR /&gt;&lt;BR /&gt;--8&amp;lt;---&lt;BR /&gt;#!/usr/bin/perl&lt;BR /&gt;&lt;BR /&gt;use strict;&lt;BR /&gt;use warnings;&lt;BR /&gt;&lt;BR /&gt;my $masterfile = "/tmp/masterfile";&lt;BR /&gt;my $datafile   = "/tmp/datafile";&lt;BR /&gt;&lt;BR /&gt;open my $out, "&amp;gt;", "luke.out" or die "luke.out: $!\n";&lt;BR /&gt;&lt;BR /&gt;# Index the master file by fields 2, 5 and 6&lt;BR /&gt;my %master;&lt;BR /&gt;open my $mst, "&amp;lt;", $masterfile or die "$masterfile: $!\n";&lt;BR /&gt;while (&amp;lt;$mst&amp;gt;) {&lt;BR /&gt;    chomp;&lt;BR /&gt;    my @mst = split /\|/, $_;&lt;BR /&gt;    $master{join "|", @mst[1,4,5]} = [ @mst[0,2,6] ];&lt;BR /&gt;    }&lt;BR /&gt;close $mst;&lt;BR /&gt;&lt;BR /&gt;# Look up each data line; print the master id with data fields 3 and 7&lt;BR /&gt;open my $dta, "&amp;lt;", $datafile or die "$datafile: $!\n";&lt;BR /&gt;while (&amp;lt;$dta&amp;gt;) {&lt;BR /&gt;    chomp;   # otherwise the newline stays attached to the last field&lt;BR /&gt;    my @dta = split /\|/, $_;&lt;BR /&gt;    my $cmp = join "|", @dta[1,4,5];&lt;BR /&gt;    exists $master{$cmp} or next;&lt;BR /&gt;&lt;BR /&gt;    my $id  = $master{$cmp}[0];&lt;BR /&gt;    print $out "$id|$dta[2]|$dta[6]\n";&lt;BR /&gt;    }&lt;BR /&gt;close $dta;&lt;BR /&gt;close $out;&lt;BR /&gt;--&amp;gt;8---&lt;BR /&gt;&lt;BR /&gt;You could gain even more speed if you told us the format of the fields, so that the join "|" calls can be replaced with pack.&lt;BR /&gt;&lt;BR /&gt;Enjoy, Have FUN! H.Merijn</description>
      <pubDate>Mon, 26 Jun 2006 06:06:48 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812271#M100100</guid>
      <dc:creator>H.Merijn Brand (procura</dc:creator>
      <dc:date>2006-06-26T06:06:48Z</dc:date>
    </item>
    <item>
      <title>Re: Script help. Perl perhaps ?</title>
      <link>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812272#M100101</link>
      <description>Peter, I like your awk solution, which is pretty close to what I do in perl, but yours would be safer if you included the separator in the key.&lt;BR /&gt;&lt;BR /&gt;As we were not told what the data looks like, your script would map both (12, 345, 6789) and (1, 23456, 789) to the same key.&lt;BR /&gt;&lt;BR /&gt;Enjoy, Have FUN! H.Merijn</description>
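      <content:encoded><![CDATA[
A minimal sketch of the collision described above. The helper name make_keys and the three-field record layout are assumptions made for the demonstration; the two sample triples are the ones from the post. For each record it prints the key built by plain concatenation, followed by the key built with the separator between the fields.

```shell
#!/bin/sh
# Hypothetical helper: $1 is a pipe-separated record whose three fields form the key.
make_keys() {
    # first column: fields concatenated; second column: fields joined with FS
    printf '%s\n' "$1" | awk -F'|' '{ print $1 $2 $3, $1 FS $2 FS $3 }'
}

make_keys '12|345|6789'   # prints: 123456789 12|345|6789
make_keys '1|23456|789'   # prints: 123456789 1|23456|789
```

Both records produce the same concatenated key, while the separator-joined keys stay distinct. With fixed-width key fields the concatenated form cannot collide in this way.
      ]]></content:encoded>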
      <pubDate>Mon, 26 Jun 2006 06:15:36 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812272#M100101</guid>
      <dc:creator>H.Merijn Brand (procura</dc:creator>
      <dc:date>2006-06-26T06:15:36Z</dc:date>
    </item>
    <item>
      <title>Re: Script help. Perl perhaps ?</title>
      <link>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812273#M100102</link>
      <description>Thank you both very much for your suggestions.&lt;BR /&gt;I have implemented Peter's script and the difference in speed is astonishing!&lt;BR /&gt;&lt;BR /&gt;FYI, the format of the data is this:&lt;BR /&gt;$2 is a four digit number&lt;BR /&gt;$5 is a two digit number&lt;BR /&gt;$6 is a single character&lt;BR /&gt;&lt;BR /&gt;Thanks again&lt;BR /&gt;&lt;BR /&gt;Luke</description>
      <pubDate>Mon, 26 Jun 2006 06:22:39 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812273#M100102</guid>
      <dc:creator>Luke Morgan</dc:creator>
      <dc:date>2006-06-26T06:22:39Z</dc:date>
    </item>
    <item>
      <title>Re: Script help. Perl perhaps ?</title>
      <link>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812274#M100103</link>
      <description>Please bear in mind my last remark regarding the keys generated in the awk solution!&lt;BR /&gt;&lt;BR /&gt;--8&amp;lt;--- perl with pack&lt;BR /&gt;#!/usr/bin/perl&lt;BR /&gt;&lt;BR /&gt;use strict;&lt;BR /&gt;use warnings;&lt;BR /&gt;&lt;BR /&gt;my $masterfile = "/tmp/masterfile";&lt;BR /&gt;my $datafile   = "/tmp/datafile";&lt;BR /&gt;&lt;BR /&gt;open my $out, "&amp;gt;", "luke.out" or die "luke.out: $!\n";&lt;BR /&gt;&lt;BR /&gt;# Index the master file by a packed key: two shorts and one character&lt;BR /&gt;my %master;&lt;BR /&gt;open my $mst, "&amp;lt;", $masterfile or die "$masterfile: $!\n";&lt;BR /&gt;while (&amp;lt;$mst&amp;gt;) {&lt;BR /&gt;    chomp;&lt;BR /&gt;    my @mst = split /\|/, $_;&lt;BR /&gt;    $master{pack "ssA", @mst[1,4,5]} = $mst[0];&lt;BR /&gt;    }&lt;BR /&gt;close $mst;&lt;BR /&gt;&lt;BR /&gt;# Look up each data line; print the master id with data fields 3 and 7&lt;BR /&gt;open my $dta, "&amp;lt;", $datafile or die "$datafile: $!\n";&lt;BR /&gt;while (&amp;lt;$dta&amp;gt;) {&lt;BR /&gt;    chomp;   # otherwise the newline stays attached to the last field&lt;BR /&gt;    my @dta = split /\|/, $_;&lt;BR /&gt;    my $cmp = pack "ssA", @dta[1,4,5];&lt;BR /&gt;    exists $master{$cmp} or next;&lt;BR /&gt;&lt;BR /&gt;    my $id  = $master{$cmp};&lt;BR /&gt;    print $out "$id|$dta[2]|$dta[6]\n";&lt;BR /&gt;    }&lt;BR /&gt;close $dta;&lt;BR /&gt;close $out;&lt;BR /&gt;--&amp;gt;8---&lt;BR /&gt;&lt;BR /&gt;Enjoy, Have FUN! H.Merijn</description>
      <pubDate>Mon, 26 Jun 2006 06:27:34 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812274#M100103</guid>
      <dc:creator>H.Merijn Brand (procura</dc:creator>
      <dc:date>2006-06-26T06:27:34Z</dc:date>
    </item>
    <item>
      <title>Re: Script help. Perl perhaps ?</title>
      <link>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812275#M100104</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;Procura is totally correct in his remark. To include this in my awk solution, simply add the field separator to the key:&lt;BR /&gt;&lt;BR /&gt;awk -F'|' 'BEGIN { OFS="|"&lt;BR /&gt;while ((getline &amp;lt; "/tmp/masterfile") == 1) id[$2$FS$5$FS$6] = $1&lt;BR /&gt;close ("/tmp/masterfile") }&lt;BR /&gt;{ line=$2$FS$5$FS$6; if(id[line]) print (id[line],$3,$7) }' /tmp/datafile &amp;gt;luke.out&lt;BR /&gt;&lt;BR /&gt;Kind regards, Peter</description>
      <pubDate>Mon, 26 Jun 2006 06:57:28 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812275#M100104</guid>
      <dc:creator>Peter Nikitka</dc:creator>
      <dc:date>2006-06-26T06:57:28Z</dc:date>
    </item>
    <item>
      <title>Re: Script help. Perl perhaps ?</title>
      <link>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812276#M100105</link>
      <description>Oops,&lt;BR /&gt;&lt;BR /&gt;I do not normally complain about additional dollars :-).&lt;BR /&gt;But here FS is the separator itself, whereas $FS is a field reference, so it works better to use&lt;BR /&gt;&lt;BR /&gt;id[$2FS$5FS$6] = $1&lt;BR /&gt;instead of&lt;BR /&gt;id[$2$FS$5$FS$6] = $1&lt;BR /&gt;&lt;BR /&gt;and likewise line=$2FS$5FS$6 in the main block.&lt;BR /&gt;&lt;BR /&gt;Kind regards, Peter</description>
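      <content:encoded><![CDATA[
A small sketch of why the extra dollars matter. The helper name show_keys and the sample record are made up for illustration. In awk, $FS is a field reference whose index is the value of FS; with FS set to "|" that string converts to the number 0, so $FS is the whole record ($0), not the separator.

```shell
#!/bin/sh
# Hypothetical helper: print both key expressions for one pipe-separated record.
show_keys() {
    printf '%s\n' "$1" | awk -F'|' '{
        # $FS evaluates to $0 here, so the whole record is spliced into the key
        print "with $FS: " $2 $FS $5 $FS $6
        # FS is the separator string itself
        print "with FS:  " $2 FS $5 FS $6
    }'
}

show_keys 'id|1234|a|b|56|X|c'
# prints:
# with $FS: 1234id|1234|a|b|56|X|c56id|1234|a|b|56|X|cX
# with FS:  1234|56|X
```

With the $FS form every key embeds its entire input line, so a data line would in practice only match a master line with identical content.
      ]]></content:encoded>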
      <pubDate>Mon, 26 Jun 2006 10:28:55 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/script-help-perl-perhaps/m-p/3812276#M100105</guid>
      <dc:creator>Peter Nikitka</dc:creator>
      <dc:date>2006-06-26T10:28:55Z</dc:date>
    </item>
  </channel>
</rss>

