<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: join problem with awk/printf in Operating System - Linux</title>
    <link>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818974#M100302</link>
    <description>Harry -  I think because my data is a bit different, the sort works differently, and the script gives the wrong results.  The output of your sort command is like this:&lt;BR /&gt;&lt;BR /&gt;sort -k 1 dah1 dah2&lt;BR /&gt;hostnameA 0 policy_name date time&lt;BR /&gt;hostnameA HR development, Data Repository development, crazy stuff, more crazy stuff&lt;BR /&gt;hostnameB 1 bad_policy nodate never&lt;BR /&gt;hostnameB stupid stuff, more stupid stuff&lt;BR /&gt;hostnameC 2 old_policy someday sometime&lt;BR /&gt;hostnameD weird stuff, more weird stuff&lt;BR /&gt;hostnameE eerie stuff, more eerie stuff&lt;BR /&gt;hostnameZ 8 good_policy goodday goodtime&lt;BR /&gt;hostnameZ Security Respository stuff, more backup stuff&lt;BR /&gt;&lt;BR /&gt;Always the backup results file followed by the host description file.&lt;BR /&gt;&lt;BR /&gt;My sort output looks more like this, regardless of which file is specified first in the sort command:&lt;BR /&gt;&lt;BR /&gt;host1  (leading spaces)     DTP QTP&lt;BR /&gt;host1 0 STD_host1 07/10/2006 12:36:09&lt;BR /&gt;host2   (leading spaces)     BW DTP QTP&lt;BR /&gt;host2 0 STD_host2 07/11/2006 01:57:38&lt;BR /&gt;host3   (leading spaces)     Non-SAP Development&lt;BR /&gt;host3 0 STD_host3 07/10/2006 12:26:33&lt;BR /&gt;&lt;BR /&gt;Unless I can get the sort to operate the same, I think I need to move on from this task.  I thank you for all your assistance; this has been a learning experience for me!&lt;BR /&gt;&lt;BR /&gt;Scott</description>
    <pubDate>Tue, 11 Jul 2006 12:01:42 GMT</pubDate>
    <dc:creator>Scott Lindstrom_2</dc:creator>
    <dc:date>2006-07-11T12:01:42Z</dc:date>
    <item>
      <title>join problem with awk/printf</title>
      <link>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818965#M100293</link>
      <description>I have a script that outputs the result of the last backup for each host in this format:&lt;BR /&gt;&lt;BR /&gt;hostnamea  retcode  policy_name  date  time&lt;BR /&gt;&lt;BR /&gt;I now have a new requirement to join this with a file that contains the description of what runs on that host, e.g.:&lt;BR /&gt;&lt;BR /&gt;hostnamea        HR dev, DR dev&lt;BR /&gt;&lt;BR /&gt;Up until now, I have been successful using join, and awk with printf. But now that the second file has a freeform 'second' field, I am having problems.  Any ideas on how I can end up with the following output (formatted with printf):&lt;BR /&gt;&lt;BR /&gt;hostnamea  retcode  policy_name  date  time HR dev, DR dev&lt;BR /&gt;&lt;BR /&gt;TIA, &lt;BR /&gt;Scott&lt;BR /&gt;</description>
      <pubDate>Thu, 06 Jul 2006 14:01:36 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818965#M100293</guid>
      <dc:creator>Scott Lindstrom_2</dc:creator>
      <dc:date>2006-07-06T14:01:36Z</dc:date>
    </item>
    <item>
      <title>Re: join problem with awk/printf</title>
      <link>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818966#M100294</link>
      <description>Can you post exampleS of what you mean by "freeform" ? I suspect that you mean it can have any number of words.&lt;BR /&gt;&lt;BR /&gt;live free or die&lt;BR /&gt;harry d brown jr</description>
      <pubDate>Thu, 06 Jul 2006 14:06:43 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818966#M100294</guid>
      <dc:creator>harry d brown jr</dc:creator>
      <dc:date>2006-07-06T14:06:43Z</dc:date>
    </item>
    <item>
      <title>Re: join problem with awk/printf</title>
      <link>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818967#M100295</link>
      <description>This phrase was an example:&lt;BR /&gt;HR dev, DR dev&lt;BR /&gt;&lt;BR /&gt;(ie, HR development, Data Repository development)&lt;BR /&gt;&lt;BR /&gt;Yes - the remainder of the line after the hostname can contain anything, including spaces and commas.  That is where my problem lies.&lt;BR /&gt;&lt;BR /&gt;Scott</description>
      <pubDate>Thu, 06 Jul 2006 14:09:41 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818967#M100295</guid>
      <dc:creator>Scott Lindstrom_2</dc:creator>
      <dc:date>2006-07-06T14:09:41Z</dc:date>
    </item>
    <item>
      <title>Re: join problem with awk/printf</title>
      <link>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818968#M100296</link>
      <description>If you are saying that the second line in the file contains something like this:&lt;BR /&gt;&lt;BR /&gt;hostnamea HR development, Data Repository development, crazy stuff, more crazy stuff&lt;BR /&gt;&lt;BR /&gt;then you can use "sed" or cut to grab the hostname out:&lt;BR /&gt;&lt;BR /&gt;sed:&lt;BR /&gt;sed "s/^\([A-Za-z0-9]*\) \(.*\)/\1/"&lt;BR /&gt;&lt;BR /&gt;cut:&lt;BR /&gt;cut -d" " -f1&lt;BR /&gt;&lt;BR /&gt;to grab the additional stuff use cut again:&lt;BR /&gt;&lt;BR /&gt;cut -d" " -f2-&lt;BR /&gt;&lt;BR /&gt;If you want to transform the various strings like "HR development" into "HR dev" and "Data Repository development" into "DR dev" then that poses another challenge, especially if this is a free form field that some user is typing the information into, especially if they can't spell.&lt;BR /&gt;&lt;BR /&gt;live free or die&lt;BR /&gt;harry d brown jr</description>
      <pubDate>Thu, 06 Jul 2006 14:25:58 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818968#M100296</guid>
      <dc:creator>harry d brown jr</dc:creator>
      <dc:date>2006-07-06T14:25:58Z</dc:date>
    </item>
    <item>
      <title>Re: join problem with awk/printf</title>
      <link>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818969#M100297</link>
      <description>The second file is exactly as you state:&lt;BR /&gt;&lt;BR /&gt;hostnamea HR development, Data Repository development, crazy stuff, more crazy stuff&lt;BR /&gt;&lt;BR /&gt;The problem is as soon as I use join I lose formatting.  So I use awk with printf, but then I lose anything after the first word in field2 (I would only get "HR" output).  &lt;BR /&gt;&lt;BR /&gt;Basically I need to join and pipe into an awk printf when file2 has a variable number of fields.&lt;BR /&gt;&lt;BR /&gt;Here is what I'm playing with that does not work:&lt;BR /&gt;&lt;BR /&gt;join -j1 1 -j2 1 /tmp/std_backup_list3 /tmp/swinfo | awk '{printf "%-10s\t%s\t%-30s\t%s %s %-40s\n", $1, $2, $3, $4, $5, $6, $7}'&lt;BR /&gt;&lt;BR /&gt;Scott&lt;BR /&gt;</description>
      <pubDate>Thu, 06 Jul 2006 14:33:09 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818969#M100297</guid>
      <dc:creator>Scott Lindstrom_2</dc:creator>
      <dc:date>2006-07-06T14:33:09Z</dc:date>
    </item>
    <item>
      <title>Re: join problem with awk/printf</title>
      <link>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818970#M100298</link>
      <description>So the "joined file" has a first line that contains the host name&lt;BR /&gt;and a second line that contains some free form stuff, like this:&lt;BR /&gt;&lt;BR /&gt;---------------------&lt;BR /&gt;hostnamea&lt;BR /&gt;HR development, Data Repository development, crazy stuff, more crazy stuff&lt;BR /&gt;---------------------&lt;BR /&gt;&lt;BR /&gt;If this is the case, then try this:&lt;BR /&gt;&lt;BR /&gt;"join stuff here" |&lt;BR /&gt;awk ' BEGIN { firsttime = 1 }&lt;BR /&gt;{&lt;BR /&gt;  if ( firsttime == 1 ) {&lt;BR /&gt;     hostis = $0&lt;BR /&gt;     firsttime = 0&lt;BR /&gt;  } else&lt;BR /&gt;  {&lt;BR /&gt;     print hostis, $0&lt;BR /&gt;     exit&lt;BR /&gt;  }&lt;BR /&gt;}&lt;BR /&gt;'&lt;BR /&gt;&lt;BR /&gt;live free or die&lt;BR /&gt;harry d brown jr&lt;BR /&gt;&lt;BR /&gt;[root@vpart1 /var/appl/perlscripts]# ./daher  &lt;BR /&gt;hostnamea HR development, Data Repository development, crazy stuff, more crazy stuff&lt;BR /&gt;[root@vpart1 /var/appl/perlscripts]#</description>
      <pubDate>Thu, 06 Jul 2006 14:48:07 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818970#M100298</guid>
      <dc:creator>harry d brown jr</dc:creator>
      <dc:date>2006-07-06T14:48:07Z</dc:date>
    </item>
    <item>
      <title>Re: join problem with awk/printf</title>
      <link>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818971#M100299</link>
      <description>I was a little confused, but now I think this:&lt;BR /&gt;&lt;BR /&gt;[root@vpart1 /var/appl/perlscripts]# cat dah1&lt;BR /&gt;hostnameA 0 policy_name date time&lt;BR /&gt;hostnameB 1 bad_policy nodate never&lt;BR /&gt;hostnameC 2 old_policy someday sometime &lt;BR /&gt;hostnameZ 8 good_policy goodday goodtime&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;[root@vpart1 /var/appl/perlscripts]# cat dah2&lt;BR /&gt;hostnameA HR development, Data Repository development, crazy stuff, more crazy stuff&lt;BR /&gt;hostnameB stupid stuff, more stupid stuff&lt;BR /&gt;hostnameD weird stuff, more weird stuff&lt;BR /&gt;hostnameE eerie stuff, more eerie stuff&lt;BR /&gt;hostnameZ Security Respository stuff, more backup stuff&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;[root@vpart1 /var/appl/perlscripts]# cat daher&lt;BR /&gt;&lt;BR /&gt;sort -k 1 dah1 dah2 |&lt;BR /&gt;awk ' BEGIN { firsttime = 1 }&lt;BR /&gt;{&lt;BR /&gt;  if ( firsttime == 1 ) {&lt;BR /&gt;     std_hostis = $1&lt;BR /&gt;     std_retcode = $2&lt;BR /&gt;     std_policy_name = $3&lt;BR /&gt;     std_date = $4&lt;BR /&gt;     std_time = $5&lt;BR /&gt;     firsttime = 0&lt;BR /&gt;  } else&lt;BR /&gt;  {&lt;BR /&gt;     if ( std_hostis == $1 ) {&lt;BR /&gt;        printf "%-10s\t%s\t%-30s\t%s %s %-40s\n", std_hostis, std_retcode, std_policy_name, std_date, std_time, $0&lt;BR /&gt;        firsttime=1&lt;BR /&gt;     } else&lt;BR /&gt;     {&lt;BR /&gt;        std_hostis = $1&lt;BR /&gt;        std_retcode = $2&lt;BR /&gt;        std_policy_name = $3&lt;BR /&gt;        std_date = $4&lt;BR /&gt;        std_time = $5&lt;BR /&gt;        firsttime=0&lt;BR /&gt;     }&lt;BR /&gt;  }&lt;BR /&gt;}&lt;BR /&gt;'&lt;BR /&gt;&lt;BR /&gt;live free or die&lt;BR /&gt;harry d brown jr&lt;BR /&gt;</description>
      <pubDate>Thu, 06 Jul 2006 15:43:35 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818971#M100299</guid>
      <dc:creator>harry d brown jr</dc:creator>
      <dc:date>2006-07-06T15:43:35Z</dc:date>
    </item>
    <item>
      <title>Re: join problem with awk/printf</title>
      <link>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818972#M100300</link>
      <description>Harry -&lt;BR /&gt;&lt;BR /&gt;That looks like what I need!  Let me give it a try and let you know.&lt;BR /&gt;&lt;BR /&gt;Thanks!&lt;BR /&gt;&lt;BR /&gt;Scott</description>
      <pubDate>Thu, 06 Jul 2006 15:49:31 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818972#M100300</guid>
      <dc:creator>Scott Lindstrom_2</dc:creator>
      <dc:date>2006-07-06T15:49:31Z</dc:date>
    </item>
    <item>
      <title>Re: join problem with awk/printf</title>
      <link>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818973#M100301</link>
      <description>IMHO you need not use join or printf to get the proper formatting. Try the awk construct below; it does what you're trying to accomplish.&lt;BR /&gt;&lt;BR /&gt;The file containing "hostnamea retcode policy_name date time" must precede the file containing "hostnamea HR dev, DR dev", otherwise the output will be...&lt;BR /&gt;      "hostnamea HR dev, DR dev retcode policy_name date time"&lt;BR /&gt;instead of...&lt;BR /&gt;      "hostnamea retcode policy_name date time HR dev, DR dev"&lt;BR /&gt;&lt;BR /&gt;===============================================&lt;BR /&gt;awk '{&lt;BR /&gt;if(x[$1]=="")&lt;BR /&gt;    x[$1]=$0&lt;BR /&gt;else&lt;BR /&gt;    for(i=2;i&amp;lt;=NF;++i)&lt;BR /&gt;        x[$1]=x[$1]" "$i&lt;BR /&gt;} END{for(i in x) print x[i]}' firstfile secondfile&lt;BR /&gt;===============================================&lt;BR /&gt;~hope it helps</description>
      <pubDate>Thu, 06 Jul 2006 17:10:18 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818973#M100301</guid>
      <dc:creator>Sandman!</dc:creator>
      <dc:date>2006-07-06T17:10:18Z</dc:date>
    </item>
    <item>
      <title>Re: join problem with awk/printf</title>
      <link>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818974#M100302</link>
      <description>Harry -  I think because my data is a bit different, the sort works differently, and the script gives the wrong results.  The output of your sort command is like this:&lt;BR /&gt;&lt;BR /&gt;sort -k 1 dah1 dah2&lt;BR /&gt;hostnameA 0 policy_name date time&lt;BR /&gt;hostnameA HR development, Data Repository development, crazy stuff, more crazy stuff&lt;BR /&gt;hostnameB 1 bad_policy nodate never&lt;BR /&gt;hostnameB stupid stuff, more stupid stuff&lt;BR /&gt;hostnameC 2 old_policy someday sometime&lt;BR /&gt;hostnameD weird stuff, more weird stuff&lt;BR /&gt;hostnameE eerie stuff, more eerie stuff&lt;BR /&gt;hostnameZ 8 good_policy goodday goodtime&lt;BR /&gt;hostnameZ Security Respository stuff, more backup stuff&lt;BR /&gt;&lt;BR /&gt;Always the backup results file followed by the host description file.&lt;BR /&gt;&lt;BR /&gt;My sort output looks more like this, regardless of which file is specified first in the sort command:&lt;BR /&gt;&lt;BR /&gt;host1  (leading spaces)     DTP QTP&lt;BR /&gt;host1 0 STD_host1 07/10/2006 12:36:09&lt;BR /&gt;host2   (leading spaces)     BW DTP QTP&lt;BR /&gt;host2 0 STD_host2 07/11/2006 01:57:38&lt;BR /&gt;host3   (leading spaces)     Non-SAP Development&lt;BR /&gt;host3 0 STD_host3 07/10/2006 12:26:33&lt;BR /&gt;&lt;BR /&gt;Unless I can get the sort to operate the same, I think I need to move on from this task.  I thank you for all your assistance; this has been a learning experience for me!&lt;BR /&gt;&lt;BR /&gt;Scott</description>
      <pubDate>Tue, 11 Jul 2006 12:01:42 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818974#M100302</guid>
      <dc:creator>Scott Lindstrom_2</dc:creator>
      <dc:date>2006-07-11T12:01:42Z</dc:date>
    </item>
    <item>
      <title>Re: join problem with awk/printf</title>
      <link>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818975#M100303</link>
      <description>Hi Scott,&lt;BR /&gt;&lt;BR /&gt;I'm inclined to pursue this a wee bit more owing to the intriguing nature of the problem and because IMHO I think I've finally hit the nail on the head :)&lt;BR /&gt;&lt;BR /&gt;1. sort each of the files individually on the first field&lt;BR /&gt;# sort -k1,1 /tmp/std_backup_list3 &amp;gt; /tmp/std_backup_list3.out&lt;BR /&gt;# sort -k1,1 /tmp/swinfo &amp;gt; /tmp/swinfo.out&lt;BR /&gt;&lt;BR /&gt;2. join the sorted output files from above into a single output file&lt;BR /&gt;# join -1 1 -2 1 /tmp/std_backup_list3.out /tmp/swinfo.out &amp;gt; /tmp/all.out&lt;BR /&gt;&lt;BR /&gt;~cheers</description>
      <pubDate>Tue, 11 Jul 2006 15:04:02 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818975#M100303</guid>
      <dc:creator>Sandman!</dc:creator>
      <dc:date>2006-07-11T15:04:02Z</dc:date>
    </item>
    <item>
      <title>Re: join problem with awk/printf</title>
      <link>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818976#M100304</link>
      <description>Have you tried just using a different field separator to do the join?&lt;BR /&gt;For example:&lt;BR /&gt;sed 's/ /|/g' file1 &amp;gt; file1a&lt;BR /&gt;sed 's/ /|/'  file2 &amp;gt; file2a&lt;BR /&gt;join -t"|" file1a file2a | tr '|' ' '</description>
      <pubDate>Tue, 11 Jul 2006 16:30:41 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818976#M100304</guid>
      <dc:creator>Greg Vaidman</dc:creator>
      <dc:date>2006-07-11T16:30:41Z</dc:date>
    </item>
    <item>
      <title>Re: join problem with awk/printf</title>
      <link>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818977#M100305</link>
      <description>Here is another approach, similar to Sandman's...&lt;BR /&gt;&lt;BR /&gt;It treats s.txt as a reference file to 'cross' with.&lt;BR /&gt;&lt;BR /&gt;The file b.txt is the backup log.&lt;BR /&gt;&lt;BR /&gt;Awk does all the work by storing records from the software file in an associative array.&lt;BR /&gt;&lt;BR /&gt;No need to sort... the data will be in the backup log order:&lt;BR /&gt;&lt;BR /&gt;C:\Temp&amp;gt;type s.txt&lt;BR /&gt;hostnameA HR development, Data Repository development, crazy stuff, more crazy stuff&lt;BR /&gt;hostnameB stupid stuff, more stupid stuff&lt;BR /&gt;hostnameD weird stuff, more weird stuff&lt;BR /&gt;hostnameE eerie stuff, more eerie stuff&lt;BR /&gt;hostnameZ Security Respository stuff, more backup stuff&lt;BR /&gt;&lt;BR /&gt;C:\Temp&amp;gt;type b.txt&lt;BR /&gt;hostnameA 0 policy_name date time&lt;BR /&gt;hostnameZ 8 good_policy goodday goodtime&lt;BR /&gt;hostnameB 1 bad_policy nodate never&lt;BR /&gt;hostnameC 2 old_policy someday sometime&lt;BR /&gt;&lt;BR /&gt;C:\Temp&amp;gt;awk 'NR==FNR {key=$1; sub(key,""); S[key]=$0}&lt;BR /&gt;NR!=FNR {printf "%-10s\t%s\t%-30s\t%s %s %s\n", $1, $2, $3, $4, $5, S[$1]}' s.txt b.txt&lt;BR /&gt;&lt;BR /&gt;hostnameA       0       policy_name                     date time  HR development, Data Repository development, crazy stuff, more crazy stuff&lt;BR /&gt;hostnameZ       8       good_policy                     goodday goodtime  Security Respository stuff, more backup stuff&lt;BR /&gt;hostnameB       1       bad_policy                      nodate never  stupid stuff, more stupid stuff&lt;BR /&gt;hostnameC       2       old_policy                      someday sometime&lt;BR /&gt;&lt;BR /&gt;The awk script decides which file a record came from by comparing the overall record number NR with the per-file record number FNR. If they are the same, the record is from the first file.&lt;BR /&gt;&lt;BR /&gt;fwiw,&lt;BR /&gt;Hein.&lt;BR /&gt;</description>
      <pubDate>Tue, 11 Jul 2006 23:07:28 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-linux/join-problem-with-awk-printf/m-p/3818977#M100305</guid>
      <dc:creator>Hein van den Heuvel</dc:creator>
      <dc:date>2006-07-11T23:07:28Z</dc:date>
    </item>
  </channel>
</rss>

