
Re: join problem with awk/printf

 
Sandman!
Honored Contributor

Hi Scott,

I'm inclined to pursue this a wee bit more, owing to the intriguing nature of the problem and because imho I've finally hit the nail on the head :)

1. sort each of the files individually on the first field
# sort -k1,1 /tmp/std_backup_list3 > /tmp/std_backup_list3.out
# sort -k1,1 /tmp/swinfo > /tmp/swinfo.out

2. join the sorted output files from above into a single output file
# join -1 1 -2 1 /tmp/std_backup_list3.out /tmp/swinfo.out > /tmp/all.out
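The two steps above can be tried end to end with a small sketch. The sample lines and the temporary directory are hypothetical stand-ins for /tmp/std_backup_list3 and /tmp/swinfo, and the `-a 1` option is an addition that is not part of the steps above: it keeps hosts from the backup list that have no entry in the software list. LC_ALL=C is set on the assumption that sort and join should agree on one collation order.

```shell
#!/bin/sh
# Use one collation order for both sort and join
LC_ALL=C; export LC_ALL

# Hypothetical sample data, not the real /tmp files
tmp=$(mktemp -d)
printf '%s\n' 'hostnameZ 8 good_policy goodday goodtime' \
              'hostnameA 0 policy_name date time' \
              'hostnameC 2 old_policy someday sometime' > "$tmp/backup"
printf '%s\n' 'hostnameZ Security stuff, more backup stuff' \
              'hostnameA HR development, crazy stuff' > "$tmp/swinfo"

# Step 1: join expects both inputs sorted on the join field
sort -k1,1 "$tmp/backup" > "$tmp/backup.out"
sort -k1,1 "$tmp/swinfo" > "$tmp/swinfo.out"

# Step 2: join on field 1 (the default); -a 1 also prints
# backup lines that have no matching software line
joined=$(join -a 1 "$tmp/backup.out" "$tmp/swinfo.out")
printf '%s\n' "$joined"

rm -rf "$tmp"
```

With the default separator, join splits each line on blanks and writes the output fields separated by single spaces, so runs of spaces inside a description would be collapsed.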

~cheers
Greg Vaidman
Respected Contributor

Have you tried just using a different field separator to do the join?
For example:
sed 's/ /|/g' file1 > file1a
sed 's/ /|/' file2 > file2a
join -t"|" file1a file2a | tr '|' ' '
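A minimal sketch of the separator idea, assuming file1 is the backup log and file2 is the software list, and that neither contains a literal `|`. The sample lines are hypothetical, and the sort step and the `-o` field list are additions for illustration: join still needs sorted input, and `-o` shows that the free-text description can be addressed as a single field once only the first space of file2 has been converted.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL

# Hypothetical sample data standing in for file1 and file2
tmp=$(mktemp -d)
printf '%s\n' 'hostnameB 1 bad_policy nodate never' \
              'hostnameA 0 policy_name date time' > "$tmp/file1"
printf '%s\n' 'hostnameA HR development, crazy stuff' \
              'hostnameB stupid stuff, more stupid stuff' > "$tmp/file2"

# file1: every space becomes a separator.
# file2: only the first space does, so the description stays one field.
sort -k1,1 "$tmp/file1" | sed 's/ /|/g' > "$tmp/file1a"
sort -k1,1 "$tmp/file2" | sed 's/ /|/'  > "$tmp/file2a"

# Print the join field, the status column (1.2) and the description (2.2)
picked=$(join -t '|' -o 0,1.2,2.2 "$tmp/file1a" "$tmp/file2a")
printf '%s\n' "$picked"

rm -rf "$tmp"
```

Piping the result through `tr '|' ' '`, as in the post, turns the separators back into spaces.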
Hein van den Heuvel
Honored Contributor

Here is another approach, similar to Sandman's...

It treats s.txt as a reference file to 'cross' with.

The file b.txt is the backup log.

Awk does all the work, by storing records from the software file in an associative array.

No need to sort... the data will be in the backup log order:

C:\Temp>type s.txt
hostnameA HR development, Data Repository development, crazy stuff, more crazy stuff
hostnameB stupid stuff, more stupid stuff
hostnameD weird stuff, more weird stuff
hostnameE eerie stuff, more eerie stuff
hostnameZ Security Respository stuff, more backup stuff

C:\Temp>type b.txt
hostnameA 0 policy_name date time
hostnameZ 8 good_policy goodday goodtime
hostnameB 1 bad_policy nodate never
hostnameC 2 old_policy someday sometime

C:\Temp>awk 'NR==FNR {key=$1; sub(key,""); S[key]=$0}
NR!=FNR {printf "%-10s\t%s\t%-30s\t%s\t%s\t%s\n", $1, $2, $3, $4, $5, S[$1]}' s.txt b.txt

hostnameA 0 policy_name date time HR development, Data Repository development, crazy stuff, more crazy stuff
hostnameZ 8 good_policy goodday goodtime Security Respository stuff, more backup stuff
hostnameB 1 bad_policy nodate never stupid stuff, more stupid stuff
hostnameC 2 old_policy someday sometime

The awk script decides which file a line came from by comparing NR, the record number counted across all input, with FNR, the record number within the current file. The two are equal only while the first file is being read.
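The NR/FNR test can be seen in a slightly more defensive variant of the lookup, sketched here under a few assumptions: the sample lines are hypothetical, each line starts with the hostname, and the "(no software entry)" placeholder is invented for illustration. It removes the key with substr() instead of sub(), so a hostname containing regex characters such as a dot is taken literally.

```shell
#!/bin/sh
# Hypothetical sample data standing in for s.txt and b.txt
tmp=$(mktemp -d)
printf '%s\n' 'hostnameA HR development, crazy stuff' \
              'hostnameZ Security stuff' > "$tmp/s.txt"
printf '%s\n' 'hostnameA 0 policy_name date time' \
              'hostnameZ 8 good_policy goodday goodtime' \
              'hostnameC 2 old_policy someday sometime' > "$tmp/b.txt"

merged=$(awk '
    # First file: NR and FNR advance together
    NR == FNR {
        key = $1
        # Drop the key by position, then trim the blanks that follow it
        rest = substr($0, length(key) + 1)
        sub(/^[ \t]+/, "", rest)
        S[key] = rest
        next
    }
    # Second file: NR keeps counting while FNR restarted at 1
    {
        desc = ($1 in S) ? S[$1] : "(no software entry)"
        printf "%s\t%s\t%s\n", $1, $2, desc
    }
' "$tmp/s.txt" "$tmp/b.txt")
printf '%s\n' "$merged"

rm -rf "$tmp"
```

The `next` in the first block stops the second block from also running on lines of the reference file, which plays the same role as the explicit NR!=FNR test.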

fwiw,
Hein.