Help needed in scripting

Kenneth_18 · ‎10-12-2004

Here is what I want to accomplish. I've been thinking hard but I just can't get an idea on how to start the script so maybe a sample script or hint will send me on my way.

Here is the problem.

One file contains an ASCII table as shown below:

file1
xx dd xx xx
11 aa 22 33
xx cc xx xx
77 bb 88 99
xx gg xx xx

another file contains another ASCII table as shown below:

file2

xx xx cc xx
xx xx jj xx
xx xx bb xx
xx xx dd xx
44 55 aa 66

What I want to achieve is to combine bothfiles into one table with each lines in each table combined that has the same values in the 2nd column and third columns of file1 and file2 respectively. The unmatch data are written to another separate error logfile. Thus the result should be:

aa 11 22 33 44 55 66
bb 77 88 99 xx xx xx
cc xx xx xx xx xx xx

the error log file should contain something like below:

gg xx xx xx
jj xx xx xx

Thanks in advance for the help!

H.Merijn Brand (procura · ‎10-12-2004

# man join

Enjoy, Have FUN! H.Merijn

Enjoy, Have FUN! H.Merijn

Hein van den Heuvel · ‎10-12-2004

If you cat SORT the input first, then this becomes a one liner with JOIN:

$ sort -k 2 x > a
$ sort -k 3 xx > aa
$ cat a
11 aa 22 33
77 bb 88 99
xx cc xx xx
xx dd xx xx
xx gg xx xx
$ cat aa
44 55 aa 66
xx xx bb xx
xx xx cc xx
xx xx dd xx
xx xx jj xx
$ join -j1 2 -j2 3 -o 1.2 1.1 1.3 1.4 2.1 2.2 2.4 a aa
aa 11 22 33 44 55 66
bb 77 88 99 xx xx xx
cc xx xx xx xx xx xx
dd xx xx xx xx xx xx

$ join -j1 2 -j2 3 -v 1 a aa
gg xx xx xx
$ join -j1 2 -j2 3 -v 2 a aa
jj xx xx xx

Hein.

Muthukumar_5 · ‎10-12-2004

We can do this with sort / paste / awk as,

sort to sort based on field 2 on file1 and filed 3 on file2

# sort -k 2,2 file1 > file1.sort
# sort -k 3,3 file2 > file2.sort

# paste -d " " file1.sort file2.sort | awk '{ print $2" "$1" "$3" "$4" "$5" "$6" "$8 }'

aa 11 22 33 44 55 66
bb 77 88 99 xx xx xx
cc xx xx xx xx xx xx
dd xx xx xx xx xx xx
gg xx xx xx xx xx xx

HTH.

Easy to suggest when don't know about the problem!

harry d brown jr · ‎10-12-2004

I like the sort and join commands myself, but maybe you need to learn some perl:

#!/usr/bin/perl
#
$collected = "";
$errs_collected = "";
#
open(firstFILEPTR, "< ./testa") ||
die "can not open ./testa for input? check your current directory permissions and space availability!";
while () {
chomp;
($part1,$part2,$part3,$part4) = split(/ /, $_);
$collected{$part2}{"data"} = sprintf("%s %s %s %s",$part2,$part1,$part3,$part4);
$collected{$part2}{"cnt"} = 4;
}
#
close (firstFILEPTR);
#
open(secondFILEPTR, "< ./testb") ||
die "can not open ./testb for input? check your current directory permissions and space availability!";
while () {
chomp;
($part1,$part2,$part3,$part4) = split(/ /, $_);
if ($collected{$part3} eq "") {
$errs_collected{$part3} = sprintf("%s %s %s %s",$part3,$part1,$part2,$part4);
} else {
$collected{$part3}{"data"} = sprintf("%s %s %s %s %s",$collected{$part3}{"data"},$part1,$part2,$part4);
$collected{$part3}{"cnt"} += 3;
}
}
#
close (secondFILEPTR);
#
foreach $partname ( keys %collected ) {
if ($collected{$partname}{"cnt"} eq 7) {
printf("%s\n",$collected{$partname}{"data"});
}
}
printf("Errors to follow:\n");
foreach $partname ( keys %collected ) {
if ($collected{$partname}{"cnt"} ne 7) {
printf("%s\n",$collected{$partname}{"data"});
}
}
foreach $partname ( keys %errs_collected ) {
printf("%s\n",$errs_collected{$partname});
}

live free or die
harry

Live Free or Die

Hein van den Heuvel · ‎10-12-2004

Ooops,
Forgot to include my perl alternative.
This does not require pre-sorting and as written outputs be order of the second file argument.
As written it allows only 1 match per pair.
A perl script gives you full control over tweaks like that.
caveat: It sucks in the whole first file, so may become too slow for large files.

See below.
Cheers,
Hein.

$ perl x.p x xx
cc xx xx xx xx xx xx
bb 77 88 99 xx xx xx
dd xx xx xx xx xx xx
aa 11 22 33 44 55 66
excess from file 1
gg xx xx xx
excess from file 2
jj xx xx xx
$ cat x.p
$f = shift @ARGV;
open (F,"<$f") or die "Failed to open $f";
while () {
chop;
@w= split;
$x1{$w[1]} = join(' ',$w[0], $w[2], $w[3]);
}
close (F);
$f = shift @ARGV;
open (F,"<$f") or die "Failed to open $f";
while () {
chop;
@w= split;
$j= $w[2];
$f2 = join(' ',$w[0], $w[1], $w[3]);
if ($f1 = $x1{$j}) {
print join(' ',$j, $f1, $f2)."\n";
undef $x1{$j};
} else {
$x2{$j} = $f2;
}
}
print "excess from file 1\n";

foreach $k (keys %x1) {
print "$k $x1{$k}\n" if $x1{$k};
}
print "excess from file 2\n";
foreach $k (keys %x2) {
print "$k $x2{$k}\n";
}

Hein van den Heuvel · ‎10-12-2004

Muthukumar, Sorry but I believe your solution will only work in the example given where the 'exception' record happen to sort after the matching records. If one adds a record with key 'bc' to one of the input files the paste command will get 'out of sync'.

Cheers,
Hein.

Kenneth_18 · ‎10-15-2004

Sorry guys but I've been to busy to responed immediately. I will try out the suggestions and let you know the results.

Categories

Company

Local Language

Forums

Discussions

Forums

Discussions

Forums

Discussions

Forums

Discussions

Forums

Discussions

Discussions

Forums

Forums

Discussions

Forums

Discussions

Forums

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Community

Resources

Other HPE Sites

Discussions

Forums

Blogs

Help needed in scripting

Help needed in scripting

Re: Help needed in scripting

Re: Help needed in scripting

Re: Help needed in scripting

Re: Help needed in scripting

Re: Help needed in scripting

Re: Help needed in scripting

Re: Help needed in scripting