Re: Need help

suchitra · ‎03-21-2006

I have a problem comparing the columns of 2 files and writing a status (added, modified or deleted) to the output file along with the records from these 2 files.
For example:

consider my 1st file has say
file1
==========
p1|y|500
p2|n|500

file2
======
p1| n| 500
p3 | y|501

now my output file should be like this

output file
=========
p1 | n| 500 |c b'cas it has been changed from y to n in the 1st file
p2| n |500 | d b'cas p2 is deleted in the 2nd file
p3 | y| 501 | a b'cas p3 is added in the 2nd file

Could you please help me out? I want either a shell script or an awk command.

Muthukumar_5 · ‎03-21-2006

Use this:

awk -F'|' '{
  var = $0; var1 = $1; var2 = $2; var3 = $3   # save the file1 record
  getline < "file2"                           # $0 now holds the file2 line at the same position
  split($0, a, "|")
  if (a[1] == var1) {
    if (a[2] != var2) { print var"## c bcas "var2" is changed to "a[2] }
    if (a[3] != var3) { print var"## c bcas "var3" is changed in 2nd file" }
  } else {
    print var"## d bcas "var1" is deleted in 2nd file"
    print $0" ## a bcas "a[1]" is added in 2nd file"
  }
}' file1

--
Muthu

Easy to suggest when don't know about the problem!

Steve Steel · ‎03-21-2006

Hi

Look at the comm command

comm - select or reject lines common to two sorted files
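A minimal sketch of how comm could be applied here, assuming the key is the first '|'-separated field and that mktemp is available. The function name key_diff is made up for illustration. It only reports keys that were added or deleted, not changed records.

```sh
#!/bin/sh
# key_diff FILE1 FILE2 (hypothetical helper): list keys found in only one file.
# comm needs sorted input, so the keys are sorted first. The body runs in a
# subshell so that LC_ALL=C (same ordering for sort and comm) does not leak out.
key_diff() (
  LC_ALL=C; export LC_ALL
  k1=$(mktemp) k2=$(mktemp)
  cut -d'|' -f1 "$1" | sort -u > "$k1"
  cut -d'|' -f1 "$2" | sort -u > "$k2"
  comm -23 "$k1" "$k2" | sed 's/$/ d/'   # keys only in FILE1: deleted
  comm -13 "$k1" "$k2" | sed 's/$/ a/'   # keys only in FILE2: added
  rm -f "$k1" "$k2"
)
```

Called as key_diff file1 file2, it prints one key per line followed by d or a.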

And www.shelldorado.com

Steve Steel

If you want truly to understand something, try to change it. (Kurt Lewin)

Peter Godron · ‎03-21-2006

Suchitra,
my initial solution:

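#!/usr/bin/sh
# join and comm both need input sorted on the key (field 1)
echo "Changed"
# keys present in both files whose 2nd or 3rd field differs; print the file2 values
join -t '|' -o 1.1,1.2,1.3,2.2,2.3 file1 file2 |
awk -F'|' '$2 != $4 || $3 != $5 { print $1 "|" $4 "|" $5 "|c" }'
echo "Deleted"
cut -f1 -d '|' file1 > file1.bck
cut -f1 -d '|' file2 > file2.bck
# keys found only in file1, anchored so that p1 cannot also match p10
comm -23 file1.bck file2.bck | sed 's/.*/^&|/' > file1.res
grep -f file1.res file1 | sed 's/$/|d/'
rm file1.res
echo "Added"
# keys found only in file2
comm -13 file1.bck file2.bck | sed 's/.*/^&|/' > file2.res
grep -f file2.res file2 | sed 's/$/|a/'
rm file2.res
rm file1.bck
rm file2.bck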

Peter Godron · ‎03-21-2006

Muthukumar,
very smooth script!
The first "print var" could be replaced by:
print a[1]"|"a[2]"|"a[3]
to pick up the file2 values, rather than file1.

Senthil Kumar .A_1 · ‎03-21-2006

Hi,

I have attached a script that uses the "comm" command for your situation.

Regards,
Senthil Kumar .A

Let your effort be such, the very words to define it, by a layman - would sound like a "POETRY" ;)

Muthukumar_5 · ‎03-21-2006

Peter,

To avoid printing the whole line as a[1]"|"a[2]"|"a[3], I stored it in a separate variable. It helps to simplify the script ;)

--
Muthu

Easy to suggest when don't know about the problem!

Peter Godron · ‎03-21-2006

Suchitra,
something else to keep in mind is that any script using comm would only work on sorted files.

Also, are there any duplicate keys like:
p1|y|500
p1|n|300
.
.
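A quick way to check for that before picking an approach, as a sketch. It assumes the key is the first '|'-separated field, and the function name dup_keys is made up for illustration.

```sh
#!/bin/sh
# dup_keys FILE (hypothetical helper): print each key (field 1) that occurs
# more than once in FILE. uniq -d only sees adjacent repeats, hence the sort.
dup_keys() {
  cut -d'|' -f1 "$1" | LC_ALL=C sort | uniq -d
}
```

If dup_keys file1 prints nothing, every key in file1 is unique and a key-based comparison is safe.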

Muthukumar_5 · ‎03-21-2006

Using perl:

#!/usr/bin/perl

open FD1,"file1" or die "Open Error: $!";
open FD2,"file2" or die "Open Error: $!";

@arr1=<FD1>;
@arr2=<FD2>;

for ($i=0;$i<@arr1;$i++)
{
# strip the newline first so the last field compares and prints cleanly
chomp($arr1[$i]);
chomp($arr2[$i]);

@pat1=split (/\|/,$arr1[$i]);
@pat2=split (/\|/,$arr2[$i]);

if ( $pat1[0] eq $pat2[0] )
{
if ( $pat1[1] ne $pat2[1] )
{

print "$arr1[$i] | c b'cas it has been changed from $pat1[1] to $pat2[1] in field 2 in 2nd file\n";
}
if ( $pat1[2] ne $pat2[2] )
{
print "$arr1[$i] | c b'cas it has been changed from $pat1[2] to $pat2[2] in field 3 in 2nd file\n";
}

}
else
{
print "$arr1[$i] | d b'cas $pat1[0] is deleted in 2nd file\n";
print "$arr2[$i] | a b'cas $pat1[0] is added in 2nd file\n";
}

}

# END

--
Muthu

Easy to suggest when don't know about the problem!

Peter Godron · ‎03-24-2006

Suchitra,
do these answers solve your problem?
Can you please have a look at:
http://forums1.itrc.hp.com/service/forums/helptips.do?#28
and then update the record.

Hein van den Heuvel · ‎03-24-2006

Hmmm, Muthu... I fail to see how you can solve the problem described with the simple array comparison you suggest. It seems clear to me that any solution needs to focus on the first, 'key' field.
How else can one decide whether a new record appeared in the same place where an old record was deleted?

Anyway...

For a large file, the data probably needs to be pre-sorted, with the two files read simultaneously and their key values compared to keep them in sync.
I presented one example of this in:
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=999120
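This is not the code from that thread, only a rough sketch of the in-sync idea in awk. It assumes both files are already sorted on field 1 in the C locale and that keys are unique. The names merge_compare and read_new are made up for illustration.

```sh
#!/bin/sh
# merge_compare OLD NEW (hypothetical helper): walk two key-sorted files in
# step, holding only one record of each in memory at a time.
merge_compare() {
  LC_ALL=C awk -F'|' -v new="$2" '
    # read the next NEW record into line and its fields into n[]
    function read_new() {
      has_new = ((getline line < new) > 0)
      if (has_new) split(line, n, "|")
    }
    BEGIN { read_new() }
    {
      # NEW keys that sort before the current OLD key were added
      while (has_new && (n[1] "") < ($1 "")) { print line "|a"; read_new() }
      if (has_new && (n[1] "") == ($1 "")) {
        if ((line "") != ($0 "")) print line "|c"   # same key, other fields differ
        read_new()
      } else {
        print $0 "|d"                               # OLD key missing from NEW
      }
    }
    END { while (has_new) { print line "|a"; read_new() } }   # leftover NEW records
  ' "$1"
}
```

The appended empty strings force string comparison, so purely numeric keys are ordered the same way sort orders them.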

For small files you can just 'slurp' them into a perl associative array, and report based on keys.

Here is an example that only compares the first non-key field. It is easily adapted to compare other fields, or just everything except the key.

------ compare.pl ---------
$file = shift;
open (FILE, "<$file") or die "Failed to open first file: $file.";
while (<FILE>) {
chomp;
($key,$flag,$num) = split (/\|/, $_);
# print "$. $key,$flag,$num\n";   # debug trace, not part of the usage output below
$f1_flag{$key} = $flag;
$f1_num{$key} = $num;
}

$file = shift;
open (FILE, "<$file") or die "Failed to open second file: $file.";
while (<FILE>) {
chomp;
($key,$flag,$num) = split (/\|/, $_);
# print "$. $key,$flag,$num\n";   # debug trace, not part of the usage output below
$f2_flag{$key} = $flag;
$f2_num{$key} = $num;
}

for $key (sort keys %f2_flag) {
$change = "=";
if (defined $f1_flag{$key}) {
$change = "c" if ($f1_flag{$key} ne $f2_flag{$key});
delete $f1_flag{$key};
} else {
$change = "a";
}
print "$key|$f2_flag{$key}|$f2_num{$key}| $change\n"
}

for $key (sort keys %f1_flag) {
print "$key|$f1_flag{$key}|$f1_num{$key}| d\n"
}

---- usage example ----

C:\Temp>type file1.tmp
p1|y|500
p2|n|500
p5|y|500
C:\Temp>type file2.tmp
p1|n|500
p3|y|501
p5|y|500
C:\Temp>perl tmp.pl file1.tmp file2.tmp
p1|n|500 | c
p3|y|501 | a
p5|y|500 | =
p2|n|500 | d

If each input line is treated as a 'key' plus everything else, the script simplifies somewhat:

------- compare_2.pl ----------

$file = shift;
open (FILE, "<$file") or die "Failed to open first file: $file.";
while (<FILE>) {
chomp;
$f1{$`} = $' if /\|/;
}

$file = shift;
open (FILE, "<$file") or die "Failed to open second file: $file.";
while (<FILE>) {
chomp;
$f2{$`} = $' if /\|/;
}

for $key (sort keys %f2) {
$change = "=";
if (defined $f1{$key}) {
$change = "c" if ($f1{$key} ne $f2{$key});
delete $f1{$key};
} else {
$change = "a";
}
print "$key|$f2{$key}| $change\n"
}

for $key (sort keys %f1) {
print "$key|$f1{$key}| d\n"
}
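
Since the original question asked for shell or awk, here is a rough awk sketch of the same key-plus-everything-else idea. It assumes small files with unique keys and a non-empty first file. The function name keyed_compare is made up for illustration.

```sh
#!/bin/sh
# keyed_compare OLD NEW (hypothetical helper): record order does not matter,
# because every OLD record is held in an awk array indexed by its key.
keyed_compare() {
  awk -F'|' '
    NR == FNR { old[$1] = $0; next }            # first file: remember each record by key
    {
      if (!($1 in old))       print $0 "|a"     # key only in NEW
      else if (old[$1] != $0) print $0 "|c"     # same key, different fields
      delete old[$1]                            # whatever remains at the end was deleted
    }
    END { for (k in old) print old[k] "|d" }
  ' "$1" "$2" | LC_ALL=C sort
}
```

Unchanged records are simply not printed. The final sort is only there to give a stable order, since awk's for-in loop visits keys in no particular order.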

Hein van den Heuvel · ‎03-24-2006

Mind you, Muthu's script may be great IF the data follows strict patterns. But I do not think it works as requested. For example with a single deleted line:

C:\Temp>type file1.tmp
p1|y|500
p2|n|500
p3|y|500
p4|y|500
p5|y|500
C:\Temp>type file2.tmp
p1|n|500
p3|y|500
p4|y|500
p5|y|500
C:\Temp>perl test.pl
p2|n|500 | d b'cas p2 is deleted in 2nd file
p3|y|500 | a b'cas p2 is added in 2nd file
p3|y|500 | d b'cas p3 is deleted in 2nd file
p4|y|500 | a b'cas p3 is added in 2nd file
p4|y|500 | d b'cas p4 is deleted in 2nd file
p5|y|500 | a b'cas p4 is added in 2nd file
p5|y|500 | d b'cas p5 is deleted in 2nd file
| a b'cas p5 is added in 2nd file

Using the script I suggest:

p1|n|500 | c
p3|y|500 | =
p4|y|500 | =
p5|y|500 | =
p2|n|500 | d

Of course, my method will fail if the record order is what matters, rather than the column 1 value.

If the "=" lines are not desirable, then change the code to make the print conditional:
print "$key|$f2{$key}| $change\n" unless $change eq "=";

or rearrange the core loop a little. For example:

for $key (sort keys %f2) {
$change = "c";
if ($x = $f1{$key}) {
delete $f1{$key};
next if ($x eq $f2{$key});
} else {
$change = "a";
}
print "$key|$f2{$key}| $change\n"
}

Result for last input example:

C:\Temp>perl compare.pl file1.tmp file2.tmp
p1|n|500 | c
p2|n|500 | d

Ok, lunch break over, back to real work...
:-)

Hein.

Sandman! · ‎03-24-2006

Hi Suchitra,

I've pasted a shell script below that satisfies the requirements for parsing and filtering input files according to your criteria:

========================myparser.sh========================
#!/bin/sh

set -a

InFile1=f1
InFile2=f2
#
SortFile1=f1s
SortFile2=f2s
#
OutFile=outfile

# Zero out the outputfile
# before input processing
cat /dev/null > $OutFile

# Sort both file 1 and 2 on the first field
# using the vertical bar as field separator
sort -t"|" -k1,1 $InFile1 > $SortFile1
sort -t"|" -k1,1 $InFile2 > $SortFile2

# Filter out lines common to both files
# and print out those that have changed
join -t"|" $SortFile1 $SortFile2 | awk -F"|" '
BEGIN {OFS="|"}
{if($2!=$4)print $1,$4,$NF,"c b'\''cas "$1" changed from "$2" to "$4" in 2nd file"}' >>$OutFile

# Print out all the unmatched lines in
# sorted file 1 and flag them as deleted
join -t"|" -v1 $SortFile1 $SortFile2 | awk -F"|" '
BEGIN{OFS="|"} {print $0,"d b'\''cas "$1" is deleted in 2nd file"}' >>$OutFile

# Print out all the unmatched lines in
# sorted file 2 and flag them as added
join -t"|" -v2 $SortFile1 $SortFile2 | awk -F"|" '
BEGIN{OFS="|"} {print $0,"a b'\''cas "$1" is added in 2nd file"}' >>$OutFile
===========================================================

Copy the code into a file name of your choice, customize the environment (InFile, SortFile etc.) to your system, make it executable and run it at the command line.

hope it helps!

Sandman! · ‎03-24-2006

Matter of fact the code might be easier to understand (as well as copy 'n paste) if attached instead of pasted...so click on the attachment for the shell script.

cheers!

suchitra · ‎03-26-2006

Thanks a lot guys ... I got the solution I wanted. It was a great help from all of you guys.
Thanks a lot.
