Operating System - HP-UX

script - comparing two files

 
SOLVED
Ravinder Singh Gill
Regular Advisor

Guys,

I have two files, each with a long list of entries. I need to compare all the entries in file1 with those in file2 and find which of them do not exist in file2. However, the entries are not necessarily in the same order in both files, i.e. an entry could be at the top of file1 and in the middle of file2. I am trying to write a script to do this. Any helpful tips?
21 REPLIES
john korterman
Honored Contributor

Re: script - comparing two files

Hi,

you could use
comm
for that purpose, but it assumes that both files have been sorted first.

regards,
John K.
it would be nice if you always got a second chance
Muthukumar_5
Honored Contributor

Re: script - comparing two files

Answer is,

# cat > file1
hi
bye
# cat > file2
ok
sure
bye
noe
hi
# grep -vf file1 file2
ok
sure
noe
#

Hope this is the one you wanted?
Easy to suggest when don't know about the problem!
RAC_1
Honored Contributor

Re: script - comparing two files

First sort files.

sort < file1 > file1-1
sort < file2 > file2-2

Now
comm -13 file1-1 file2-2

You may also look at bdiff, diff, cmp commands.
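
As a rough sketch of how the comm columns map onto this problem (the scratch directory and the sample entries are made up for illustration): comm prints lines found only in the first file, lines found only in the second file, and lines common to both, as three columns. The option -23 suppresses columns 2 and 3, leaving the entries of file1 that are missing from file2, while -13 leaves the entries found only in file2.

```shell
# Scratch directory and sample entries are illustrative assumptions.
workdir=${TMPDIR:-/tmp}/comm_demo.$$
mkdir -p "$workdir" && cd "$workdir" || exit 1

printf '%s\n' hi bye extra > file1
printf '%s\n' ok sure bye noe hi > file2

# comm needs both inputs sorted the same way.
sort file1 > file1.sorted
sort file2 > file2.sorted

# Column 1 only: entries in file1 that are not in file2.
only_in_file1=$(comm -23 file1.sorted file2.sorted)
# Column 2 only: entries in file2 that are not in file1.
only_in_file2=$(comm -13 file1.sorted file2.sorted)

echo "only in file1: $only_in_file1"
# Left unquoted on purpose so the entries print on one line.
echo "only in file2:" $only_in_file2
```

For the question as asked (entries of file1 that do not exist in file2), the -23 form is the one to use.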
There is no substitute to HARDWORK
Ravinder Singh Gill
Regular Advisor

Re: script - comparing two files

I was thinking of keeping it simpler i.e. something as below but will it not work?

While read Variable

do
more file2 | grep Variable

#put in entries to say if it does not exist
#then output that variable to file3.

done < file1


Would something like this work? If so how do I instruct it to redirect the variable to another file if it did not exist in file2??

MarkSyder
Honored Contributor

Re: script - comparing two files

Your code looks ok but a little too complicated. You have more file2|grep Variable, whereas:

grep Variable file2

would achieve exactly the same result.

You could either send the output to file3 within the loop or on the command line. Command line version:

scriptname > outputfile

Mark Syder (like the drink but spelt different)
The triumph of evil requires only that good men do nothing
Ravinder Singh Gill
Regular Advisor

Re: script - comparing two files

Thanks for that, but I do not want to send the output to a file. I wish to send the variable to a third file if it exists in file1 and does not exist in file2.

Any ideas?
MarkSyder
Honored Contributor

Re: script - comparing two files

Try this:

if grep -q "$Variable" file2
then
: # found in file2, nothing to do
else
echo "$Variable" >> file3
fi

The >> means append - if you use > every time you find a variable the file will be overwritten. It's probably a good idea to null the file before the loop starts.
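
A minimal sketch of that advice, with made-up file contents: the output file is emptied once before the loop, and every write inside the loop appends.

```shell
# Scratch directory and entries are illustrative assumptions.
workdir=${TMPDIR:-/tmp}/append_demo.$$
mkdir -p "$workdir" && cd "$workdir" || exit 1

echo "stale entry from an earlier run" > file3

# Empty (or create) the output file once, before the loop starts.
: > file3

for entry in first second third
do
    # >> appends; a single > here would leave only "third" in file3.
    echo "$entry" >> file3
done

cat file3
```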

Mark
The triumph of evil requires only that good men do nothing
Orhan Biyiklioglu
Respected Contributor

Re: script - comparing two files

grep -vf file2 file1 > file2

hth
Muthukumar_5
Honored Contributor

Re: script - comparing two files

Simply change my reply:


# cat file1
bye
hi
not existing
# cat file2
ok
sure
bye
noe
hi
# grep -vf file2 file1
not existing
# grep -vf file2 file1 > file3

Hope this one is right.
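
One caveat, sketched here with made-up entries: grep -f treats each line of file2 as a regular expression and matches it anywhere in a line, so the entry hi in file2 also hides the entry this in file1. Assuming your grep accepts the POSIX -F (fixed strings) and -x (whole-line match) options, adding them restricts the comparison to exact, whole-line matches.

```shell
# Scratch directory and sample entries are illustrative assumptions.
workdir=${TMPDIR:-/tmp}/grep_demo.$$
mkdir -p "$workdir" && cd "$workdir" || exit 1

printf '%s\n' bye hi this 'not existing' > file1
printf '%s\n' ok sure bye noe hi > file2

# Pattern lines match as substrings: "hi" in file2 also removes "this".
loose=$(grep -vf file2 file1)

# -F: patterns are fixed strings, -x: the whole line must match.
exact=$(grep -F -x -v -f file2 file1)

echo "substring matching:"; echo "$loose"
echo "whole-line matching:"; echo "$exact"
```

With these sample files the first form reports only "not existing", while the second also reports "this".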
Easy to suggest when don't know about the problem!
Cem Tugrul
Esteemed Contributor

Re: script - comparing two files

grep -vf file2 file1 > file2
Good Luck,
Our greatest duty in this life is to help others. And please, if you can't
Muthukumar_5
Honored Contributor

Re: script - comparing two files

grep -vf file2 file1 > file2

This will not work, and it is dangerous: the shell empties file2 before grep reads it, so the original contents of file2 are lost.

# cat file1
bye
hi
not existing
# cat file2
ok
sure
bye
noe
hi
#
# grep -vf file2 file1 > file2
#
# cat file2
bye
hi
not existing

Try to redirect to another file called file3.
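
A small sketch of the difference, run on scratch copies with made-up contents: the redirection opens file2 for writing before grep starts, so grep is left with an empty pattern list, whereas writing to a separate file leaves the pattern file intact.

```shell
# Scratch directory and sample entries are illustrative assumptions.
workdir=${TMPDIR:-/tmp}/clobber_demo.$$
mkdir -p "$workdir" && cd "$workdir" || exit 1

printf '%s\n' bye hi 'not existing' > file1
printf '%s\n' ok sure bye noe hi > file2
cp file2 file2.orig          # keep a copy so the damage is visible

# The redirection truncates file2 before grep ever reads it.
grep -vf file2 file1 > file2

# Safe variant: write to a separate file and leave the pattern file alone.
grep -vf file2.orig file1 > file3

echo "file2 now contains:"; cat file2
echo "file3 contains:"; cat file3
```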

hth.
Easy to suggest when don't know about the problem!
Orhan Biyiklioglu
Respected Contributor

Re: script - comparing two files

Muthukumar is right.

Sorry for the typo.

It should be

grep -vf file2 file1 > file3

hth
john korterman
Honored Contributor

Re: script - comparing two files

Hi again,

if I understand it correctly, you want to compare lines (names) in file1 to lines in file2 and then write to file3 the names that are not common to both files.
You can do this by the following script, compare.sh:

#!/usr/bin/sh
# Appends to $3 every line of $2 that grep does not find in $1.
while read name
do
grep -q "$name" "$1"
if [ "$?" != 0 ]
then
echo "$name" >> "$3"
fi
done < "$2"

however, you have to run it twice, e.g. like this:
# compare.sh file2 file1 file3
and then
# compare.sh file1 file2 file3

The first run writes to file3 the names that exist in file1 but not in file2, and the second run appends the vice-versa occurrences.

Try running it once first, like this:
# compare.sh file2 file1 file3
and inspect file3 in order to check the result; remember that each run appends to file3.

regards,
John K.
it would be nice if you always got a second chance
Hein van den Heuvel
Honored Contributor

Re: script - comparing two files

This and many similar questions have been asked several times before in this forum. Google will readily find them with: +"compare files" +site:itrc.hp.com


For example:

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=414769

That question was concerned with the detailed layout of the lines; you may also have optional input in that space.
Taking the solution there in generic form, a solution for your problem could be...

perl compare.pl file1 file2

---- compare.pl -------------

$file1 = shift @ARGV;
$file2 = shift @ARGV or die;
open (FILE, "<$file1") or die "Cannot open $file1: $!";
while (<FILE>) {
chomp;
$x{$_}=1; # 1 = seen in file 1 only (so far)
}
close (FILE);
open (FILE, "<$file2") or die "Cannot open $file2: $!";
while (<FILE>) {
chomp;
if (defined $x{$_}) {
$x{$_}=2; # 2 = present in both files
} else {
$x{$_} =3; # report only once per matching line
print "Not in file 1: $_\n";
}
}
close (FILE);
foreach (keys %x) {
print "Not in file 2: $_\n" if ($x{$_} == 1) ;
}

-------------------
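
For the narrower case in the original question (entries of file1 that are missing from file2), the same look-up-table idea can be sketched with awk. This is an assumed alternative, not part of the Perl script above; the sample entries are made up, and it relies on file2 being non-empty, because the NR == FNR test is how the awk program recognises that it is still reading the first file.

```shell
# Scratch directory and sample entries are illustrative assumptions.
workdir=${TMPDIR:-/tmp}/awk_demo.$$
mkdir -p "$workdir" && cd "$workdir" || exit 1

printf '%s\n' bye hi 'not existing' > file1
printf '%s\n' ok sure bye noe hi > file2

# First pass (file2): remember every line. Second pass (file1): print
# the lines that were never seen. Whole lines are compared exactly.
awk 'NR == FNR { seen[$0] = 1; next }
     !($0 in seen)' file2 file1 > file3

cat file3
```

Each file is read only once, so this avoids running grep for every entry, and neither file needs to be sorted.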

hth,
Hein.
Raj D.
Honored Contributor

Re: script - comparing two files

Hi Ravinder ,

You can also use the diff command.

Cheers,
Raj.
" If u think u can , If u think u cannot , - You are always Right . "
Rory R Hammond
Trusted Contributor

Re: script - comparing two files

cat file1
goodbye
nostuff
hello
morestuff
stuff

cat file2
just stuff
stuff
nostuff
hello

########
sort file1 -o file3 # write to new files so the originals stay untouched
sort file2 -o file4

comm -23 file3 file4
goodbye
morestuff

or
comm -23 file3 file4 > lines_not_in_file2

Rory
There are a 100 ways to do things and 97 of them are right
Ravinder Singh Gill
Regular Advisor

Re: script - comparing two files

Can anyone tell me what is wrong with the following script?

While read Variable
do
grep Variable file2

if
[ $? != 0 ]
then
Variable >> file3
fi
done < file1


Remember I want to output the variable to file3 only if it exists in file1 but NOT in file2.
Muthukumar_5
Honored Contributor
Solution

Re: script - comparing two files

Use this:

rm -f file3
while read Variable
do
grep -q $Variable file2

if [ $? != 0 ]
then
echo $Variable >> file3
fi
done < file1

hth.
Easy to suggest when don't know about the problem!
Muthukumar_5
Honored Contributor

Re: script - comparing two files

A small correction.

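When you have a line like

word1 word2 ... wordn

then grep -q $Variable file2 will cause a problem: the shell splits the unquoted value at the spaces, so grep uses only word1 as the pattern and treats the remaining words as file names. To avoid that, quote the variable: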

rm -f file3
while read Variable
do
# New change for above correction.
grep -q "$Variable" file2

if [ $? != 0 ]
then
echo "$Variable" >> file3
fi
done < file1
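
A slightly more defensive variant of the same loop, offered as a sketch rather than as the accepted answer: it assumes a grep that accepts the POSIX -F, -x and -e options, so each entry is compared as a literal, whole line instead of a regular expression matched anywhere in a line. The sample entries are made up.

```shell
# Scratch directory and sample entries are illustrative assumptions.
workdir=${TMPDIR:-/tmp}/loop_demo.$$
mkdir -p "$workdir" && cd "$workdir" || exit 1

printf '%s\n' bye 'word1 word2' hi 'a.c' > file1
printf '%s\n' ok 'word1 word2 word3' bye abc this > file2

: > file3                       # start with an empty result file
while IFS= read -r Variable     # keep leading blanks and backslashes intact
do
    # -F literal string, -x whole line, -q quiet, -e marks the pattern
    if grep -F -x -q -e "$Variable" file2
    then
        :                       # entry exists in file2: nothing to do
    else
        printf '%s\n' "$Variable" >> file3
    fi
done < file1

cat file3
```

With these sample files, file3 receives "word1 word2", "hi" and "a.c". A plain grep "$Variable" would treat all three as present, because each matches part of a longer line in file2 (and the dot in a.c matches any character).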

hth.
Easy to suggest when don't know about the problem!
john korterman
Honored Contributor

Re: script - comparing two files

Hi again,

"Can anyone tell me what is wrong with the following script?

While read Variable
do
grep Variable file2
if [ $? != 0 ]
then
Variable >> file3
fi
done < file1
"

The read statement assigns a value to Variable. In order to use the content you have to expand the variable using the $ character, i.e. $Variable, e.g.:
grep $Variable file2
And if you want to write to a file, you also have to specify that, e.g.:
echo $Variable >>file3
The last thing is a bit more complicated: grep writes its output to std. out., which defaults to the terminal. This means that this statement:
grep gill file1
will write
gill
to std. out/your terminal, assuming that gill is found in file1.

Although the return code of the above grep statement is 0, it still writes
gill
to std. out. That output goes to the terminal, not to file3: the statement
echo $Variable >> file3
appends only what echo itself writes.
To keep the terminal free of grep's output, prevent the grep statement from writing to std. out, e.g. by adding the -q option as suggested by Muthukumar. An alternative is to redirect std. out, e.g. by the 1>/dev/null
If grep is silent then
echo $Variable >>file3
will in this case control what is written to file3.
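
To see the two channels separately, here is a small sketch with made-up file contents: grep reports whether it found a match through its return code, and -q only silences what it would otherwise have printed.

```shell
# Scratch directory, file contents and search strings are illustrative assumptions.
workdir=${TMPDIR:-/tmp}/status_demo.$$
mkdir -p "$workdir" && cd "$workdir" || exit 1

printf '%s\n' gill singh > file1

# The if statement tests grep's return code directly (0 means "found").
if grep -q gill file1
then found_gill=yes
else found_gill=no
fi

if grep -q nosuchname file1
then found_other=yes
else found_other=no
fi

# With -q nothing is written to std. out, so nothing is captured here.
quiet_output=$(grep -q gill file1)

echo "gill found: $found_gill, nosuchname found: $found_other, output: '$quiet_output'"
```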

regards,
John K.
it would be nice if you always got a second chance
Ravinder Singh Gill
Regular Advisor

Re: script - comparing two files

Muthukumar's answer is what I needed. Thanks guys for your help.