command or script for equal lines in files

 
Franky Leeuwerck_1
Regular Advisor

Hello,

I need an HP-UX command or simple script that outputs the lines that occur in both of two text files (like the opposite of the diff command).

Example :
File A contains
Abc
Acc
Bcc

File B contains
Acc
Acdc
Bcc
Bde

The desired output would be :
Acc
Bcc


Thanks for your help.

Franky
15 REPLIES
Mark Grant
Honored Contributor

Re: command or script for equal lines in files

There probably is a command to do this but something like

# read takes a variable name, not its expansion; quote the pattern for grep
while read line
do
grep "$line" file2
done < file1

A bit slow for big files, though, so maybe awk or perl is what you need.
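
The awk route mentioned above could look roughly like the sketch below. It is only an illustration, not something posted in this thread: the function name awk_common and the /tmp file names are made up for the example. It reads the first file into an array and then prints every line of the second file that was seen in the first, so neither file needs to be sorted. One caveat: if the first file is empty, the NR == FNR test is also true for the second file and nothing is printed, which happens to be the right answer here.

```shell
# Hypothetical helper: print lines of file $2 that also occur in file $1.
awk_common() {
    # While reading the first file (NR == FNR) remember each line;
    # for the second file print only lines that were remembered.
    awk 'NR == FNR { seen[$0] = 1; next } $0 in seen' "$1" "$2"
}

# Sample data taken from the question.
printf 'Abc\nAcc\nBcc\n' > /tmp/awk_a.$$
printf 'Acc\nAcdc\nBcc\nBde\n' > /tmp/awk_b.$$

awk_common /tmp/awk_a.$$ /tmp/awk_b.$$
rm -f /tmp/awk_a.$$ /tmp/awk_b.$$
```

Each file is read once, so this avoids starting one grep per line of file1.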
Never precede any demonstration with anything more predictive than "watch this"
Pete Randall
Outstanding Contributor

Re: command or script for equal lines in files

Franky,

Try "man 1 comm".


Pete

Pete Randall
Outstanding Contributor

Re: command or script for equal lines in files

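Specifically, I think you want

"comm -12 FileA FileB"

(-1 and -2 suppress the lines unique to each file, leaving only the common ones; -3 would instead suppress the common lines.)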


Pete

H.Merijn Brand (procura
Honored Contributor
Solution

Re: command or script for equal lines in files

easy:

comm -12 fileA fileB

but files have to be sorted

if not

perl -e'$f=pop;%l=map{$_=>1}<>;@ARGV=($f);for(<>){exists$l{$_}and print}' fileA fileB
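
If the files are not sorted and you would still rather use comm, one possible approach is to sort copies first. The sketch below is only an illustration: the function name common_lines and the /tmp file names are assumptions, not anything from the system in question.

```shell
# Hypothetical helper: print the lines common to two possibly unsorted files.
common_lines() {
    # Sort into temporary copies so the originals stay untouched;
    # $$ (the shell's PID) keeps the temporary names unique per shell.
    sort "$1" > "/tmp/common_a.$$"
    sort "$2" > "/tmp/common_b.$$"
    # -1 and -2 suppress the lines unique to each file, leaving column 3.
    comm -12 "/tmp/common_a.$$" "/tmp/common_b.$$"
    rm -f "/tmp/common_a.$$" "/tmp/common_b.$$"
}

# Sample data from the question, deliberately out of order.
printf 'Bcc\nAbc\nAcc\n' > /tmp/fileA.$$
printf 'Bde\nAcc\nBcc\nAcdc\n' > /tmp/fileB.$$

common_lines /tmp/fileA.$$ /tmp/fileB.$$
rm -f /tmp/fileA.$$ /tmp/fileB.$$
```

Note that the output comes back in sorted order, not in the order of either input file.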

Enjoy, Have FUN! H.Merijn
Franky Leeuwerck_1
Regular Advisor

Re: command or script for equal lines in files

Wow,

Thanks for these replies coming so fast.

I only knew of the cmp and diff command, but indeed comm -12 does the job.

Thanks,
Franky
John Palmer
Honored Contributor

Re: command or script for equal lines in files

Provided your two files are sorted (they are in your example) you can use the comm command.

comm -12 A B

gives you the required output.

Regards,
John
Elmar P. Kolkman
Honored Contributor

Re: command or script for equal lines in files

Or:
sed -e 's|^|^|' -e 's|$|$|' fileA >fileA.tmp
grep -f fileA.tmp fileB
Every problem has at least one solution. Only some solutions are harder to find.
H.Merijn Brand (procura
Honored Contributor

Re: command or script for equal lines in files

elmar, that is fun! what if file a has a line

.*

? :) :)

you could do that with GNU grep

# gnu-fgrep -x -f fileA fileB

Enjoy, Have FUN! H.Merijn
Elmar P. Kolkman
Honored Contributor

Re: command or script for equal lines in files

If grep wildcards like that exist, you could extend the sed line to escape them ('s|\([.*+?]\)|\[\1\]|g').
Every problem has at least one solution. Only some solutions are harder to find.
Jean-Louis Phelix
Honored Contributor

Re: command or script for equal lines in files

Hi,

I like Elmar's funny one ... It should work even with lines like .* if he uses 'grep -Fx' :^)

Regards.
It works for me (© Bill McNAMARA ...)
Elmar P. Kolkman
Honored Contributor

Re: command or script for equal lines in files

Jean-Louis, it was something I tried, but it doesn't work: with fixed strings not only are . and * no longer interpreted, but ^ and $ too, so begin-of-line and end-of-line are not anchored. And you need those anchors to prevent lines like 'abcd' from showing up if you have 'bcd' in the other file...
With the capitals in the example it should work better (the capital forces the start of the line), but if Accccdefed was in fileB, it would show up...
Every problem has at least one solution. Only some solutions are harder to find.
Jean-Louis Phelix
Honored Contributor

Re: command or script for equal lines in files

Elmar,

I meant that I liked your grep -f tip. But with -Fx you don't even need the sed and the temporary file :

-x for exact lines (so no sed needed for ^ and $)

-F for fixed strings (so a line containing .* is taken literally)

So the answer could simply be :

grep -Fxf fileA fileB

which gives the same result as :

comm -12 fileA fileB

In fact, in this case the files do not even need to be sorted, which is easier.
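
A small check of the two corner cases discussed above (a line containing .* and a line that is only a substring of another) might look like this. It is a sketch with made-up sample data; the function name exact_common and the /tmp file names are only for illustration.

```shell
# Hypothetical helper: lines of $2 that appear verbatim, as whole lines, in $1.
exact_common() {
    grep -Fxf "$1" "$2"
}

a=/tmp/grepfx_a.$$
b=/tmp/grepfx_b.$$
printf 'bcd\n.*\nAcc\n' > "$a"
printf 'abcd\nAcc\n.*\nAccccdefed\n' > "$b"

# Prints only "Acc" and ".*": 'abcd' and 'Accccdefed' are rejected by -x
# (whole-line match), and '.*' matches just the literal line because of -F.
exact_common "$a" "$b"
rm -f "$a" "$b"
```

The matches come out in the order they appear in the second file.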

Regards.
It works for me (© Bill McNAMARA ...)
H.Merijn Brand (procura
Honored Contributor

Re: command or script for equal lines in files

but comm is cut out for the job. Look at the following benchmark, assuming that the files _are_ sorted:

lt09:/tmp 166 > perl -le'print chr(65+int rand 26),map{chr(97+int rand 26)}1..5 for 1..50000' | sort > x0
lt09:/tmp 167 > perl -le'print chr(65+int rand 26),map{chr(97+int rand 26)}1..5 for 1..50000' | sort > x1
lt09:/tmp 168 > time comm -12 x0 x1 > x01.0
0.020u 0.000s 0:00.00 0.0% 0+0k 0+0io 123pf+0w
lt09:/tmp 169 > time grep -Fxf x0 x1 > x01.1
0.300u 0.020s 0:00.31 103.2% 0+0k 0+0io 136pf+0w
lt09:/tmp 170 > perl -le'print chr(65+int rand 26),map{chr(97+int rand 26)}1..5 for 1..500000' | sort > x0
lt09:/tmp 171 > perl -le'print chr(65+int rand 26),map{chr(97+int rand 26)}1..5 for 1..500000' | sort > x1
lt09:/tmp 172 > time comm -12 x0 x1 > x01.0
0.150u 0.010s 0:00.08 200.0% 0+0k 0+0io 123pf+0w
lt09:/tmp 173 > time grep -Fxf x0 x1 > x01.1
4.230u 0.060s 0:04.32 99.3% 0+0k 0+0io 136pf+0w
lt09:/tmp 174 >

this was on linux, where GNU grep is the default, but HP-UX will show similar results

So *IF* you would decide to go for fgrep (equal to grep -F), be sure to put the smaller file first (as the -f pattern file)
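
A rough sketch of that advice, choosing the pattern file by line count. The function name fgrep_common and the /tmp file names are made up for the example, and keep in mind that swapping the files changes the order of the output and can change how duplicate lines are reported.

```shell
# Hypothetical helper: use whichever file has fewer lines as the pattern file.
fgrep_common() {
    # $(( ... )) strips any padding that some wc implementations add.
    a_lines=$(( $(wc -l < "$1") ))
    b_lines=$(( $(wc -l < "$2") ))
    if [ "$a_lines" -le "$b_lines" ]; then
        grep -Fxf "$1" "$2"
    else
        grep -Fxf "$2" "$1"
    fi
}

small=/tmp/fgrep_small.$$
big=/tmp/fgrep_big.$$
printf 'Bcc\nAcc\n' > "$small"
printf 'Acc\nAcdc\nBcc\nBde\n' > "$big"

# Either argument order ends up running: grep -Fxf "$small" "$big"
fgrep_common "$big" "$small"
rm -f "$small" "$big"
```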

Enjoy, Have FUN! H.Merijn
H.Merijn Brand (procura
Honored Contributor

Re: command or script for equal lines in files

Ohh, and one more thing.

the commands are not the same, and with the fgrep command the output might differ if the files are swapped. This is because duplicate lines are handled differently by comm and grep.
watch this, from the last example:


lt09:/tmp 178 > ll x[01]*
246644 -rw-rw-rw- 1 merijn users 3500000 2004-01-24 18:42 x0
246692 -rw-rw-rw- 1 merijn users 5544 2004-01-24 18:42 x01.0
246693 -rw-rw-rw- 1 merijn users 5551 2004-01-24 18:42 x01.1
246691 -rw-rw-rw- 1 merijn users 3500000 2004-01-24 18:42 x1
lt09:/tmp 179 > diff x01*
444a445
> Pgpcds
Exit 1
lt09:/tmp 180 > grep Pgpcds x[01]*
x0:Pgpcds
x01.0:Pgpcds
x01.1:Pgpcds
x01.1:Pgpcds
x1:Pgpcds
x1:Pgpcds
lt09:/tmp 181 >
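
The same effect can be reproduced on a tiny scale. This is a sketch with made-up two- and three-line files (the /tmp names are only for illustration), mirroring the situation above where Pgpcds occurs once in x0 and twice in x1:

```shell
x0=/tmp/dup_x0.$$
x1=/tmp/dup_x1.$$
printf 'Acc\nPgpcds\n' > "$x0"
printf 'Acc\nPgpcds\nPgpcds\n' > "$x1"

# comm pairs lines one-to-one, so Pgpcds is reported once.
comm -12 "$x0" "$x1"

# grep prints every line of x1 that matches, so Pgpcds is reported twice.
grep -Fxf "$x0" "$x1"

# With the files swapped, x0 is searched and Pgpcds is reported once.
grep -Fxf "$x1" "$x0"

# One way to get a duplicate-free answer regardless of argument order.
grep -Fxf "$x0" "$x1" | sort -u

rm -f "$x0" "$x1"
```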

Enjoy, Have FUN! H.Merijn
H.Merijn Brand (procura
Honored Contributor

Re: command or script for equal lines in files

Same problem discussed in http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=403456
(FYI only)

Enjoy, Have FUN! H.Merijn