Filter Files..

Hello everybody,

I have 2 files, for example:

File 1:
a
b
c
d
e

File2:
b
d

The idea is to read File1 and print all lines except the strings that I have in File2. The result should be:

File3:
a
c
e

Note: The number of lines varies; it isn't fixed. I have to apply this routine to Oracle files, usually with more than 30000 lines, and compare them against another file with 200 lines.

I would appreciate your help doing this in a shell script.

Thank you very much

Andre
Andre Augusto
10 REPLIES
James R. Ferguson
Acclaimed Contributor

Re: Filter Files..

Hi Andre:

Assuming that the files are sorted, and given your files as shown:

# comm -3 file1 file2
a
c
e

...see the manpages for 'comm'.

Regards!

...JRF...
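
A minimal sketch of the comm approach for files that are not already sorted. It assumes the file names file1, file2 and file3 from the question; the scratch directory and the .sorted file names are made up for illustration. Note that it uses -23 rather than -3, so that any line present only in file2 is suppressed as well, and that the result comes out in sorted order rather than in file1's original order.

```shell
# Work in a scratch directory (hypothetical name) so nothing is overwritten.
workdir=${TMPDIR:-/tmp}/filter_comm.$$
mkdir -p "$workdir" && cd "$workdir" || exit 1

# Sample data, deliberately unsorted.
printf '%s\n' e a d c b > file1
printf '%s\n' d b > file2

# comm requires both inputs to be sorted in the same collation order.
LC_ALL=C sort file1 > file1.sorted
LC_ALL=C sort file2 > file2.sorted

# -2 suppresses lines found only in file2, -3 suppresses lines common to both,
# so only the lines unique to file1 remain.
LC_ALL=C comm -23 file1.sorted file2.sorted > file3
cat file3
```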
Hein van den Heuvel
Honored Contributor

Re: Filter Files..

>> I have to apply this routine to Oracle files

Hmmm.. this is a perfect task for a SQL query!

Besides the comm solution JRF mentions,
the other classic solution is to grep with a match file:

$ grep -v -f file2 file1
a
c
e

For this solution the files do not need to be sorted, but watch out for surprise matches, as entries in file2 will be treated as regular expressions.

I like perl solutions for this. Stick all lines from the short file in an associative array (hash). Then read the long file and take action based on presence in that array.
Very similar to the problem in:
http://forums12.itrc.hp.com/service/forums/questionanswer.do?threadId=1220926

Here:

$ perl -e 'open B,"<file2"; while (<B>) {$b{$_}++}; open A,"<file1"; while (<A>){print unless $b{$_}}'


hth,
Hein.
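
The same "load the short file into a lookup table, then stream the long file" idea can also be sketched with awk from the shell. This is an illustration, not part of Hein's reply: the array name skip and the scratch directory are made up, and the file names follow the question. It matches whole lines exactly, needs no sorting, and keeps file1's original order.

```shell
# Scratch directory (hypothetical name) with the sample data from the question.
workdir=${TMPDIR:-/tmp}/filter_awk.$$
mkdir -p "$workdir" && cd "$workdir" || exit 1
printf '%s\n' a b c d e > file1
printf '%s\n' b d > file2

# NR == FNR is true only while awk reads the first file named (file2):
# remember each of its lines as a key of the "skip" array.
# For the second file (file1), print every line that is not such a key.
# Caveat: if file2 is empty, NR == FNR also holds for file1 and nothing is printed.
awk 'NR == FNR { skip[$0] = 1; next } !($0 in skip)' file2 file1 > file3
cat file3
```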






Dennis Handly
Acclaimed Contributor
Solution

Re: Filter Files..

>Hein: matches as entries in file2 will be treated as regular expressions.

That's why you use -x for whole line and -F for fixed strings:
fgrep -vx -f file2 file1
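
A small sketch of why both options matter, using made-up sample lines (a.c, abc, a.cd, x) rather than the data from the question. It is written with grep -F, the POSIX spelling of fgrep, and the variable names are only for illustration.

```shell
# Scratch directory (hypothetical name) with sample data chosen to trigger surprises.
workdir=${TMPDIR:-/tmp}/filter_grep.$$
mkdir -p "$workdir" && cd "$workdir" || exit 1
printf '%s\n' 'a.c' 'abc' 'a.cd' 'x' > file1
printf '%s\n' 'a.c' > file2

# Plain -f: "a.c" is a regular expression, so "." matches any character
# and the pattern may match anywhere in the line.
regex_result=$(grep -v -f file2 file1)

# -F: "a.c" is a literal string, but it still matches as a substring.
fixed_result=$(grep -F -v -f file2 file1)

# -F -x: literal string that must match the whole line.
exact_result=$(grep -F -x -v -f file2 file1)

printf 'regex only:\n%s\n' "$regex_result"
printf 'fixed string:\n%s\n' "$fixed_result"
printf 'fixed string, whole line:\n%s\n' "$exact_result"
```

Only the last form removes exactly the line a.c and nothing else; the regex form leaves just x, and the fixed-string form without -x still drops a.cd.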
OFC_EDM
Respected Contributor

Re: Filter Files..

/usr/xpg4/bin/grep -v -f 2.txt 1.txt > 3.txt

Note: don't use the grep in the regular search path, as it doesn't support the -f option.

Or use fgrep
The Devil is in the detail.
OFC_EDM
Respected Contributor

Re: Filter Files..

/usr/xpg4/bin/grep

I'm assuming you're on HP-UX when providing the above path.
The Devil is in the detail.
Peter Nikitka
Honored Contributor

Re: Filter Files..

Hi,

O'Kevin : /usr/xpg4/bin/grep
This is the place where SOLARIS keeps its POSIX-compliant commands. The HP style would be to set the environment variable UNIX95, like
UNIX95= grep ...
The remark about the different handling of the '-f' option fits to SUN as well, IMHO.

The (new) recommended call under HP-UX (>=11i) seems to be:
grep -F -x -v -f 2.txt 1.txt

mfG Peter
The Universe is a pretty big place, it's bigger than anything anyone has ever dreamed of before. So if it's just us, seems like an awful waste of space, right? Jodie Foster in "Contact"

Re: Filter Files..

Hi guys,

Thank you very much for the suggestions. I didn't know that grep could compare 2 files, and I was trying to write a script. Your simple, to-the-point ideas are all that I need.

Some notes: the files aren't sorted, and the path /usr/xpg4/bin doesn't exist. I'll use the UNIX95 variable and process the files.

Thanks JRF, Hein, Dennis, O'Kevin and Pete.

Regards,

Andre
Andre Augusto

Re: Filter Files..

Close
Andre Augusto
Dennis Handly
Acclaimed Contributor

Re: Filter Files..

>I didn't know that grep could compare 2 files

It doesn't really "compare" but it filters.

>I'll use the UNIX95 variable and treat the files.

No need to use UNIX95, since the fine print only mentions some very minor differences dealing with -q and errors.

>O'Kevin: Note don't use the grep in the regular search path as it doesn't support the -f option.

Sure it does. You must be thinking about some foreign devil version of grep.

>Peter: The remark about the different handling of the '-f' option fits to SUN as well.

It may fit SUN but it has nothing to do with HP-UX.