Sachin_29
Advisor

Grep question

I am trying to grep for the word ABC in a line within a file.
The file test contains:
The word is ABC
Something ABC.orig
When I do grep ABC test|wc -l, it returns 2.
Is there any way I could grep for the exact word ABC?
Carlo Corthouts
Frequent Advisor

Re: Grep question

If ABC.orig is also in that file you can try :

grep ABC test|grep -v ABC.orig | wc -l

Sachin_29
Advisor

Re: Grep question

Sorry, I meant there could be more words like ABC.dontknow, ABC234, etc.
Pete Randall
Outstanding Contributor

Re: Grep question

Try

grep " ABC " test


Pete

RAC_1
Honored Contributor

Re: Grep question

grep -x [A]BC file

Else get GNU grep.


Anil
There is no substitute to HARDWORK
Sachin_29
Advisor

Re: Grep question

Doesn't help, Pete!!!
Franky_1
Respected Contributor

Re: Grep question

Hi,

just

grep " ABC" test

Regards

Franky
Don't worry be happy
Carlo Corthouts
Frequent Advisor

Re: Grep question

Try grep -w ABC
Simon Hargrave
Honored Contributor

Re: Grep question

You'll need anchors: -

If you want to grep for ABC as part of a line: -

grep " ABC " file (note the spaces)

If you want a line with ONLY ABC: -

grep "^ABC$" file (^ anchors start of line, $ anchors end)
Simon Hargrave
Honored Contributor

Re: Grep question

If you want to anchor to spaces AND tabs (which may be valid depending on your file), then: -

grep "[[:blank:]]ABC[[:blank:]]" file

Where [[:blank:]] matches a space or a tab. (No pipe is needed: inside brackets a | is a literal pipe character, not an "or".)
Rick Garland
Honored Contributor

Re: Grep question

`grep -x` will do an EXACT match: the whole line has to match the pattern.
John Palmer
Honored Contributor

Re: Grep question

How about:

grep -e " ABC " -e " ABC$" -e "^ABC " -e "^ABC$"

That will find:
1. ABC surrounded by spaces.
2. ABC ending a line
3. ABC starting a line
4. A line consisting only of ABC
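
As a rough illustration of those four cases, here is a minimal sketch run against a small sample file. The file contents and the variable name john_count are made up for the example. Single quotes are used so that the shell leaves the $ anchor alone (csh-style shells may otherwise try to expand it).

```shell
# Sample data, assumed for illustration only
tmp=$(mktemp)
printf '%s\n' \
  'The word is ABC' \
  'Something ABC.orig' \
  'ABC starts here' \
  'ABC' \
  'an ABC in the middle' \
  'ABC234' > "$tmp"

# Count lines where ABC is bounded by spaces and/or line start/end.
# Lines 1, 3, 4 and 5 qualify; ABC.orig and ABC234 do not.
john_count=$(grep -c -e ' ABC ' -e ' ABC$' -e '^ABC ' -e '^ABC$' "$tmp")
echo "$john_count"

rm -f "$tmp"
```

Note that this only treats spaces as separators, so ABC next to a tab would be missed.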

Regards,
John
Carlo Corthouts
Frequent Advisor

Re: Grep question

Or you can do

#man grep
Sachin_29
Advisor

Re: Grep question

John, I am getting "Illegal variable name"?
Pete Randall
Outstanding Contributor

Re: Grep question

The best solution then is the grep -x syntax.
$ grep -x ABC test
ABC
$


Pete

Sachin_29
Advisor

Re: Grep question

Pete:
The grep -x ABC doesn't work, as it looks for the exact line. The lines are "The word is ABC" and "Something ABC.orig", so if I do grep -x ABC test|wc -l it returns 0.
Rick Garland
Honored Contributor

Re: Grep question

The grep -x option matches only lines where the whole line matches the pattern.

If you have ABC alone on a line then grep -x ABC will match. If the line holds anything else besides ABC, then grep -x will not match.
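
For what it's worth, a quick sketch of how -x behaves on the two lines from the original question. The temp file and the variable names x_before and x_after are just for illustration. POSIX describes -x as considering only lines where all characters in the line are used to match the pattern.

```shell
# The two lines from the original question
tmp=$(mktemp)
printf '%s\n' 'The word is ABC' 'Something ABC.orig' > "$tmp"

# Neither line consists solely of ABC, so the count is 0.
# grep exits non-zero when nothing matches, hence the "|| true".
x_before=$(grep -c -x 'ABC' "$tmp" || true)

# Add a line that is exactly ABC; now one line matches.
printf '%s\n' 'ABC' >> "$tmp"
x_after=$(grep -c -x 'ABC' "$tmp" || true)

echo "$x_before $x_after"
rm -f "$tmp"
```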
RAC_1
Honored Contributor

Re: Grep question

grep -w [A]BC

Else get GNU grep

Anil
There is no substitute to HARDWORK
Sachin_29
Advisor

Re: Grep question

hey
grep ABC$ test works
Bill Hassell
Honored Contributor

Re: Grep question

The problem is in the different possibilities of a "word". Typically, a word is defined as text surrounded by whitespace. ABC on a line all by itself is followed by the whitespace character NEWLINE. In your working example ABC$, grep is looking for ABC at the exact end of the line. ABC$ will not find ABC followed by a space or any other character. The -w option is only available with current patches for grep, but it should work for your requirement.


Bill Hassell, sysadmin
Bill Hassell
Honored Contributor

Re: Grep question

The problem is in the definition of a "word". Typically, a word is defined as text surrounded by whitespace. ABC on a line all by itself is followed by the whitespace character NEWLINE. In your working example ABC$, grep is looking for ABC at the exact end of the line. ABC$ will not find ABC followed by a space or any other character. The -w option is only available with current patches for grep, but per the man page, -w defines a 'word' as a string surrounded by anything that is not alphanumeric or an underscore. Thus, ABC.orig has 2 words, ABC and orig, with a non-word constituent character in between.
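
A small sketch of that -w behaviour, assuming a grep that supports -w (GNU grep does, and so should a patched HP-UX grep). The sample lines and the variable name w_count are invented for the example.

```shell
# Sample data, assumed for illustration only
tmp=$(mktemp)
printf '%s\n' \
  'The word is ABC' \
  'Something ABC.orig' \
  'ABC234' \
  'xABC' \
  'ABC_old' > "$tmp"

# -w needs a non-word character (or line start/end) on both sides of ABC.
# "." is a non-word character, so ABC.orig still counts as a match;
# ABC234, xABC and ABC_old do not.
w_count=$(grep -c -w 'ABC' "$tmp")
echo "$w_count"

rm -f "$tmp"
```

So -w alone rules out ABC234 but not ABC.orig. If only whitespace should count as a separator, a whitespace-anchored pattern is still needed.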


Bill Hassell, sysadmin
Suresh Pai
Advisor

Re: Grep question

A slight variation of John Palmer's solution is to ensure that ABC is surrounded by whitespace (else his solution just boils down to a grep ABC).

Something like:
grep -e '[[:blank:]]ABC[[:blank:]]' -e '[[:blank:]]ABC$' -e '^ABC[[:blank:]]' -e '^ABC$' test | wc -l

However, neither of our solutions gives you an accurate count of the number of ABCs (if more than one is present on the same line).
Rick Garland
Honored Contributor

Re: Grep question

A good point.
The `wc -l` does not provide a count of the number of instances of ABC, just the number of lines it is found on.

So if it is present 5 times on 1 line, the return from the `wc -l` will be 1. The pattern was only found on 1 line.
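
One possible way to count instances rather than lines, sketched under the assumption that a "word" means text separated by spaces or tabs: split the file into one word per line with tr, then count the lines that are exactly ABC. The sample data and the variable names line_count and word_count are made up for the example.

```shell
# Sample data, assumed for illustration only
tmp=$(mktemp)
printf '%s\n' \
  'ABC and ABC and ABC again' \
  'Something ABC.orig' \
  'no match here' > "$tmp"

# Number of lines containing the string ABC anywhere
line_count=$(grep -c 'ABC' "$tmp")

# Turn every space or tab into a newline (squeezing repeats),
# then count the resulting lines that are exactly ABC.
word_count=$(tr -s ' \t' '\n\n' < "$tmp" | grep -c -x 'ABC')

echo "lines=$line_count words=$word_count"
rm -f "$tmp"
```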
Hein van den Heuvel
Honored Contributor

Re: Grep question

Please consider using 'advanced search' (aka 'more options') to look for prior discussion before posting.
Search for 'grep word' gives numerous useful prior topics.

perl's \b is great for this work as per:

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=228233


A couple of months ago I suggested:

grep -E '(^|[[:blank:]])searchstring([[:blank:]]|$)' file

In English:

Look for a piece of searchstring starting at (the beginning of a line or a (space or tab)) and ending with ((space or tab) or the end of a line)
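
A minimal sketch of that idea with ABC as the search string. It uses the POSIX class [[:blank:]] for "space or tab", since a \t inside brackets is not read as a tab by grep. The sample lines and the variable name ws_count are assumptions for the example.

```shell
# Sample data, assumed for illustration only; the third line has a tab after ABC
tmp=$(mktemp)
printf 'The word is ABC\nSomething ABC.orig\nABC\tafter a tab\nABC234\n' > "$tmp"

# ABC must be preceded by line start or a blank,
# and followed by a blank or line end.
ws_count=$(grep -c -E '(^|[[:blank:]])ABC([[:blank:]]|$)' "$tmp")
echo "$ws_count"

rm -f "$tmp"
```

Here the first and third lines match; ABC.orig and ABC234 are rejected.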

That was in topic:
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=284829


hth,
Hein.