
parse a file with expressions (grep)

 
Billa-User
Regular Advisor


hello,

I have to change an existing shell script.

here my issue:

demo file (field separator: space or tab): test.cnf
entry1 entry2
entry1 entry2 var

The shell script should behave as follows:
search with entry1 and entry2 => OK (but it should not find the line "entry1 entry2 var")
search with entry1, entry2 and var => OK
search with entry1, entry2 and notvar => Not-OK

# shell script test.sh
# begin
# the "if" doesn't work when the field separator is a tab
entry1=$1
entry2=$2

var=$3

if grep "^[ \t]*${entry1}[ \t]" test.cnf >/dev/null
then
    if grep "^${entry1}[ \t]*${entry2}$" test.cnf >/dev/null
    then
        if grep "^${entry1}[ \t]*${entry2}[ \t]*${entry2}${var}$" test.cnf >/dev/null
        then
            echo "entry1 : ${entry1} entry2: ${entry2} var:${var} "
        fi
    else
        echo "entry1 : ${entry1} entry2: ${entry2}"
    fi
fi

# end

regards
Jean-Luc Oudart
Honored Contributor

Re: parse a file with expressions (grep)

Hi

Maybe you could use awk. Pass the 3 variables to the awk script and check $1 and $2 against entry1 and entry2. If there is a 3rd field (check NF for the number of fields in the current record), then decide whether to print the relevant result.

awk -v en1="$entry1" -v en2="$entry2" -v var="$var" '
{
    if ($1 == en1) {
        if ($2 == en2) {
            ...
        }
    }
}' test.cnf
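
One possible way to fill in that outline is sketched below. It assumes awk's default field splitting (runs of spaces and tabs), treats a missing third argument as "no var given", and uses a hypothetical helper name find_entry plus a temporary sample file; none of these come from the original script.

```shell
#!/bin/sh
# Assumed sample data: first line separated by a space, second by tabs.
cnf=$(mktemp)
trap 'rm -f "$cnf"' EXIT
printf 'entry1 entry2\nentry1\tentry2\tvar\n' > "$cnf"

# find_entry FILE ENTRY1 ENTRY2 [VAR]   (hypothetical helper)
# Prints matching lines. Without VAR only two-field lines match;
# with VAR a three-field line whose third field equals VAR matches.
find_entry() {
    awk -v en1="$2" -v en2="$3" -v var="$4" '
        $1 == en1 && $2 == en2 {
            if (var == "" && NF == 2) print
            if (var != "" && NF == 3 && $3 == var) print
        }' "$1"
}

find_entry "$cnf" entry1 entry2          # prints the two-field line
find_entry "$cnf" entry1 entry2 var      # prints the three-field line
find_entry "$cnf" entry1 entry2 notvar   # prints nothing
```

Because the values are compared with == as awk variables, regular expression metacharacters in the arguments are not interpreted.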

Regards
Jean-Luc
fiat lux
Hein van den Heuvel
Honored Contributor

Re: parse a file with expressions (grep)

I would also prefer perl or awk for this, as a single program invocation can do it all, versus the convoluted nesting.

Are you indeed looking for 3 specific words with optional whitespace?

If you need further help, then you may want to reply with a TEXT file attachment showing sample input lines which should, or should not match.

Speaking of which, it is not immediately clear to me if/how you want to deal with multiple matching or partially matching lines.

Using awk you can test for the presence of 'var' using NF == 3 or NF > 2
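
To illustrate the NF test, this small sketch prints the field count awk sees for each line of an assumed two-line sample. With the default splitting, any run of spaces or tabs counts as one separator.

```shell
#!/bin/sh
# Assumed sample: a two-field line and a three-field line with mixed whitespace.
sample=$(printf 'entry1 entry2\nentry1\t entry2  var\n')

# NF == 3 selects lines with exactly three fields;
# NF > 2 would also accept lines with more than three.
three_fields=$(printf '%s\n' "$sample" | awk 'NF == 3 { print $3 }')
counts=$(printf '%s\n' "$sample" | awk '{ print NF }')

echo "$three_fields"
echo "$counts"
```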

Anyway... It looks like the 'whitespace' requirement simply was not carried out completely.

Specifically, the second and third grep read: grep "^${entry1}...
Should that not be: grep "^[ \t]*${entry1}

And on the 3rd grep it reads: ${entry2}${var}
So those two words should be adjacent without whitespace?

hth,
Hein
Laurent Menase
Honored Contributor

Re: parse a file with expressions (grep)

why not

grep -v "$entry1 $entry2 "

Dennis Handly
Acclaimed Contributor

Re: parse a file with expressions (grep)

Instead of "grep foo file > /dev/null" you can simplify this as:
grep -q foo file
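
As a small illustration, the exit status of grep -q can drive an "if" directly. The file contents and the variable name result are assumed for the example.

```shell
#!/bin/sh
cnf=$(mktemp)
trap 'rm -f "$cnf"' EXIT
printf 'entry1 entry2\n' > "$cnf"

# grep -q prints nothing; only its exit status is used by "if".
if grep -q '^entry1 entry2$' "$cnf"; then
    result=found
else
    result=missing
fi
echo "$result"
```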

>but should not find: entry1 entry2 var

Are you going to check for an empty string for $var?

Do you have an example of your script, your datafile and your script parms?

>[ \t]

This escape "\t" doesn't work for grep.
Billa-User
Regular Advisor

Re: parse a file with expressions (grep)

parameters for testing:

./test.sh entry1 entry2
./test.sh entry1 entry2 var

the test script should parse file "test.cnf" (attachment)
field separator: space or tab.

regards
Billa-User
Regular Advisor

Re: parse a file with expressions (grep)

test script:

test.sh
Billa-User
Regular Advisor

Re: parse a file with expressions (grep)

hello,

I tested the following "awk" and it matches the exact entries (field separator: space or tab). Is it a good solution?

awk -v entry1="${entry1}" -v entry2="${entry2}" -v var="${var}" 'BEGIN { FS = "[ \t]*|[ \t]+" }
$1 ~ /^'entry1'$/ && $2 ~ /^'entry2'$/ && $3 ~ /^'var'$/ { print "FOUND" }' test.cnf

if [ `awk -v entry1="${entry1}" -v entry2="${entry2}" -v var="${var}" 'BEGIN { FS = "[ \t]*|[ \t]+" }
$1 ~ /^'entry1'$/ && $2 ~ /^'entry2'$/ && $3 ~ /^'var'$/ { print "FOUND" }' test.cnf` = "FOUND" ]
then
echo "awk: FOUND"
fi

also a test with "grep" :
OLDIFS=$IFS
IFS="[ \t]*|[ \t]+"
grep "^${entry1} ${entry2} ${var}$" test.cnf

IFS=$OLDIFS
James R. Ferguson
Acclaimed Contributor

Re: parse a file with expressions (grep)

Hi:

> i tested following "awk" and it matches the exact entries (field separator: space or tab), is it a good solution ?

OK, but for the 'grep' code, you haven't paid attention to what Dennis said about the use of '\t' for a TAB character: HP's 'grep' doesn't recognize it and you are being fooled into thinking it does when you use

# grep "^[ \t]*"

Consider that this matches:

# echo "aaa"|grep "^[\t ]*"
aaa

This matches because the asterisk ("*") matches ZERO or more occurrences of the preceding bracket expression. There are none at the start of "aaa", and that *meets* the criteria.

Further, the '\t' isn't understood by 'grep' in HP-UX. If it was, this would work:

# echo "\t"|grep "[\t]"

...and it does *not*. You could use a literally composed TAB (which the Forums formatting will obliterate unless you copy-and-paste this):

# echo "\t"|grep "[ ]"
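
One way to avoid typing a literal TAB is to let printf produce it and keep it in a shell variable. This is a sketch assuming a POSIX shell with printf; the variable names are only illustrative.

```shell
#!/bin/sh
# Build a real TAB character; grep itself does not need to understand "\t".
tab=$(printf '\t')

# Bracket expression containing a space and the TAB held in $tab.
if printf 'entry1\tentry2\n' | grep -q "^entry1[ ${tab}]entry2\$"; then
    tab_match=yes
else
    tab_match=no
fi

# For comparison: inside a bracket expression "\t" is a backslash and the
# letter t, so it is not expected to match the TAB in the input.
if printf 'entry1\tentry2\n' | grep -q "^entry1[\t]entry2\$"; then
    escape_match=yes
else
    escape_match=no
fi
echo "literal tab: $tab_match, backslash-t: $escape_match"
```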

AWK *does* understand the '\t' notation, however.

If you use 'grep -E' to invoke the extended regular expression engine, you can use '+' to signify ONE or more instances of the preceding character.

AWK supports extended regular expressions, so this can be used there, too.
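
A sketch of the 'grep -E' form follows. It uses the POSIX character class [[:blank:]] (space or tab) in place of "\t"; that class is an addition here, not something proposed in the thread, and the sample data is assumed.

```shell
#!/bin/sh
cnf=$(mktemp)
trap 'rm -f "$cnf"' EXIT
printf 'entry1 entry2\nentry1\t\tentry2\tvar\nentry1entry2\n' > "$cnf"

entry1=entry1
entry2=entry2

# "+" requires at least one blank between the words, so "entry1entry2" is
# rejected; the trailing "$" rejects the line that carries a third field.
two_field=$(grep -E "^[[:blank:]]*${entry1}[[:blank:]]+${entry2}[[:blank:]]*\$" "$cnf")
echo "$two_field"
```

Note that the entries are spliced into the pattern, so any regular expression metacharacters they contain would be interpreted by grep.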

Regards!

...JRF...

Dennis Handly
Acclaimed Contributor

Re: parse a file with expressions (grep)

>I tested following "awk" and it matches the exact entries (field separator: space or tab), is it a good solution?

Since this is the default, you don't want to set FS.
And if you are going to use awk, you shouldn't have to invoke it more than once.
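
Put together, a single awk call with the default field splitting could replace the repeated invocations. This is only a sketch: the function name check_entry and the use of awk's exit status are assumptions. The values are compared as awk variables with == instead of being spliced into a /regex/; in the quoted command, 'entry1' closes the shell quoting, so awk receives the literal text entry1 and not the variable.

```shell
#!/bin/sh
cnf=$(mktemp)
trap 'rm -f "$cnf"' EXIT
printf 'entry1 entry2\nentry1\tentry2\tvar\n' > "$cnf"

# check_entry FILE ENTRY1 ENTRY2 [VAR]   (hypothetical helper)
# Exit status 0 if a line matches, 1 otherwise. FS is left at its default.
check_entry() {
    awk -v e1="$2" -v e2="$3" -v v="$4" '
        $1 == e1 && $2 == e2 &&
        ((v == "" && NF == 2) || (NF == 3 && $3 == v)) { found = 1; exit }
        END { exit !found }' "$1"
}

if check_entry "$cnf" entry1 entry2 var; then
    echo "awk: FOUND"
fi
```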

>also a test with grep:
>IFS="[ \t]*|[ \t]+"
>grep "^${entry1} ${entry2} ${var}$" test.cnf

grep doesn't look at IFS; only the shell does. And you don't have the proper format for IFS, since it normally contains: space, tab, and newline.
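
To show where IFS does apply, here is a sketch that lets the shell's "read" split each line using the default IFS. The function name lookup and the sample file are assumptions for illustration.

```shell
#!/bin/sh
cnf=$(mktemp)
trap 'rm -f "$cnf"' EXIT
printf 'entry1 entry2\nentry1\tentry2\tvar\n' > "$cnf"

# lookup FILE ENTRY1 ENTRY2 [VAR]   (hypothetical helper)
# read splits on IFS; with the default IFS, runs of spaces and tabs
# separate the fields. Anything beyond the third field lands in "rest".
lookup() {
    while read -r f1 f2 f3 rest; do
        if [ "$f1" = "$2" ] && [ "$f2" = "$3" ] &&
           [ "$f3" = "${4:-}" ] && [ -z "$rest" ]; then
            return 0
        fi
    done < "$1"
    return 1
}

lookup "$cnf" entry1 entry2 var && echo "FOUND"
```

With only two arguments, the third field must be empty, so the line "entry1 entry2 var" is not accepted.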