
Dusan Onofer
Occasional Contributor

regexp vs. GNU regexp

Hi,

I'm trying to get a script working with HP-UX's grep (porting from GNU grep) and I've found different behavior. Could someone explain it?

The text is a line of comma-separated values. GNU grep (in Cygwin) selects it correctly, in my opinion, with this (extended) regular expression:
^X:([^,]*,){3}"C

I'm testing it on a line whose fourth comma-separated value begins with "C:
$ echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,){3}"C'
X: A,,B,"C",,"D"

ok, match

$ echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,){4}"C'

ok, it doesn't match

But with hp-ux's grep:
$ echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,){3}"C'
X: A,,B,"C",,"D"

match, ok.

$ echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,){4}"C'
X: A,,B,"C",,"D"

Why does this match?

$ echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,){5}"C'
X: A,,B,"C",,"D"

also match. Why?

$ uname -a
HP-UX ccbdevd1 B.11.11 U 9000/800 610389312 unlimited-user license

thanks a lot!

Dusan
harry d brown jr
Honored Contributor

Re: regexp vs. GNU regexp

Dusan,

It does appear to be a bug :-(

Manually expanded, it works:

[root@vpart1 /]# echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,){5}"C'
X: A,,B,"C",,"D"
[root@vpart1 /]# echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,)([^,]*,)([^,]*,)([^,]*,)([^,]*,)"C'
[root@vpart1 /]# echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,)([^,]*,)([^,]*,)([^,]*,)([^,]*,)"C'
[root@vpart1 /]# echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,)([^,]*,)([^,]*,)([^,]*,)"C'
[root@vpart1 /]# echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,)([^,]*,)([^,]*,)"C'
X: A,,B,"C",,"D"
[root@vpart1 /]# echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,)([^,]*,)"C'
[root@vpart1 /]# echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,)"C'
[root@vpart1 /]# echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,){2}"C'
[root@vpart1 /]# echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,){3}"C'
X: A,,B,"C",,"D"
[root@vpart1 /]# echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,){4}"C'
X: A,,B,"C",,"D"
[root@vpart1 /]# echo 'X: A,,B,"C",,"D"'|grep -E '^X:([^,]*,){6}"C'
[root@vpart1 /]#
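
The manual expansion above can also be scripted, which makes it easier to check a different grep or a different repeat count. This is only a minimal sketch assuming a POSIX sh; the helper names expand_pattern, matches and check_intervals are made up for illustration and are not part of any standard tool.

```shell
#!/bin/sh
# Sketch: compare the interval form ([^,]*,){n} with its manual expansion.
# expand_pattern, matches and check_intervals are illustrative names only.

sample='X: A,,B,"C",,"D"'

# Print the pattern with the group written out $1 times instead of using {n}.
expand_pattern() {
    count=$1
    groups=''
    i=0
    while [ "$i" -lt "$count" ]; do
        groups="${groups}([^,]*,)"
        i=$((i + 1))
    done
    printf '%s\n' "^X:${groups}\"C"
}

# Print "yes" if the sample line matches the ERE given as $1, otherwise "no".
matches() {
    if printf '%s\n' "$sample" | grep -E -- "$1" >/dev/null; then
        echo yes
    else
        echo no
    fi
}

# For n = 1..6 print: n, interval result, expanded result, verdict.
check_intervals() {
    n=1
    while [ "$n" -le 6 ]; do
        interval=$(matches "^X:([^,]*,){${n}}\"C")
        expanded=$(matches "$(expand_pattern "$n")")
        if [ "$interval" = "$expanded" ]; then
            verdict=consistent
        else
            verdict=MISMATCH
        fi
        echo "$n $interval $expanded $verdict"
        n=$((n + 1))
    done
}

check_intervals
```

With a grep that handles intervals correctly, every row should read "consistent" and only n=3 should show "yes". Going by the transcript above, the HP-UX grep would be expected to report a MISMATCH for n=4 and n=5.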


Even with regexbuilder (http://sourceforge.net/project/showfiles.php?group_id=76775), no variation other than
^X:([^,]*,){3}"C
matches.

Report it as a bug.

I personally would use 'cut -d"," -f4' with grep "\"C\"", but that's not really the reported problem.
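
As a rough sketch of that field-based idea: cut prints only the fourth field, so to keep the whole line one option is awk with a comma as the field separator. The function name fourth_field_is_c is hypothetical, and splitting on every comma assumes that no quoted value contains a comma (the original regexp makes the same assumption).

```shell
#!/bin/sh
# Sketch: test the fourth comma-separated field directly instead of counting with {n}.
# fourth_field_is_c is an illustrative name, not an existing command.

fourth_field_is_c() {
    # -F, splits on every comma; $4 is the fourth field.
    # Print lines that start with X: and whose fourth field begins with "C.
    awk -F, '/^X:/ && $4 ~ /^"C/'
}

# Whole-line selection: only the first input line has "C" as its fourth field.
printf '%s\n' 'X: A,,B,"C",,"D"' 'X: A,B,"C",,"D"' | fourth_field_is_c

# The cut variant prints just the field itself rather than the line.
printf '%s\n' 'X: A,,B,"C",,"D"' | cut -d, -f4 | grep '"C"'
```

Neither variant uses an interval expression, so it avoids the behavior reported here, at the cost of no longer being a single grep call.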

live free or die
harry d brown jr
Dusan Onofer
Occasional Contributor

Re: regexp vs. GNU regexp

Harry,

thanks for your reply.

It seems to me the problem is in libc or somewhere similar: sed behaves the same way.

$ echo 'X: A,,B,"C",,"D"'|sed 's/^X:\([^,]*,\)\{2\}"C/deleted/'
X: A,,B,"C",,"D"
$ echo 'X: A,,B,"C",,"D"'|sed 's/^X:\([^,]*,\)\{3\}"C/deleted/'
deleted",,"D"
$ echo 'X: A,,B,"C",,"D"'|sed 's/^X:\([^,]*,\)\{4\}"C/deleted/'
deleted",,"D"
$ echo 'X: A,,B,"C",,"D"'|sed 's/^X:\([^,]*,\)\{5\}"C/deleted/'
deleted",,"D"

The attached program also reports a false match.

By the way, where can I report a bug?
(Sorry, I tried googling over hp.com but didn't find
any useful links.)

Dusan Onofer
Occasional Contributor

Re: regexp vs. GNU regexp

Here is a compilable version of the program.

Dusan
harry d brown jr
Honored Contributor

Re: regexp vs. GNU regexp


You might find this tool very useful:

http://www.weitz.de/regex-coach/#install

This proves to be the best pattern:

^(X: )([^,]*,)([^,]*,)([^,]*,)(\"C\"\,)([^,]*,)([^,]*)

Using ([^,]*,){3} instead of manually repeating the group causes the matcher to backtrack and grab the same pattern twice. Try your different variations and use the STEP tab (near the bottom of the window) to walk through the string.
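
For use with grep -E, here is a hedged sketch of the same expanded pattern with the \" and \, escapes dropped. POSIX leaves a backslash before an ordinary character undefined in extended regular expressions, and neither the quote nor the comma needs escaping. Note that this pattern is stricter than the original: the fourth field must be exactly "C" and at least six fields must be present.

```shell
#!/bin/sh
# Sketch: the fully expanded pattern, written without \" and \, for grep -E.
pattern='^(X: )([^,]*,)([^,]*,)([^,]*,)("C",)([^,]*,)([^,]*)'

# Matches: "C" is the fourth field.
printf '%s\n' 'X: A,,B,"C",,"D"' | grep -E -- "$pattern"

# Does not match: "C" is the third field here.
if printf '%s\n' 'X: A,B,"C",,"D",' | grep -E -- "$pattern" >/dev/null; then
    echo "unexpected match"
else
    echo "no match, as intended"
fi
```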



live free or die
harry d brown jr
Marlou Everson
Trusted Contributor

Re: regexp vs. GNU regexp

Dusan,

You can report bugs through "Support Case Manager" under the "Maintenance and support for hp products" link.

Marlou