Operating System - HP-UX

Regular expression matching

 
Simon Hargrave
Honored Contributor

Trying to get my head round a regexp problem.

From swlist, you can get, for a piece of installed software, the os_release(s) it can run on. This will be an extended regular expression, like the following examples: -

B.11.00|B.11.11|B.11.20
B.11.*
?.10.*|?.11.*

All three of these should match "B.11.11", for example. So to test this: -

echo "B.11.11" | grep -E "B.11.00|B.11.11|B.11.20" (this works)

echo "B.11.11" | grep -E "B.11.*" (this works)

echo "B.11.11" | grep -E "?.10.*|?.11.*" (this fails)

The error given is: -

grep: ?, *, or + not preceded by valid regular expression

If I precede the ? symbols with spaces, it works. But why? I want to feed my script arbitrary extended regular expressions and not worry about it throwing these weird errors. So what's causing them?

Cheers, Sy
Simon Hargrave
Honored Contributor

Re: Regular expression matching

(oh, and I will ultimately be doing these matches within awk, where I currently get the same error, but using grep -E just for illustration purposes)
H.Merijn Brand (procura
Honored Contributor

Re: Regular expression matching

in grep -E (and perl and awk) '.' (the dot) is magic, and not taken as literal

in perl, and your version of grep (there are sooo many versions of grep around that you cannot generalize) the '?' is a quantifier, meaning "zero or one occurrence"

grep/awk/perl regexes are not even close to sh-regexes

echo "B.11.11" | grep -E '.\.10\..*|.\.11\..*'

is more likely to meet what you mean, but that's easier written with a character class instead of an alternation

echo "B.11.11" | grep -E '.\.1[01]\..*'

And since trailing .* is only time consuming (zero or more of any character will always match)

echo "B.11.11" | grep -E '.\.1[01]\.'

Should do the trick
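
To see the difference the escaped dot makes, here is a minimal sketch. The helper name ere_match and the test string "BX11Y11" are made up for illustration; only grep -E comes from the thread.

```shell
#!/bin/sh
# ere_match PATTERN STRING: succeeds if STRING matches the ERE PATTERN.
# (Hypothetical helper, not from the thread.)
ere_match() {
    printf '%s\n' "$2" | grep -E -q "$1"
}

# Unescaped dots match any character, so a string with no dots still matches.
if ere_match 'B.11.11' 'BX11Y11'; then
    echo "unescaped: BX11Y11 matches"
fi

# Escaped dots match only a literal ".", so the same string is rejected.
if ! ere_match 'B\.11\.11' 'BX11Y11'; then
    echo "escaped: BX11Y11 does not match"
fi
```

The single quotes matter here: they stop the shell from touching the backslashes before grep sees them.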

Enjoy, Have FUN! H.Merijn
Simon Hargrave
Honored Contributor

Re: Regular expression matching

Thanks for that. Problem is though, I need to basically ask "is this piece of software installable on 11.00, 11.11, 11.23, 10.20 etc". So I need to parse the expressions given to me from swlist, which is the format I gave.

Is there another tool that will let me match against that format of expression?
H.Merijn Brand (procura
Honored Contributor

Re: Regular expression matching

1. swinstall won't let you install software that doesn't meet the system architecture

2. My statement still holds. You asked about the regular expression, and I explained. grep doesn't care where the input is from :)

swlist -options | grep pattern

a5:/u/usr/merijn 105 > swlist | grep '[ ][ABC]\.11'
B3901BA B.11.02.06 HP C/ANSI C Developer's Bundle for HP-UX 11.00 (S800)
B6192AA B.11.00.10 DCE/9000 Programming & Administration Tools Media and Manuals
B6733AA B.11.00.10 DCE/9000 Kernel Threads Support
HPUXEng64RT B.11.00 English HP-UX 64-bit Runtime Environment
HWE1100 B.11.00.0403.3 Hardware Enablement Patches for HP-UX 11.00, March 2004
J2793B B.11.00.07 High Performance X.25 Link software for HP 9000
OnlineDiag B.11.00.27.18 HPUX 11.00 Support Tools Bundle, Mar 2004
QPK1100 B.11.00.64.4 Quality Pack for HP-UX 11.00, March 2004
UnlimUserLic B.11.00.02 HP-UX Unlimited-User License
PHCO_26075 1.0 q4 patch version B.11.20f
PHCO_26823 B.11.00.16 HP Array Manager/60 cumulative patch
PHCO_28069 1.0 q4 patch version B.11.22l
PHSS_25714 1.0 Shared libF90 B.11.01.12
PHSS_28706 1.0 ANSI C compiler B.11.11.06 cumulative patch
Predictive C.11.00.24.01 HP Predictive Support
a5:/u/usr/merijn 106 >

in this command I entered '[' 'space' 'Ctrl-V Tab' ']' (to clear the forum compression of whitespace)

a5:/u/usr/merijn 106 > swlist | perl -ne '/\b[ABC]\.11/ and print'
B3901BA B.11.02.06 HP C/ANSI C Developer's Bundle for HP-UX 11.00 (S800)
B6192AA B.11.00.10 DCE/9000 Programming & Administration Tools Media and Manuals
B6733AA B.11.00.10 DCE/9000 Kernel Threads Support
HPUXEng64RT B.11.00 English HP-UX 64-bit Runtime Environment
HWE1100 B.11.00.0403.3 Hardware Enablement Patches for HP-UX 11.00, March 2004
J2793B B.11.00.07 High Performance X.25 Link software for HP 9000
OnlineDiag B.11.00.27.18 HPUX 11.00 Support Tools Bundle, Mar 2004
QPK1100 B.11.00.64.4 Quality Pack for HP-UX 11.00, March 2004
UnlimUserLic B.11.00.02 HP-UX Unlimited-User License
PHCO_26075 1.0 q4 patch version B.11.20f
PHCO_26823 B.11.00.16 HP Array Manager/60 cumulative patch
PHCO_28069 1.0 q4 patch version B.11.22l
PHSS_25714 1.0 Shared libF90 B.11.01.12
PHSS_28706 1.0 ANSI C compiler B.11.11.06 cumulative patch
Predictive C.11.00.24.01 HP Predictive Support
a5:/u/usr/merijn 107 >

Enjoy, Have FUN! H.Merijn
Muthukumar_5
Honored Contributor

Re: Regular expression matching

Hi,

echo "B.11.11" | grep -E "?.10.*|?.11.*" causes a problem because the special characters ?, * and + need something in front of them in the regexp.

Check the regexp(5) man page for the special characters. Putting a space in front of each ? gives it a preceding character to apply to:

echo "B.11.11" | grep -E " ?.10.* | ?.11.*"

It will work now.

Regards,
Muthukumar.
Easy to suggest when don't know about the problem!
Simon Hargrave
Honored Contributor

Re: Regular expression matching

Hmmm, I think I need to explain what I'm trying to achieve.

I know swinstall will only let it install software destined for it. What I'm doing is scripting the automatic updating of a software asset database. Basically swlist on all our servers, and populate a database with information about our software base.

Within the database, we have flags for each software saying whether it's "10.20", "11.00", "11.11" etc compatible. It can be one or more of these.

What I need to do is, for any given regexp extracted from swlist, ask whether it's compatible with each of those releases in turn, and set TRUE if it is.

So you see, I can't use different regexps, I can't really change them (though I guess I could do a sed to replace "?" with " ?" but would that solve it?)

See?

Cheers, Sy
john korterman
Honored Contributor

Re: Regular expression matching

Hi Simon,
the reason why

echo "B.11.11" | grep -E "?.10.*|?.11.*"

fails is that -E specifies a so-called extended regular expression, and in an extended regular expression the question mark means zero or one occurrences of the preceding character. If there is no preceding character, the expression fails, hence the need for a space (or another character).

regards,
John K.


it would be nice if you always got a second chance
Simon Hargrave
Honored Contributor

Re: Regular expression matching

Hmm, I can see this isn't going to be easy, then.

According to the regexp manpage, as you say, "?" means "zero or one of the previous character", and "*" means "zero or more of the previous character".

However, in the patterns from swlist, the "*" means "anything" and the "?" means any one character (like "." in regexp), and "." actually should mean the "." character rather than "any one character", as in regexp.

Erk, this isn't going to be as easy as I'd hoped!

One (crude) solution I've found is to create a temporary directory with files called B.10.20, B.11.00, B.11.11 etc, then do ls pattern_from_swlist, which displays the releases that match. But this is crude, and doesn't work with the |, so the pattern would need cutting up and executing once for each alternative.

There must be a cleaner way?

touch B.10.20 B.11.00 B.11.11 B.11.23
ls [AB].1?.*
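
The same glob matching can be done without temporary files, because the shell's case statement uses the same pattern syntax as ls. The following is a minimal sketch, assuming the swlist patterns are shell-style globs separated by "|" as described above; the function name matches_release is made up for illustration.

```shell
#!/bin/sh
# matches_release PATTERN RELEASE
# Succeeds if RELEASE matches any "|"-separated glob in PATTERN.
# (Hypothetical helper; uses global variables because POSIX sh has no "local".)
matches_release() {
    pattern=$1
    release=$2
    old_ifs=$IFS
    IFS='|'     # split the pattern into its alternatives
    set -f      # do not expand * and ? against files in the current directory
    for alt in $pattern; do
        case "$release" in
            # Unquoted $alt is treated as a glob and must match the whole string.
            $alt) IFS=$old_ifs; set +f; return 0 ;;
        esac
    done
    IFS=$old_ifs
    set +f
    return 1
}

# Ask the question for each release in turn.
for rel in B.10.20 B.11.00 B.11.11 B.11.23; do
    if matches_release '?.10.*|?.11.*' "$rel"; then
        echo "$rel TRUE"
    else
        echo "$rel FALSE"
    fi
done
```

Unlike grep, case matches the whole string, so a pattern such as B.11.1 would not accidentally match B.11.11.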
john korterman
Honored Contributor

Re: Regular expression matching

Hi again,
have you considered something like this:

#!/usr/bin/sh
swlist -l patch | while read line
do
    case "$line" in
        *B.10.20*|*B.11.00*|*B.11.11*|*B.11.23*) echo "$line" ;;
    esac
done

regards,
John K.
it would be nice if you always got a second chance
Muthukumar_5
Honored Contributor

Re: Regular expression matching

Hi,

To list the installed products for an OS version such as 10.00, 11.00, 11.11, 11.22 or 11.23, choose the regexp like this:

11.23
swlist | grep -v "^#" | grep -v "^$" | grep -E "[A-Z]?\.11\.23" >> /tmp/11.23_products.log

11.22
swlist | grep -v "^#" | grep -v "^$" | grep -E "[A-Z]?\.11\.22" >> /tmp/11.22_products.log

Change the regexp pattern only.
"[A-Z]?\.11\.0"

To get 11.* version products
"[A-Z]?\.11\.[1-9]?"

[A-Z]? - zero or one uppercase letter A-Z
\.11\. - the literal text .11. after that
[1-9]? - zero or one digit from 1 to 9

[A-Z] - we can use [[:upper:]] instead, and
[1-9] - we can use [[:digit:]] (which also includes 0), as in the regexp man page.

Regards,
Muthukumar
Easy to suggest when don't know about the problem!
Simon Hargrave
Honored Contributor

Re: Regular expression matching

Cheers for the answers all, I've now figured out a solution.

Basically I'm translating the "swlist expression" into a regexp by translating: -

all "." to "\." (fixed dots)
all "?" to "." (any single character)
all "*" to ".*" (zero or more "anythings")

This seems to work, giving regexps as follows which I can use in awk: -

* ----> .*
?.10.* ----> .\.10\..*
?.10.*|?.11.* ----> .\.10\..*|.\.11\..*
?.10.20 ----> .\.10\.20
?.11.* ----> .\.11\..*
?.11.00 ----> .\.11\.00
?.11.1* ----> .\.11\.1.*
?.11.11 ----> .\.11\.11
?.11.?? ----> .\.11\...
?.11.[01]* ----> .\.11\.[01].*
?.11.[01]? ----> .\.11\.[01].
?.1?.* ----> .\.1.\..*
?.1?.[0-9][0-9] ----> .\.1.\.[0-9][0-9]
B.10.00|B.10.01|B.10.10|B.10.20 ----> B\.10\.00|B\.10\.01|B\.10\.10|B\.10\.20
B.10.01|B.10.10|B.10.20 ----> B\.10\.01|B\.10\.10|B\.10\.20
B.10.10|B.10.20 ----> B\.10\.10|B\.10\.20
B.10.20 ----> B\.10\.20
B.11.00 ----> B\.11\.00
B.11.11 ----> B\.11\.11
[AB].1?.?? ----> [AB]\.1.\...
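
The three translation rules above can be sketched with sed. This is only an illustration: the function names glob_to_ere and release_matches are made up, the ^( )$ anchoring is an addition that is not part of the rules above, and a "." inside a bracket expression would be escaped incorrectly (none of the patterns listed contain one).

```shell
#!/bin/sh
# glob_to_ere PATTERN: translate a swlist-style pattern into an ERE.
# The order matters: escape the literal dots first, then turn ? and *
# into their regexp equivalents.
glob_to_ere() {
    printf '%s\n' "$1" | sed -e 's/\./\\./g' -e 's/?/./g' -e 's/\*/.*/g'
}

# release_matches PATTERN RELEASE: succeeds if RELEASE matches PATTERN.
# Anchoring with ^( )$ makes each alternative match the whole release string
# (an assumption added here, not one of the three rules).
release_matches() {
    ere=$(glob_to_ere "$1")
    printf '%s\n' "$2" | grep -E -q "^(${ere})\$"
}

glob_to_ere '?.10.*|?.11.*'

if release_matches '?.10.*|?.11.*' 'B.11.11'; then
    echo "B.11.11 TRUE"
fi
```

grep -E is used here only for illustration, as earlier in the thread; the translated expression can be fed to awk in the same way.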
Rodney Hills
Honored Contributor

Re: Regular expression matching

If you are trying to build a list of products with the associated OS version for searches, then how about simplifying the OS version into a simple number. Here is a perl script that can do that.

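# Map each OS release prefix to a small index number.
%vers=("B.10.20",1,"B.11.00",2,"B.11.11",3,"B.11.20",4);
open(INP,"swlist -R|") or die "cannot run swlist: $!";
while(<INP>) {
chomp;
# Skip lines that do not look like "name revision description".
next unless /^(.).(\S+)\s+(\S+)\s+(.*)/;
# Skip comment lines.
next if $1 eq "#";
$name=$2; $ver=$3; $desc=$4;
# Only the first 7 characters of the revision (e.g. "B.11.11") select the index.
if ($ix=$vers{substr($ver,0,7)}) { print join("\t",$ix,$name,$ver,$desc),"\n"; }
}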

This script will parse the swlist output and classify the OS version, as long as you include the possible OS versions in the "%vers" hash and map each one to a unique number.

Then you get an output like this-
2 HPUXBaseAux.OBAM.OBAM-WEB B.11.00.05.3.06 Web Application Server Support
3 HPUXBaseAux.Judy-lib.JUDY B.11.11.04.13 Judy Library and Related files 20020319.213255 PA
3 HPUXBaseAux.Judy-lib.JUDY-COMMON B.11.11.04.13 Judy Library and Related COMMON files 20020319.213255
3 HWEnable11i B.11.11.0306.4 Hardware Enablement Patches for HP-UX 11i, June 2003

"2" is for 11.00 and "3" is for 11.11. The script only looks at the first 7 characters of the revision. Send the output of this script to a file and you can scan based on the mapped numbers for specific OS version.

HTH

-- Rod Hills
There be dragons...