Operating System - HP-UX

Script needed to handle numbers with wildcards

 
SOLVED

Hi pals,

I have to write a script that meets the
following requirements:
1. The script is called 'Update'.
2. It will remove the entries for 7-digit numbers
from a file laid out in the following fashion:
"0051", "004", ".....
"0051", "005", ".....
"0359", "001", ".......
"1130", "003", "....
: : : : : : :
: : : : : : :
: : : : : : :
"1140", "203", ".....
3. (The interesting requirement) "Update" takes
the number as one of its arguments, but the
number given will contain the wildcard characters
"?" and "*", in the following fashion:
0051* : This must match all numbers from 0051001
to 0051999
00?1001 : This must match 0001001, 0002001,
0003001, ..., 0009001
1*001 : This must match 1000001, 1001001, ...,
1999001
This is similar to Unix filename handling
with wildcards (metacharacters).
4. The command line will look like:
# Update 10?13*

Any perl (HP-UX 11.00 default), sed or awk
script is welcome. Points guaranteed!

thx
Ravi
Jdamian
Respected Contributor

Re: Script needed to handle numbers with wildcards

Your script should parse the input pattern and replace the '?' and '*' characters with their equivalent patterns for 'grep', 'sed', or 'awk' ('grep -E' and awk can handle extended regular expressions; sed cannot).

I would convert the input pattern to an extended regular expression:

print '00*51?' | sed -e 's/?/[[:digit:]]/g' -e 's/\*/[[:digit:]]+/g'

This will print the pattern to use in 'grep -E' or 'awk':

00[[:digit:]]+51[[:digit:]]

I assume that '?' in an input pattern means 'exactly one numeric char' and that '*' means 'one or more numeric chars'.
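
The conversion described in this reply could be wrapped in a small shell function. This is only a sketch: the function name wildcard_to_ere is made up for illustration, [0-9] stands in for [[:digit:]], and '*' is read as 'one or more digits', which is the assumption stated in this reply.

```shell
#!/bin/sh
# Hypothetical helper: turn a number pattern containing '?' and '*' into an
# extended regular expression suitable for 'grep -E' or 'awk'.
wildcard_to_ere() {
    # '?' -> exactly one digit, '*' -> one or more digits (assumed meaning).
    # In a basic regular expression '?' is a literal, while '*' needs a backslash.
    printf '%s\n' "$1" | sed -e 's/?/[0-9]/g' -e 's/\*/[0-9]+/g'
}

# Quote the pattern so the shell does not expand it as a filename glob.
wildcard_to_ere '00*51?'    # prints 00[0-9]+51[0-9]
```

Quoting the argument matters: an unquoted 00*51? would be expanded by the shell if a file with a matching name happened to exist in the current directory.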

Re: Script needed to handle numbers with wildcards

Thanks Damian,
Your assumptions are right: * represents one or
more digits and ? represents one digit.
Your solution works to some extent. The numbers
are of fixed length: 7. But your search can
give me numbers such as:
# cat ii
0000510
0099519
0095119
# grep -E '00[[:digit:]]+51[[:digit:]]' ii
0000510
0099519
0095119

But for my search string (00*51?), I'm not
expecting "0095119", as this number should only
be matched by "00*51??". Also, the number size is
fixed at 7 digits.

Hope my requirement is more clear now.
thanks,
ravi
Jdamian
Respected Contributor
Solution

Re: Script needed to handle numbers with wildcards

You should add the metacharacters used as anchors:
^ as first pattern char
$ as last pattern char

# grep -E '^00[[:digit:]]+51[[:digit:]]$' ii

As in your example, I assume the text file contains exactly ONE 7-digit number on each line. If not, the pattern needs to be enhanced.
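
Combining the anchors with the fixed length of 7, a filter might look like the sketch below. The function name match_numbers is hypothetical, [0-9] replaces [[:digit:]], and the input is assumed to hold one number per line, like the file ii earlier in the thread.

```shell
#!/bin/sh
# Hypothetical filter: print the 7-digit lines of a file that match a
# wildcard pattern ('?' = one digit, '*' = one or more digits).
match_numbers() {
    pattern=$1
    file=$2
    ere=$(printf '%s\n' "$pattern" | sed -e 's/?/[0-9]/g' -e 's/\*/[0-9]+/g')
    # Keep only lines that are exactly 7 digits, then apply the anchored
    # pattern so that it has to cover the whole line.
    grep -E '^[0-9]{7}$' "$file" | grep -E "^${ere}\$"
}

# Usage (quote the pattern so the shell leaves it alone):
#   match_numbers '00*51?' ii
```

With the three numbers from ii, this should print 0000510 and 0099519 for '00*51?', and only 0095119 for '00*51??'.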

Francisco J. Soler
Honored Contributor

Re: Script needed to handle numbers with wildcards

Hi,
I need a more detailed explanation. You say that the script will remove the entries of 7-digit numbers, but in the example there aren't any 7-digit entries. Must the script join two or more entries into one number, or not?

Frank
Linux?. Yes, of course.

Re: Script needed to handle numbers with wildcards

Damian's previous reply almost hit the bull's
eye. Maybe I'm a bit greedy, but I need the search
to match exactly the format I asked for the first
time (I'm poor at grep/awk/sed/perl).

The search string must match entries in exactly
the following form:
"0000", "510"
"0099", "519"

Damian, you can earn some more points!!!!

thx
Ravi

Ramkumar Devanathan
Honored Contributor

Re: Script needed to handle numbers with wildcards

Damian,

Since * typically refers to 0 or more occurrences of a character, it'd be better to replace it with [[:digit:]]* rather than [[:digit:]]+.

? however refers to a single character pattern and so it'd be replaced by [[:digit:]].

so it'd be thus -

$ cat nos.txt
12192,293823,2938292,3439483
12192,293823,2938292,3439483
12192,293823,2938292,3439483

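$ cat Update
# Update
regex=`echo "$1" | sed -e 's/?/[[:digit:]]/g' -e 's/\*/[[:digit:]]*/g'`

# print every comma-separated field that is exactly 7 characters long
# and matches the pattern
awk -F',' -v re="^$regex\$" '{for (i=1;i<=NF;i++) if (length($i)==7 && $i ~ re) print $i}' "$2"
# EOF


$ Update '29?82*' nos.txt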
2938292
2938292
2938292

$

HTH.

- ramd.
HPE Software Rocks!
H.Merijn Brand (procura
Honored Contributor

Re: Script needed to handle numbers with wildcards

Since you also ask for perl, here's Damian's equivalent:

# grep -E '^00[[:digit:]]+51[[:digit:]]$' ii

equals

# perl -ne'/^00\d+51\d$/ and print' ii

And Ramkumar's equiv of

--8<--- update
regex=`echo "$1" | sed -e 's/?/[[:digit:]]/g' -e 's/\*/[[:digit:]]*/g'`
awk -F',' -v re="^$regex\$" '{for (i=1;i<=NF;i++) if (length($i)==7 && $i ~ re) print $i}' "$2"
-->8---

could be
--8<--- update
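#!/opt/perl/bin/perl -l
($pat=shift)=~s/\?/\\d/g;   # ? -> one digit
$pat=~s/\*/\\d*/g;          # * -> zero or more digits
while(<>){
chomp;
for(split/,/){
# print fields that are exactly 7 digits and match the whole pattern
/^\d{7}$/&&/^$pat$/ and print;
}
}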
-->8---

or if you have a more recent perl (5.6.0 or up)

--8<--- update
#!/opt/perl/bin/perl -l
($pat=shift)=~s/\?/\\d/g;
$pat=~s/\*/\\d*/g;
$pat=qr/$pat/;
while(<>){
chomp;
for(split/,/){
/^(?=\d{7}$)$pat$/ and print;
}
}
-->8---

Enjoy, have FUN! H.Merijn

Re: Script needed to handle numbers with wildcards

Ramkumar, your suggestion of using * rather than
+ was fine, but your script didn't work for me.
I'm now also looking for a solution which will work
for number patterns in the format:
"0001", "001", ....
"0005", "190", ....

Note that the numbers are not contiguous, but are
split into groups of 4 and 3 digits.

rgds,
Ravi
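
For rows in that split format, one possible approach is to join the first two fields before matching. The sketch below is an illustration only: the function name match_split_rows is hypothetical, the rows are assumed to be comma-separated with the 4-digit and 3-digit groups in the first two fields, [0-9] replaces [[:digit:]], and '*' is read as 'zero or more digits'.

```shell
#!/bin/sh
# Hypothetical filter for rows such as:  "0099", "519", "..."
# It joins the first two quoted fields into one 7-digit number and prints
# the rows whose number matches the wildcard pattern.
match_split_rows() {
    pattern=$1
    file=$2
    # '?' = one digit, '*' = zero or more digits (assumed meaning)
    ere=$(printf '%s\n' "$pattern" | sed -e 's/?/[0-9]/g' -e 's/\*/[0-9]*/g')
    awk -F',' -v re="^${ere}\$" '
        {
            a = $1; b = $2
            gsub(/[^0-9]/, "", a)      # drop quotes and blanks
            gsub(/[^0-9]/, "", b)
            num = a b                  # "0099" and "519" give 0099519
            if (length(num) == 7 && num ~ re) print
        }' "$file"
}

# Usage:
#   match_split_rows '00*51?' datafile
```

Because the whole row is printed, the same test can be negated to keep only the rows that do not match.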
Ramkumar Devanathan
Honored Contributor

Re: Script needed to handle numbers with wildcards

Ravi,

It'd be good if you could tell us in one shot exactly what you need to do.

I've seen scope expansion in customer requirements before, but it's not a great way to get things done... ;)

I understand why my script wouldn't have worked for you: with Damian's help, I found a way to pick out all the strings of length 7 which match the given pattern and print 'em on screen. These strings will not get removed from the text file.
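
To actually remove the matches rather than print them, the filter can be inverted and its output written back over the file. This is a rough sketch under stated assumptions: the name remove_numbers is hypothetical, the file holds one number per line, '*' means 'zero or more digits', and the file is replaced via a temporary copy in the same directory.

```shell
#!/bin/sh
# Hypothetical sketch: delete from a file every line that consists of a
# 7-digit number matching the wildcard pattern, keeping all other lines.
remove_numbers() {
    pattern=$1
    file=$2
    ere=$(printf '%s\n' "$pattern" | sed -e 's/?/[0-9]/g' -e 's/\*/[0-9]*/g')
    tmp="${file}.tmp.$$"
    # Keep a line unless it is exactly 7 digits long and matches the pattern.
    awk -v re="^${ere}\$" '!(length($0) == 7 && $0 ~ /^[0-9]+$/ && $0 ~ re)' \
        "$file" > "$tmp" && mv "$tmp" "$file"
}

# Usage:
#   remove_numbers '00*51?' ii
```

Keeping a backup copy of the file before running something like this would be prudent, since the original is overwritten.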

I understand you are a scripting newbie and I trust you will really be interested in learning awk/perl/shell/sed. You can start here and now, coz you've got examples in all of those scripting languages.

- ramd.
HPE Software Rocks!
Jdamian
Respected Contributor

Re: Script needed to handle numbers with wildcards

Ravi,

Ramkumar wrote

"Since * refers typically to 0 or more occurrences of a character..."

and you agreed.

But at the end of my first reply I wrote

"the meaning of '*' is 'one or more numeric chars'"

and you agreed too.

Ravi, another issue -- you wrote

" Damian, you can earn some more points!!!! "

If you think I'm replying to your post for a few miserable points, you are wrong.

I think you posted this thread to get somebody to write the "Update" script for you.

My responses were only meant as a guideline for you... I hoped you would be able to write the script by yourself.

Re: Script needed to handle numbers with wildcards

Damian, I'm sorry if I've hurt you with that comment.
I only meant to tell people that I'm honest in awarding
points, unlike some people who get their solution
from the forum and never even think about saying
thanks! I'm really sorry for that and will
take care next time.

Sorry Procura, the perl flavor I have here on
my HP-UX 11.00 box doesn't work with your scripts.
My version of perl is:
----------------------------------------------
# what /usr/contrib/bin/perl
/usr/contrib/bin/perl:
mathd_atan.s $Revision: 1.18 $
mathd_cossin.s $Revision: 1.19 $
mathd_log.s $Revision: 1.23 $
cd_error.c $Revision: 1.33 $
9.X nsswitch patch Rev B
----------------------------------------------
Ramkumar, you are right; I'm new to scripting. I can say
I've been improving since I registered on this forum.
Well, pals, thanks for all your replies. I almost
have a workable solution here. Thanks a lot for
all your time.
Ramkumar Devanathan
Honored Contributor

Re: Script needed to handle numbers with wildcards

hi Ravi,

Install perl from this location - it says install for free....

http://www.software.hp.com/cgi-bin/swdepot_parser.cgi/cgi/displayProductInfo.pl?productNumber=PERL&oper=install

Don't use the perl in /usr/contrib/bin/perl - that's, as you found out, an old version. I wonder why they ship it at all.

- ramd.
HPE Software Rocks!
H.Merijn Brand (procura
Honored Contributor

Re: Script needed to handle numbers with wildcards

Precompiled perl is available from many places for 11.00 and 11i. I've got perl 5.8.0 ports for 10.20 and 11.00 on http://www.cmve.net/~merijn or https://www.beepz.com/personal/merijn, but the worldwide HP porting centers (http://hpux.connect.org.uk/hppd/hpux/Languages/perl-5.8.0/) also do a good job. The main differences between the ports are the installation location (/opt/perl vs. /usr/local) and the modules that come with the shipment.

/usr/contrib/bin/perl is perl version 4, and that is soooo historical that it is not worth any trouble. *Any* perl example given in this forum, not only from me, is bound to fail in perl4. Install perl5 and enjoy its rich set of features.

Enjoy, have FUN! H.Merijn