Operating System - HP-UX

regular expressions sed and awk

Fred Martin_1
Valued Contributor

regular expressions sed and awk

I need to use sed or awk to remove characters from a string. Basically, I want everything removed that's not a letter, number, or space.

So this string:

Hill, "Lefty" Brad (and company)

Becomes:

Hill Lefty Brad and company

No need to worry about other white space, there aren't any tabs etc.

I want to keep: A-Za-z0-9
and the space (decimal 32) and the rest can go.

Any way of doing this without explicitly naming 'the rest'?
fmartin@applicatorssales.com
Hein van den Heuvel
Honored Contributor
Solution

Re: regular expressions sed and awk

echo 'Hill, "Lefty" Brad (and company)' | tr -c -d 'A-Za-z0-9 '

Hein.
Biswajit Tripathy
Honored Contributor

Re: regular expressions sed and awk

I think Hein's solution using 'tr' is a lot more compact and
nicer, but if you must do it in sed, here is how you do
it:

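str1='Hill, "Lefty" Brad (and company)'
# Collect the unwanted characters: whatever is left after deleting letters, digits, and spaces
str2=$(echo "$str1" | sed 's/[A-Za-z0-9 ]//g')
# Use those characters as a bracket expression and delete them from the original string
str3=$(echo "$str1" | sed "s/[$str2]//g")
echo "$str3"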

You could read str1 from your input file directly.
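
For comparison, a minimal sketch of doing it in one pass, assuming your sed and awk accept a negated bracket expression (a leading ^ inside [...]). That way 'the rest' never has to be collected or named. The variable names sed_out and awk_out are only for illustration.

```shell
# Sample input taken from the original question.
str1='Hill, "Lefty" Brad (and company)'

# sed: a leading ^ inside [...] negates the class, so this deletes
# every character that is NOT a letter, digit, or space.
sed_out=$(printf '%s\n' "$str1" | sed 's/[^A-Za-z0-9 ]//g')

# awk: gsub() applies the same negated class to each input line.
awk_out=$(printf '%s\n' "$str1" | awk '{ gsub(/[^A-Za-z0-9 ]/, ""); print }')

printf '%s\n' "$sed_out" "$awk_out"
```

Both sed and awk work line by line, so the newline at the end of each line is left alone.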

- Biswajit
:-)
Biswajit Tripathy
Honored Contributor

Re: regular expressions sed and awk

Or make it a little more compact:

str1='Hill, "Lefty" Brad (and company)'
echo "$str1" | sed "s/[$(echo "$str1" | sed 's/[A-Za-z0-9 ]//g')]//g"

- Biswajit
:-)
Fred Martin_1
Valued Contributor

Re: regular expressions sed and awk

Nicely done with 'tr' - I was unaware of those switches so I hadn't considered using it.

However, the command also appears to remove the newline at the end of the string.

How can that be prevented?
fmartin@applicatorssales.com
H.Merijn Brand (procura
Honored Contributor

Re: regular expressions sed and awk

Easy, add it to the list of characters to keep:

lt09:/home/merijn 122 > cat xx.txt
Hill, "Lefty" Brad (and company)
lt09:/home/merijn 123 > tr -c -d '[A-Za-z0-9 ]' < xx.txt
Hill Lefty Brad and companylt09:/home/merijn 124 > tr -c -d '[A-Za-z0-9 \n]' < xx.txt
Hill Lefty Brad and company
lt09:/home/merijn 125 >
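
A small sketch of why that happens, using a made-up three-character input: tr sees the newline as just another character, so with -c -d it is deleted unless it is in the set to keep. Counting the bytes that come out makes the difference visible. The variable names are only for illustration.

```shell
# Input is "a,b" plus a newline: 4 bytes in total.
# Without \n in the set, both the comma and the newline are deleted.
without_nl=$(printf 'a,b\n' | tr -c -d 'A-Za-z0-9 ' | wc -c)

# With \n in the set, only the comma is deleted.
with_nl=$(printf 'a,b\n' | tr -c -d 'A-Za-z0-9 \n' | wc -c)

# $(( ... )) strips any padding that wc adds on some systems.
printf 'bytes kept without newline in the set: %d\n' "$((without_nl))"
printf 'bytes kept with newline in the set:    %d\n' "$((with_nl))"
```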

Is perl an option too?

lt09:/home/merijn 126 > perl -ple's/[^\w ]+//g' < xx.txt
Hill Lefty Brad and company
lt09:/home/merijn 127 >

Enjoy, Have FUN! H.Merijn
Fred Martin_1
Valued Contributor

Re: regular expressions sed and awk

Ahh, the \n thing did the trick. Many thanks.
fmartin@applicatorssales.com