Michael Mike Reaser
Valued Contributor

Double-Quotes In awk

I have been handed a comma-separated file whose records look like

"124","124",,60.00,60.00,60.00,0.00,0.45,60.45,0.059000,"APP","00","EXC"

I need to awk or sed this stuff so its output looks like

"124","124","","60.00","60.00","60.00","0.00","0.45","60.45","0.059000","APP","00","EXC"

In other words, if a "field" is already surrounded by double-quotes, pass it thru. If the field is *NOT* surrounded by double-quotes, I need to add them. And, if a field between commas is empty, I need to output a placeholder pair of quotes.

I know my awk command would be something of the form

awk -F, -f [progfile]

and that [progfile] will be of the form

{
for(i=1;i<=NF;i++)
{
if(there's a quote already)
{printf("%s,",$i)}
else
{printf("\"%s\",",$i)}
}
printf("\n")
}

What's got me stumped is how to detect that field N already begins and ends with a double-quote. Pretty much anything I've tried

if(match($i,/\"/)!=0)

if($i~/^'"'/)

has caused awk to barf on another statement, presumably because I have unbalanced quotes where I'm not escaping something correctly (perhaps in that last mondo printf statement?).

Unfortunately, perl is not an option for me, and I'm not a high-enough muckety-muck to insist that it be installed. Therefore, I'm thinking I'm going to have to stick with awk, unless someone has some sed or other text-processing capital-m Magic they can share.

Help?
There's no place like 127.0.0.1

HP-Server-Literate since 1979
Dennis Handly
Acclaimed Contributor
Solution

Re: Double-Quotes In awk

>is how to detect that field N already begins and ends with a double-quote.

You can use substr to extract chars from strings:
if (substr($i,1,1) == "\"")
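
For anyone landing here later, the substr test can be dropped into the loop from the original question roughly as follows. This is only a sketch, assuming a POSIX-style awk and no commas inside the quoted fields; the function name quote_fields is made up for illustration. It also builds the line with a separator variable so there is no trailing comma.

```shell
# quote_fields: hypothetical helper name. Reads CSV on stdin, writes CSV on stdout.
# Assumes no commas are embedded inside quoted fields.
quote_fields() {
  awk -F, '
  {
    out = ""
    sep = ""
    for (i = 1; i <= NF; i++) {
      f = $i
      n = length(f)
      # Already quoted: at least two chars, first and last both double-quotes.
      quoted = (n >= 2 && substr(f, 1, 1) == "\"" && substr(f, n, 1) == "\"")
      if (!quoted)
        f = "\"" f "\""
      out = out sep f
      sep = ","
    }
    print out
  }'
}

# Sample record from the question.
printf '%s\n' '"124","124",,60.00,60.00,60.00,0.00,0.45,60.45,0.059000,"APP","00","EXC"' | quote_fields
```

Keeping the whole awk program inside single quotes means the only escaping needed is \" inside awk string literals, which avoids the unbalanced-quote errors described in the question.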
James R. Ferguson
Acclaimed Contributor

Re: Double-Quotes In awk

Hi Mike:

Try:

# awk '{gsub(/,,/, ",\"\",");print}' file

Regards!

...JRF...
Michael Mike Reaser
Valued Contributor

Re: Double-Quotes In awk

>You can use substr to extract chars from strings

D'oh. Thank you, Dennis. As you may or may not recall, I have quite a knack for figuring out the Most Complicated Way Possible to accomplish tasks. Cutting thru the muck and just testing substr($i,1,1) and substr($i,length($i),1) for quotes Did The Trick.
Hein van den Heuvel
Honored Contributor

Re: Double-Quotes In awk

Are there potentially embedded commas or double-quotes in the quoted strings?

If not, just strip all double-quotes, replace each comma with "," and add a double-quote at the start and end of the line.

Are you sure perl is not simple on the box?
Anyway...

$ cat tmp.txt
"124","124",,60.00,60.00,60.00,0.00,0.45,60.45,0.059000,"APP","00","EXC"

$ awk '{gsub(/"/,""); gsub(/,/,"\",\""); print "\"" $0 "\""}' tmp.txt
"124","124","","60.00","60.00","60.00","0.00","0.45","60.45","0.059000","APP","00","EXC"

hth,
Hein.

$ perl -pe 's/"//g; s/,/","/g; s/^/"/; s/$/"/' tmp.txt
"124","124","","60.00","60.00","60.00","0.00","0.45","60.45","0.059000","APP","00","EXC"
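
Since the question also asked about sed, the same strip-and-requote idea might be written as below. This is a sketch only; the function name requote_all is made up, and it carries the same caveat as the awk one-liner: it assumes no commas or double-quotes are embedded inside the quoted strings.

```shell
# requote_all: hypothetical helper name. Reads CSV on stdin, writes CSV on stdout.
requote_all() {
  # 1) drop every double-quote, 2) turn each comma into ",",
  # 3) add a quote at the start of the line, 4) add a quote at the end.
  sed -e 's/"//g' -e 's/,/","/g' -e 's/^/"/' -e 's/$/"/'
}

printf '%s\n' '"124","124",,60.00,"APP"' | requote_all
```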
Michael Mike Reaser
Valued Contributor

Re: Double-Quotes In awk

JRF:> awk '{gsub(/,,/, ",\"\",");print}' file

Thankee! That *almost* got me there, but missed putting quotes around the non-quoted numeric fields. :-/ However, Hein's suggestion worked like a charm. :-)

(And I know the "no Perl" stuff had to have driven you nutty. It's not got me all that happy, myself, but I've got to work within the handcuffs I've been handed...)
Michael Mike Reaser
Valued Contributor

Re: Double-Quotes In awk

Hein:> Are there potentially embedded commas or double-quotes in the quoted strings?

Double-quotes, no. Commas, it doesn't *appear* that commas are An Issue, but all I've been handed thus far is "toy data".
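
If the real data does turn out to have commas inside quoted fields, splitting with -F, will cut those fields apart. One possible workaround, sketched here under the assumption of a POSIX-style awk with user-defined functions and no double-quotes embedded in the data, is to walk each record one character at a time and only treat a comma as a separator when it is outside quotes. The names quote_fields_csv and emit are made up for illustration.

```shell
# quote_fields_csv: hypothetical helper name. Reads CSV on stdin, writes CSV on stdout.
quote_fields_csv() {
  awk '
  # Return the field prefixed by lead, adding surrounding quotes only if missing.
  function emit(f, lead,    n) {
    n = length(f)
    if (!(n >= 2 && substr(f, 1, 1) == "\"" && substr(f, n, 1) == "\""))
      f = "\"" f "\""
    return lead f
  }
  {
    out = ""; field = ""; sep = ""; inq = 0
    len = length($0)
    for (p = 1; p <= len; p++) {
      c = substr($0, p, 1)
      if (c == "\"")
        inq = !inq              # toggle quoted state; the quote stays in the field
      if (c == "," && !inq) {   # a comma outside quotes ends the current field
        out = out emit(field, sep)
        field = ""; sep = ","
      } else {
        field = field c
      }
    }
    print out emit(field, sep)  # flush the last field of the record
  }'
}

printf '%s\n' '"a,b",,1.5,"X"' | quote_fields_csv
```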

Hein:> Are you sure perl is not simple on the box?

Unfortunately, yes, I'm sure. :-( :-( :-(
Michael Mike Reaser
Valued Contributor

Re: Double-Quotes In awk

A combination of Dennis's, JRF's and Hein's suggestions got me over the hump. As always, thank you!!!