Operating System - HP-UX

Re: AWK script for more than 200 fields

 
OldSchool
Honored Contributor

Re: AWK script for more than 200 fields

Dennis > "Sure it can, it should be obvious what the message is and why. Though from the input, I don't see that many fields. "

I had one of those "Well...duh!" moments....

I didn't particularly *look* at the input, only that he used $0 and a substring, and not $200, $250 or $NF. Oh well...

Just out of curiosity, I did run JRFs stuff thru GNU awk, which *didn't* have a problem with the file at all and produced correct results.

Then I had it run '{print NF}' on the sample data posted, with the default separators...the max was 46. So the sample doesn't appear to be "representative".

Of course, given that he *doesn't* reference anything but $0, the "-F" should certainly get him going, one would think (at least until the length of a record becomes an issue).
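
For anyone wanting to repeat the field-count check, here is a minimal sketch. The file name and its contents are invented for illustration (they are not the poster's data); the awk program simply tracks the largest NF seen with the default separators.

```shell
# Hypothetical sample data; the file name and contents are invented for illustration.
sample=/tmp/nf_sample.$$
printf 'a b c\nd e\nf g h i\n' > "$sample"

# Track the largest NF seen across all records, using awk's default separators.
max_nf=$(awk 'NF > max { max = NF } END { print max + 0 }' "$sample")
echo "max fields per record: $max_nf"

rm -f "$sample"
```

If the maximum reported for the real file is well under the awk's field limit, the posted sample is probably not the data that triggers the error.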
Michael Mike Reaser
Valued Contributor

Re: AWK script for more than 200 fields

Urgh. Dennis, when I was in Atlanta, do you remember how I had a great propensity for "can't see the forest for the trees"? Yep, I still do it. :-P

Duhr. Duhr duhr duhr duhr. Duhr.
There's no place like 127.0.0.1

HP-Server-Literate since 1979
Aishwarya P
New Member

Re: AWK script for more than 200 fields

If I use gawk, it says "gawk not found"; also, there is a man page entry for gawk.
How do I upgrade the system to use the gawk option?

The server specification:

HP-UX dhp0037 B.10.20 A 9000/851 2013207678 two-user license

Dennis Handly
Acclaimed Contributor

Re: AWK script for more than 200 fields

>How to upgrade the system to use gawk option:

gawk == gnu awk, which you said wasn't supported and now, not installed. (Did you look in /usr/local/bin/*awk?)

There is no need to use gawk, if you just use -F"dummy separator".
OldSchool
Honored Contributor

Re: AWK script for more than 200 fields

"If I use gawk, it says gawk not found...."

which means that either "gawk" isn't installed, or that it can't be found in $PATH. But as noted previously, on several occasions, all you need to do is use the "-F" switch and set the field separator to something not used in the data (perhaps "|").

that should suffice to eliminate the "too many fields" error.
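
A small sketch of the effect, using a made-up record (not the poster's data) and assuming "|" never occurs in the input:

```shell
# Hypothetical record with several whitespace-separated words and no "|" in it.
record='AB one two three four five six'

# Default separators: awk splits on blanks, so NF counts every word.
default_nf=$(printf '%s\n' "$record" | awk '{ print NF }')

# With -F"|" and no "|" in the data, the whole record stays in one field,
# while $0 and substr($0, ...) behave exactly as before.
pipe_out=$(printf '%s\n' "$record" | awk -F"|" '{ print NF, substr($0, 1, 2) }')

echo "$default_nf"
echo "$pipe_out"
```

The record is never split into more than one field, so a per-record field limit should not come into play as long as the script only touches $0.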
James R. Ferguson
Acclaimed Contributor

Re: AWK script for more than 200 fields

Hi:

While you can use any character or regular expression for your inter-field delimiter, I chose the _nul_ character as one that seems highly unlikely to be found in your data and thus the most likely to prevent 'awk' from splitting your input into too many fields.

This is the reason I wrote:

# awk -F"\000" '{print $1;print substr($0,1,2)}' /tmp/toomany

Regards!

...JRF...

Dennis Handly
Acclaimed Contributor

Re: AWK script for more than 200 fields

>JRF: While you can use any character or regular expression

Yes, that's why I suggested to use literally this long string: -F"dummy separator"

If your "fs" is more than one char, you go through the ERE engine. Your -F"\000" is an ERE.
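
To illustrate the difference, here is a hedged sketch with invented input strings: a single-character separator other than a space is taken literally, while a longer one is interpreted as an ERE. The separator "a+b" is chosen only because it contains a metacharacter; "dummy separator" contains none, so as an ERE it matches just itself.

```shell
# A one-character separator other than a space is taken literally...
single=$(printf 'x|y\n' | awk -F'|' '{ print NF }')

# ...but a longer separator is an ERE, so "a+b" means "one or more a, then b".
regex_hit=$(printf 'xaaby\n' | awk -F'a+b' '{ print NF }')

# The literal text "a+b" in the input is therefore NOT a separator.
regex_miss=$(printf 'xa+by\n' | awk -F'a+b' '{ print NF }')

echo "$single $regex_hit $regex_miss"
```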
James R. Ferguson
Acclaimed Contributor

Re: AWK script for more than 200 fields

Hi:

> Dennis: ...that's why I suggested to use literally this long string: -F"dummy separator" If your "fs" is more than one char, you go through the ERE engine. Your -F"\000" is an ERE.

OK, and do you say "ERE" because 'awk' supports the Posix ERE engine as opposed to the Posix RE engine?

Too, I could have (should have?) used simply :

"\0"

...in lieu of:

"\000"

Anyway, wouldn't the engine do _less_ work when attempting to match only one character than potentially matching the "d" in "dummy..." and then having to assess the "u", before (e.g. finding none) bumping to the next character in the input string and starting all over?

Regards!

...JRF...
Dennis Handly
Acclaimed Contributor

Re: AWK script for more than 200 fields

>I could have (should have?) used simply: "\0"

That's still two chars.

>wouldn't the engine do _less_ work when attempting to match only one character

Yes but it still switches to the ERE path.
If you used a control-A, it would be almost as unique as that NUL.
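
A minimal sketch of the control-A variant, assuming the data never contains that byte. The input line is made up, and the variable name `soh` is only illustrative:

```shell
# Build a one-byte separator (octal 001, control-A) without typing it literally.
soh=$(printf '\001')

# One character, so it is used as a plain separator; with no control-A in the
# input, the record stays in a single field and substr($0, ...) still works.
out=$(printf 'many words in one long record\n' |
      awk -F"$soh" '{ print NF; print substr($0, 1, 4) }')

echo "$out"
```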