Operating System - HP-UX

Aishwarya P
New Member

AWK script for more than 200 fields

The application we run is very old, and it uses an awk script to split a huge file into separate files based on the first two columns (01, 02, 03, 04, ...). Due to an enhancement, the number of fields in the input file has increased to nearly 250, so the awk script fails. Could someone please help with alternative code? The current script is below:

#
awk ' BEGIN {
    hold_tbl_code = "00"                        # initialise hold tbl code
}
{
    tbl_code = substr($0,1,2)                   # set variables
    #
    if (substr(tbl_code,2,1) != " ")
    {
        if (tbl_code == hold_tbl_code)          # if tbl code = old tbl code
        {
            print $0 > tbl_filename             # print record to file
        }
        else
        {
            if (hold_tbl_code == "00")          # if first dealer
            {                                   # continue
            }
            else
            {
                close(tbl_filename)             # close previous tbl file
            }
            #
            tbl_filename = sprintf("%32s%2s%4s",\
                "/apps/production/visa/data/table",\
                tbl_code,".dat")                # set tbl filename variable
            #
            print $0 > tbl_filename             # print data to file
            #
            hold_tbl_code = tbl_code            # move tbl code to hold dlr code
        }
    }
} '
#
Dennis Handly
Acclaimed Contributor

Re: AWK script for more than 200 fields

Possibly use gnu awk?

I also don't see you using any fields, so you could use -Fx, where "x" is a character that doesn't appear (or only rarely appears) in your data.
Aishwarya P
New Member

Re: AWK script for more than 200 fields

It is a very old system which does not support GNU awk. Is there any other option to split the file? Here is a sample of the input file. It is split into table01, table02, table03, ... based on the first two columns (01, 02, 03, 04, ...): all the 01 records are put into table01, all the 02 records into table02, and so on. The number of fields in each 03 record has now increased to 250.

01;VE;003301;020;2009_01_20;045502226;0280;2009_01_20;;0009;0;LS;;;2009_04_28;;N;A;;4373503;N;N;Y;2009_03_06;2009_04_06;2009_04_13;2009_04_07;2009_04_28;;;;;VL999 137777 06/03 20/04/09;
02;045502226;VE;8EZ19;538;;L331286;6G1MZ55Y69L331286;P;DOM;;690F;80U;PHANTOM BLACK;;;;51I;;;
03;VE;045502226;463 A45 A88 AGA AGB AHU AQ9 AW5 AX2 AY0 B13 B34 B35 BMK BSI C63 CE1 CX5 DL1 E20 EOF ESA EVG FE1 GW8 JL4 JL9 L76 MYC N10 N40 N87 NK4 NT3 P34 QWD RHD T81 TT7 U32 U71 UFR UWE UWQ V7O XW6;
04;045502226;VE;18324016;;;;SBP082660042;2099A1;92R003441;01575609;619CVA2015;;
05;045502226;93254552700;;;2B-61232;S0813;
07;045502226;0;VE;;2009_04_21;DELV;2009_04_07;;;;;2009_04_01;HBD OPTIONS=DLOK;
08;045502226;PREF;2009_02_16;;VE;
08;045502226;RELD;2009_03_03;;VE;
08;045502226;SCHD;2009_03_16;;VE;
08;045502226;CHK1;2009_04_06;1;VE;
08;045502226;CHK2;2009_04_06;1;VE;
08;045502226;CHK3;2009_04_07;1;VE;
08;045502226;CHK4;2009_04_07;1;VE;
08;045502226;CHK5;2009_04_07;1;VE;
08;045502226;CHK6;2009_04_07;1;VE;
08;045502226;BILT;2009_04_07;;VE;
08;045502226;TRAN;2009_04_07;;VE;
08;045502226;DELV;2009_04_07;;VE;
09;045502226;463;N;
09;045502226;A45;N;
09;045502226;A88;N;
09;045502226;AGA;N;
09;045502226;AGB;N;
09;045502226;AHU;N;
12;045502226;VE;VL999;;;;;;0280;;0280;VIC;;VIC;;;;GELZ;;
14;045502226;2009_01_19;0009;
14;045502226;2009_04_06;0280;
20;045502226;GELZ;GELZ;;;;;;E;Y;N;N;N;N;N;;0;;;N;;;;;;
01;VE;003306;304;2009_02_03;045504066;0280;2009_02_03;;0252;0;LS;;;2009_04_28;;N;A;;4372186;N;N;N;;2009_04_13;2009_04_22;2009_04_07;2009_04_28;;;;;VW000 487415 03/02 10/04/09 CHRIS JURESKO PET NET;
02;045504066;VE;8EX35;C55;;L331241;6G1EX85729L331241;P;DOM;;690F;80U;PHANTOM BLACK;;;;51I;;;
03;VE;045504066;463 A45 A88 AG3 AH8 AHU AQ9 AX2 AY0 B13 B34 B35 BMK BS2 BSI CE1 CJ2 CX5 DL1 E20 ESA EVF FE1 GW8 JL4 JL9 LY7 M82 N10 N40 N65 N87 NK4 NT3 QI7 RHD U32 U71 UFR UWE UWP V7O XW6 YE3;
04;045504066;VE;18273309;;;;LY7090770452;2209A1;92R003379;01576409;619HKGS219;;
05;045504066;93180665699;;;2B-61192;S0831;
07;045504066;0;VE;;2009_04_21;DELV;2009_04_07;;;;;2009_04_01;HBD OPTIONS=DCAR;
08;045504066;PREF;2009_02_16;;VE;
08;045504066;RELD;2009_03_03;;VE;
08;045504066;SCHD;2009_03_16;;VE;
08;045504066;CHK1;2009_04_06;1;VE;
08;045504066;CHK2;2009_04_06;1;VE;
08;045504066;CHK3;2009_04_07;1;VE;
08;045504066;CHK4;2009_04_07;1;VE;
08;045504066;CHK5;2009_04_07;1;VE;
08;045504066;CHK6;2009_04_07;1;VE;
08;045504066;BILT;2009_04_07;;VE;
08;045504066;TRAN;2009_04_07;;VE;
08;045504066;DELV;2009_04_07;;VE;
09;045504066;463;N;
09;045504066;A45;N;
09;045504066;A88;N;
09;045504066;AG3;N;
09;045504066;AH8;N;
Dennis Handly
Acclaimed Contributor

Re: AWK script for more than 200 fields

>Is there any other option to split the file?

Nothing to split, you don't have any fields other than $0 and one you create with substr.

As I said, you can use -F: or possibly -F"" to fool awk. Or -F"dummy separator".
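
A minimal sketch of that idea, not tested on HP-UX awk. The demo directory /tmp/awk_split_demo, the trimmed sample rows, and the "|" separator are my assumptions (pick any character that never occurs in your data), and it assumes your awk accepts -v. It also appends with ">>" after clearing old output, which departs from the posted script: as far as I can tell, reopening a file with ">" after close() truncates it, so a table code that shows up again later in the input would lose its earlier rows.

```shell
# Hypothetical demo directory and sample rows (shortened from the posted data).
workdir=/tmp/awk_split_demo
mkdir -p "$workdir"
rm -f "$workdir"/table*.dat          # start clean because we append below

cat > "$workdir/input.txt" <<'EOF'
01;VE;003301;020;
02;045502226;VE;8EZ19;
03;VE;045502226;463 A45 A88 AGA;
01;VE;003306;304;
EOF

# -F'|' : "|" is assumed absent from the data, so every record is a single
# field and awk never counts anywhere near its per-record field limit.
awk -F'|' -v outdir="$workdir" '
{
    tbl_code = substr($0, 1, 2)
    if (substr(tbl_code, 2, 1) == " ") next       # same filter as the original

    if (tbl_code != hold_tbl_code) {
        if (hold_tbl_code != "") close(tbl_filename)   # keep only one file open
        tbl_filename = outdir "/table" tbl_code ".dat"
        hold_tbl_code = tbl_code
    }
    print $0 >> tbl_filename                       # append, never truncate
}' "$workdir/input.txt"

ls "$workdir"
```

If your input is already grouped by table code, the only change the posted script really needs is the -F option on the awk command line.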
Aishwarya P
New Member

Re: AWK script for more than 200 fields

No, there is no other option. The file is split only on the first two columns, i.e. 01, 02, 03, 04, ... All of these are separate records. All the 01 records are sorted and sent to table01.dat; similarly, the 02, 03, 04, ... records are each stored in their own .dat file.
James R. Ferguson
Acclaimed Contributor

Re: AWK script for more than 200 fields

Hi:

If you can't install GNU 'awk' then what about using Perl? Perl doesn't suffer from arbitrary limits.

Regards!

...JRF...
OldSchool
Honored Contributor

Re: AWK script for more than 200 fields

As Dennis noted, you don't appear to be using "fields", you only have $0 and you do a substring.

So "no. of fields in the input file have increased and its nearly 250, so the awk script fails" can't be an accurate description of the problem.

What error message are you getting?

According to "Sed & Awk", approximate common limits for older awks (these are implementation-specific) are:

# fields / record: 100
# characters / record (in or out): 3000
# characters / field: 1024
# characters in printf string: 1024
# files open: 15

I suspect you're bumping into one of the above, but there is insufficient information to say which.
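
One way to find out which limit is involved is to measure the input. This is only a sketch: the directory /tmp/awk_limit_demo and the two sample rows are invented, and the "|" separator is assumed not to occur in the data so that awk itself does not split the records while measuring them. If a record is too long for your awk to read at all, this will fail in the same way, which is informative in its own right.

```shell
# Hypothetical demo directory and two made-up records.
demo=/tmp/awk_limit_demo
mkdir -p "$demo"
printf '01;a;b;c;\n03;x y z;q;\n' > "$demo/input.txt"

awk -F'|' '
{
    copy  = $0
    nsemi = gsub(/;/, ";", copy)          # gsub returns how many ";" it saw
    nword = gsub(/[^ \t]+/, "x", copy)    # runs of non-blanks = default-FS fields

    if (length($0) > maxlen) { maxlen  = length($0); lenline  = NR }
    if (nsemi > maxsemi)     { maxsemi = nsemi;      semiline = NR }
    if (nword > maxword)     { maxword = nword;      wordline = NR }
}
END {
    printf "longest record: %d chars (line %d)\n", maxlen, lenline
    printf "most semicolons: %d (line %d)\n", maxsemi, semiline
    printf "most blank-separated fields: %d (line %d)\n", maxword, wordline
}' "$demo/input.txt" > "$demo/report.txt"

cat "$demo/report.txt"
```

The third figure is the one that matters when awk is run without -F, because the default separator splits on blanks, not on the semicolons.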

As for "very old system which does not support gnu awk": I take this to mean "it's not installed and I can't find a pre-compiled depot". If you've got a halfway decent compiler, you can build it.

James R. Ferguson
Acclaimed Contributor

Re: AWK script for more than 200 fields

Hi:

Actually Dennis has given you the easiest solution! As he said, use a dummy field separator to stop 'awk' from splitting and tallying more than 199 fields in the process.

To demonstrate, make a file with too many fields for 'awk' to handle:

# perl -e 'for (1..500) {printf "f%s ",$_};print "\n"' > /tmp/toomany

# awk '{print $1;print substr($0,1,2)}' /tmp/toomany
awk: Line f1 f2 f3 f4 f5 f6 f7 cannot have more than 199 fields.

...now impose a dummy field separator as Dennis suggested:

# awk -F"\000" '{print $1;print substr($0,1,2)}' /tmp/toomany
f1

...which works...

Of course, the next limit you will bounce off is a line length > 3,000 characters :-)

Regards!

...JRF...
Michael Mike Reaser
Valued Contributor

Re: AWK script for more than 200 fields

I'm with OldSchool - what is the specific awk command you're issuing to get the above program to run, and what error messages are being output?

"...so the awk script fails" doesn't tell us HOW it failed, so to have any idea of how to make it Not-Fail we need to know how it's failing NOW.
There's no place like 127.0.0.1

HP-Server-Literate since 1979
Dennis Handly
Acclaimed Contributor

Re: AWK script for more than 200 fields

>OldSchool: can't be an accurate description of the problem. What error message you are getting?

Sure it can, it should be obvious what the message is and why. Though from the input, I don't see that many fields.

>Michael: what is the specific awk command you're issuing to get the above program to run, and what error messages are being output?
>"...so the awk script fails" doesn't tell us HOW it failed

Sure it does. And you can do the experiment that JRF proposed.