Operating System - HP-UX

SOLVED
Luke Morgan
Frequent Advisor

Awk scripting problem - columns

Hi,
I have a data file with a number of records, each with two fields.
One of the fields is an index number, and the other is a quantity value.
I need to put the data in a result file that has a large number of
columns (around 200).

By specifying NF, I can create the number of fields I want in my output
file, but I need to be able to specify that one of the input
values corresponds to a column number and insert its quantity value
in that column.

ie
data column quantity
1 4
4 2

would give a result file (using | delimiter)
4|||2|||

How can I specify the column number using a variable number?

thanks.

Luke
27 REPLIES
Steve Steel
Honored Contributor

Re: Awk scripting problem - columns

Hi

I am not sure that awk will do what you want but

http://sparky.rice.edu/~hartigan/awk.html

May help your search.


Try an array in a shell script: set every element to "|", then overwrite
the relevant elements with "number|".

see man ksh



Regards

Steve Steel

Quote of the moment
-------------------
"We are drowning in information but starved for knowledge."
-- John Naisbitt
If you want truly to understand something, try to change it. (Kurt Lewin)
Jean-Louis Phelix
Honored Contributor

Re: Awk scripting problem - columns

Hi,

This one sets the number of columns to the greatest index:

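#! /usr/bin/sh
# Usage: script1 infile
awk '
BEGIN {
    max = 0
}
{
    t[$1] = $2          # store the quantity under its column index
    if (max < $1)
        max = $1
}
END {
    for (i = 1 ; i <= max ; i++)
        if (i in t)     # membership test, so a quantity of 0 still prints
            printf("%s|", t[i])
        else
            printf("|")
    printf("\n")        # a bare print would emit the last input record in POSIX awk
}' "$1"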

and this one takes the number of columns as its first argument:

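#! /usr/bin/sh
# Usage: script2 ncolumns infile
awk -v NB="$1" '
{
    t[$1] = $2          # store the quantity under its column index
}
END {
    for (i = 1 ; i <= NB ; i++)
        if (i in t)     # membership test, so a quantity of 0 still prints
            printf("%s|", t[i])
        else
            printf("|")
    printf("\n")        # a bare print would emit the last input record in POSIX awk
}' "$2"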

if file contains :

1 4
4 8

> script1 infile
4|||8|
> script2 10 infile
4|||8|||||||

Regards
It works for me (© Bill McNAMARA ...)
Robin Wakefield
Honored Contributor

Re: Awk scripting problem - columns

Luke,

Not sure if this is what you want, but assuming your output is 20 columns wide:

awk '{a[$1]=$2}END{for (i=1;i<=20;i++){printf ("%s|",a[i])};print ""}' filename

Rgds, Robin
Christian Gebhardt
Honored Contributor

Re: Awk scripting problem - columns

Hi
what about this:

awk '{max=10;for (i=1;i<=max;i++)
{
if ($1==i)
{ printf("%s|",$2)}
else
{ printf ("|")}
if (i==max) printf("\n")
}}' file

file looks like that:

1 4
4 2
3 3
2 5

output looks like that:

4||||||||||
|||2|||||||
||3||||||||
|5|||||||||

Chris
Luke Morgan
Frequent Advisor

Re: Awk scripting problem - columns

Christian,

That is great, thanks.
What I really need is to get all of them on the same output line (they are unique index numbers).

ie
instead of
4|||||||
|||2||||
||3|||||
|5||||||

it reads
4|5|3|2||||||


Luke
Vincent Stedema
Esteemed Contributor

Re: Awk scripting problem - columns

Hi Luke,

Using awk:

sort -n file | awk 'BEGIN {ORS="|"} {print $2} END {printf "\n"}'

I can't guarantee that this version will work, as I don't have access to a unix/linux box at the moment....

Using perl:

perl -wle 'open(FILE, $ARGV[0]) || die qq(cannot open file); foreach (<FILE>) { ($num,$val) = ($1 - 1,$2) if /^(\d+)\s+(\d+)$/; $fields[$num] = $val; }
print join("|", @fields)' file

Regards,

Vincent
Vincent Stedema
Esteemed Contributor

Re: Awk scripting problem - columns

BTW, you might want to ditch the perl "-w" switch so it won't complain about the uninitialized array elements.
john korterman
Honored Contributor

Re: Awk scripting problem - columns

Hi Luke,

for this input file:
1 4
4 2
6 14
3 3
9 23
2 5

you can get this output:
|4|5|3|2||14|||23

by running the attached script with the input file as $1

hope it helps,
John K.

it would be nice if you always got a second chance
H.Merijn Brand (procura
Honored Contributor
Solution

Re: Awk scripting problem - columns

And a shorter and simpler version of Vincent's good solution (Vincent, use magic open)

# cat data_file
1 4
4 2
3 3
2 5
# perl -nle'/(\d+)\D+(\d+)/ and$x[$1-1]=$2}END{$"="|";print"@x"' data_file
4|5|3|2
#
Enjoy, Have FUN! H.Merijn
Christian Gebhardt
Honored Contributor

Re: Awk scripting problem - columns

Hi
just for info, the solution in awk:

sort -n file | awk '{zeile[NR]=$0;maxfield=$1}
END {
    run=1
    for (i=1;i<=maxfield;i++)
    {
        split(zeile[run],feld)
        if ( i == feld[1])
        {
            printf("%s|",feld[2])
            run++
        }
        else printf("|")
    }
    printf("\n")
}'

perl is much more elegant than this.

Chris
Luke Morgan
Frequent Advisor

Re: Awk scripting problem - columns

Thank you all very much for your suggestions.

The perl script from procura works perfectly, although I know
nothing about perl, so I hope I don't have to debug it
before I get a chance to read up!
:o)

Thanks again

Luke
H.Merijn Brand (procura
Honored Contributor

Re: Awk scripting problem - columns

can still save three keys on that :)

# perl -nle'/(\d+)\D+(\d+)/and$x[$1-1]=$2}END{$,="|";print@x' data_file
4|5|3|2
#
Enjoy, Have FUN! H.Merijn
H.Merijn Brand (procura
Honored Contributor

Re: Awk scripting problem - columns

using the features you should not, even shorter ...

# perl -nle'/\D+(\d+)/and$x[$`-1]=$1}END{$,="|";print@x' data_file
4|5|3|2
#

Note that we cannot do

# perl -nle'/\D+/and$x[$`-1]=$'}END{$,="|";print@x' data_file

because the ' in $' closes the shell's single-quoted string and so ends the -e expression
Enjoy, Have FUN! H.Merijn
Luke Morgan
Frequent Advisor

Re: Awk scripting problem - columns

Is it possible to use that perl script in a loop to do
a number of datafiles in one go?
And is it possible to maintain the sign of the numbers?
ie keep -4 in the data as -4

thanks

Luke
H.Merijn Brand (procura
Honored Contributor

Re: Awk scripting problem - columns

Sure, hold on ...

l1:/tmp 110 > perl -le'$,="|";for(@ARGV){open X,"<$_"or die"$_:$!";@x=();while(<X>){/(\d+)\D+?(-?\d+)/and$x[$1-1]=$2}print@x}' data_file1 data_file2
4|5|3|2
5|-12|-99|8000
l1:/tmp 111 > cat data_file1
1 4
4 2
3 3
2 5
l1:/tmp 112 > cat data_file2
3 -99
1 5
4 8000
2 -12
l1:/tmp 113 >

this what you want?
Enjoy, Have FUN! H.Merijn
Luke Morgan
Frequent Advisor

Re: Awk scripting problem - columns

I tried that on the command line and I got an error:

Modification of non-creatable array value attempted, subscript -1 at -e line 1, <X> line 1.

Unfortunately I don't know a thing about perl so I have no
idea what the problem is.

The looping I need to do is because I have a directory with
a number of files in of the same format. I need to take each
file and change its format in the way that the first perl script
you gave me did.

I was thinking a simple script along the lines of
for a in `ls `
do
perl
done

I tried this loop using the first script you gave (that works
perfectly on the command line) and put it into the loop and
got the same error as the looping perl script...


Luke
H.Merijn Brand (procura
Honored Contributor

Re: Awk scripting problem - columns

l1:/tmp 102 > perl5.00503 -le'$,="|";for(@ARGV){open X,"<$_"or die"$_:$!";@x=();while(<X>){/(\d+)\D+?(-?\d+)/and$x[$1-1]=$2}print@x}' data_file1 data_file2
4|5|3|2
5|-12|-99|8000
l1:/tmp 103 > perl5.6.1 -le'$,="|";for(@ARGV){open X,"<$_"or die"$_:$!";@x=();while(<X>){/(\d+)\D+?(-?\d+)/and$x[$1-1]=$2}print@x}' data_file1 data_file2
4|5|3|2
5|-12|-99|8000
l1:/tmp 104 > perl5.8.0 -le'$,="|";for(@ARGV){open X,"<$_"or die"$_:$!";@x=();while(<X>){/(\d+)\D+?(-?\d+)/and$x[$1-1]=$2}print@x}' data_file1 data_file2
4|5|3|2
5|-12|-99|8000
l1:/tmp 105 > perl5.9.0 -le'$,="|";for(@ARGV){open X,"<$_"or die"$_:$!";@x=();while(<X>){/(\d+)\D+?(-?\d+)/and$x[$1-1]=$2}print@x}' data_file1 data_file2
4|5|3|2
5|-12|-99|8000
l1:/tmp 106 >

For a change I really tested the perl examples I posted. What is your perl version? (perl -v)

Show me the failing script.
Enjoy, Have FUN! H.Merijn
Luke Morgan
Frequent Advisor

Re: Awk scripting problem - columns

Perl version is 5.6.0 for i386-linux.

The simple script I'm trying to use to run your perl
against all the files in the directory is this:

for a in `ls`
do
perl -le'$,="|";for(@ARGV){open X,"<$_"or die"$_:$!";@x=();while(<X>){/(\d+)\D+?(-?\d+)/and$x[$1-1]=$2}print@x}' $a > $a.res
done

Also, when I run the perl script (it's the version 5.6.1 copy from your last post) on the command line, I still get the error about modification of a non-creatable array.

Luke
H.Merijn Brand (procura
Honored Contributor

Re: Awk scripting problem - columns

as I showed that 5.005_03 is good enough, I guess that 5.6.0 is good enough too. Any chance to upgrade to 5.6.1 or 5.8.0? Both are much better than 5.6.0, but that aside.

In the script you show, you /mix/ the all-in-one solution and the one-by-one solution. To play safe, and readable, do:

for a in `ls` ; do
perl -nle'/(\d+)\D+(\d+)/and$x[$1-1]=$2}END{$,="|";print@x' $a >$a.res
done

Show me the exact error.

Be careful with my posts at the moment. I'm using the latest Opera 7 beta, which removes even more whitespace than the forum already does :/

Enjoy, Have FUN! H.Merijn
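
A side note on the loop itself: `for a in \`ls\`` also picks up the .res files written by an earlier run, and it splits file names that contain spaces. Below is a minimal sketch of a more defensive loop. The name convert_all is made up for illustration and is not from this thread; the converter command is passed in as arguments, so any of the one-liners above can be plugged in.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): run a converter command on every
# regular file in a directory and write its output to <file>.res.
convert_all() {
    dir=$1
    shift                                   # remaining arguments are the converter command
    for f in "$dir"/*; do
        [ -f "$f" ] || continue             # skip directories and an unmatched glob
        case $f in *.res) continue ;; esac  # skip results left by an earlier run
        "$@" "$f" > "$f.res"
    done
}

# Example call, plugging in the one-liner from above:
# convert_all . perl -nle '/(\d+)\D+(\d+)/and$x[$1-1]=$2}END{$,="|";print@x'
```

The glob is expanded once before the loop starts, so the .res files created during a run are not processed in that same run.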
Luke Morgan
Frequent Advisor

Re: Awk scripting problem - columns

The exact error is:

Modification of non-creatable array value attempted, subscript -1 at -e line 1, <> line 1.

With regards to the whitespace, I am only putting a space at the beginning and
the end, between perl -nle, and print@x' $a.
Should there be more?

Luke
H.Merijn Brand (procura
Honored Contributor

Re: Awk scripting problem - columns

Ahh, subscript -1 means the first line of your data has an index of 0, so $1-1 is -1. Index -1 is the last element of the array, which in this case does not exist (yet), because it is the first line processed. Let's catch it ...

for a in `ls` ; do
echo processing $a ...
perl -nle'/(\d+)\D+(\d+)/&&$1 and$x[$1-1]=$2}END{$,="|";print@x' $a >$a.res
done

If you would rather skip the bad lines, do

for a in `ls` ; do
echo processing $a ...
perl -nle'/(\d+)\D+(\d+)/&&$1 or next;$x[$1-1]=$2}END{$,="|";print@x' $a >$a.res
done

If you would rather stop at the first bad line of a file (END still runs on exit, so the values collected so far are printed), do

for a in `ls` ; do
echo processing $a ...
perl -nle'/(\d+)\D+(\d+)/&&$1 or exit;$x[$1-1]=$2}END{$,="|";print@x' $a >$a.res
done
Enjoy, Have FUN! H.Merijn
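
To find the lines that trigger the subscript -1 error before running perl, something like the following may help. It is only a sketch: bad_index_lines is a made-up name, and it assumes whitespace-separated fields with the column index in the first field.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): report every non-empty line whose
# first field is not a positive integer, as file:line: text.
bad_index_lines() {
    awk 'NF && ($1 !~ /^[0-9]+$/ || $1 + 0 < 1) {
        printf "%s:%d: %s\n", FILENAME, FNR, $0
    }' "$@"
}

# Example call: bad_index_lines data_file1 data_file2
```

An index of 0 is reported because $1-1 would then be -1; header lines and other non-numeric first fields are reported as well, since they cannot name a column.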
Luke Morgan
Frequent Advisor

Re: Awk scripting problem - columns

Excellent.
Thank you all very much for your help and suggestions.
Special thanks to procura.

Luke
Luke Morgan
Frequent Advisor

Re: Awk scripting problem - columns

Having run the script now for a little while,
a small bug has appeared. It is not maintaining the sign
of the numbers involved.
Any numbers, negative or positive, are made positive
by the perl script.
Is it possible to change the
script so that it maintains
the sign of the data?

Thanks

Luke
H.Merijn Brand (procura
Honored Contributor

Re: Awk scripting problem - columns

Changing only the first example, that would be:

for a in `ls` ; do
echo processing $a ...
perl -nle'/(\d+)\D+?(-?\d+)/&&$1 and$x[$1-1]=$2}END{$,="|";print@x' $a >$a.res
done

given that columns cannot be negative. The \D+? has to be non-greedy, otherwise the separator match swallows the minus sign.

Note: in perl columns *can* be negative, in which case they count from the tail: $list[-1] is the last element of @list.
Enjoy, Have FUN! H.Merijn
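
For anyone who wants to stay with awk, as the original question asked, here is a rough equivalent that keeps the sign and also keeps a quantity of 0. It is a sketch under assumptions: pivot is a made-up name, the fields are whitespace-separated, and the first argument is a minimum column count (0 means "up to the greatest index"), which is my own addition to cover the fixed-width case from the first post.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): pivot "index value" lines into a
# single |-separated row. Usage: pivot min_columns file...
pivot() {
    ncols=$1
    shift
    awk -v max="$ncols" '
    $1 ~ /^[0-9]+$/ && $1 + 0 >= 1 {    # ignore lines without a positive index
        idx = $1 + 0
        val[idx] = $2                   # kept as text, so -4 stays -4
        if (idx > max) max = idx
    }
    END {
        for (i = 1; i <= max; i++) {
            if (i > 1) printf "|"
            if (i in val) printf "%s", val[i]
        }
        printf "\n"
    }' "$@"
}
```

With data_file2 from earlier in the thread, `pivot 0 data_file2` prints 5|-12|-99|8000. Like the perl versions, it separates the values with | and does not add a trailing one.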