
Re: Unix Scripting

 
SOLVED
Oktay Tasdemir
Advisor

Unix Scripting

Hi,
I have the following Unix file, which I would like to reformat into a CSV file as per the attachment.

Any ideas would be appreciated.

Password:
cpi_quarter cpi face_currency
-------------------------- -------------------- -------------
Dec 1 2002 12:00AM 139.500000 aud
Mar 1 2003 12:00AM 141.300000 aud
Jun 1 2003 12:00AM 141.300000 aud
Sep 1 2003 12:00AM 142.100000 aud

(return status = 0)


Thanks
Oktay
Let the fun and games begin
4 REPLIES
Hein van den Heuvel
Honored Contributor

Re: Unix Scripting

SMOP:

perl -n to-csv.pl < raw-data.dat > data.csv

where to-csv.pl is:

print "$1,$2,$3\n" if (/^(\w+)\s+(\w+)\s+(\w+)$/) ;
if (/^(\w{3})\s+(\d+)\s+20(\d\d).*\s(\d+\.\d+)\s+(\w+)/) {
printf ("%s-%s-%s,%.1f, %s\n",$2,$1,$3,0+$4,$5);
}


Ugly reg-expr, but not too hard.

First: find a line starting with (^) a word, another word, and a final ($) word, remembering each word ().

Next, find lines starting with (^) a 3-char word (remembered in $1), whitespace, one or more digits (the day, remembered in $2), whitespace, and 20 followed immediately by two digits (the year, remembered in $3)
and so on...

Hein.
curt larson_1
Honored Contributor

Re: Unix Scripting

cat yourfile |
awk '
/^cp/ {print;next;}
/aud/ {printf("%s-%s-%s,%.1f, %s\n",$2,$1,substr($3,3),$5,$6);}'
Oktay Tasdemir
Advisor

Re: Unix Scripting

Thanks for the replies,

I have gone with Curt's solution as it looked somewhat easier.
The output that was produced was

1-Dec-02,139.5, aud
1-Mar-03,141.3, aud
1-Jun-03,141.3, aud
1-Sep-03,142.1, aud

I also require the headings as per below

cpi_quarter,cpi,face_currency
1-Dec-02,139.5, aud
1-Mar-03,141.3, aud
1-Jun-03,141.3, aud
1-Sep-03,142.1, aud

Would you be able to explain the awk script as then I might be able to figure it out.

Many thanks
Let the fun and games begin
Hein van den Heuvel
Honored Contributor
Solution

Re: Unix Scripting

My reply was a bit overly protective.

Curt's is a little underprotective because it does not handle a day of the month other than 1-9 and only handles one currency (aud).
It also fails to comma-separate the header.
Easily fixed:

awk 'BEGIN {OFS=","}/^cp/{print $1,$2,$3}
/[AP]M /{printf("%s-%s-%s,%.1f, %s\n",$2,$1,substr($3,3),$5,$6);}' < yourfile
cpi_quarter,cpi,face_currency
1-Dec-02,139.5, aud
1-Mar-03,141.3, aud
1-Jun-03,141.3, aud
1-Sep-03,142.1, aud
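
If you want to convince yourself that the Password:, dashes and return status lines are dropped, a quick sketch like the one below feeds the one-liner a here-document instead of a file. The wrapper name cpi_to_csv is made up for this example, and the sample is a shortened copy of the data in the question.

```shell
# cpi_to_csv: hypothetical wrapper around the awk one-liner above; reads stdin
cpi_to_csv() {
    awk 'BEGIN { OFS = "," }
         /^cp/    { print $1, $2, $3 }    # header line: rejoin the three names with commas
         /[AP]M / { printf("%s-%s-%s,%.1f, %s\n", $2, $1, substr($3, 3), $5, $6) }'
}

cpi_to_csv <<'EOF'
Password:
cpi_quarter cpi face_currency
-------------------------- -------------------- -------------
Dec 1 2002 12:00AM 139.500000 aud
Mar 1 2003 12:00AM 141.300000 aud

(return status = 0)
EOF
```

Only lines matching one of the two patterns produce output, so everything else in the raw file is silently skipped.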

Try 'man awk'
BEGIN = do once, in the beginning
OFS = output field separator
/^cp/ find a line starting (^) with cp
$1, $2,... input fields separated by whitespace.
/[AP]M / find a line with "AM " or "PM " anywhere.
substr ... just pick everything from the 3rd char onward in the 3rd field.
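
To make the field numbers concrete, here is a small sketch that prints every field of one data line with its number, and then the substr result for the year. The helper name show_fields is made up for illustration; the sample line is from the question.

```shell
# show_fields: hypothetical helper that prints each whitespace-separated field with its number
show_fields() {
    awk '{ for (i = 1; i <= NF; i++) printf("$%d=%s\n", i, $i) }'
}

echo "Dec 1 2002 12:00AM 139.500000 aud" | show_fields
# $1=Dec
# $2=1
# $3=2002
# $4=12:00AM
# $5=139.500000
# $6=aud

# substr($3, 3) keeps the 3rd field from its 3rd character onward
echo "Dec 1 2002 12:00AM 139.500000 aud" | awk '{ print substr($3, 3) }'
# 02
```

The time is field 4, which is why the printf above uses $5 for the value and $6 for the currency.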

For grins a more readable perl version:

x.pl:
print join (",",split)."\n" if /^cp/;
if (/[AP]M /) {
($mo,$da,$yr,$ti,$x,$cu)=split;
$yr =~ s/^20//;
printf ("%s-%s-%s,%.1f,%s\n",$da,$mo,$yr,$x,$cu);
}

perl -n x.pl < yourdata

Hein.