Data formatting

Victor Pavon · ‎02-16-2004

Hello:

I was given the task of formatting data blocks from our legacy system into a flat ASCII format. I created a simple script that worked great until the programmers started sending huge data records beyond the old system's maximum of 756 characters. We agreed to insert an underscore (_) at position 756 to indicate that the next line is a continuation of the same record. So my idea was to search the data for an underscore followed by a line feed (_\012) and delete the pair, thereby concatenating the next line. This was not as simple as I had planned. See the model script below:
...
# Split data stream into 756-byte chunks and delete the spaces that follow each ';'
fold -b -w756 $1 | sed 's/;[ ]*/;/g' > $1.foo
# replace '_' with 'u' and '\012' with 'l'
sed 's/_/u/g' $1.foo | tr '\012' 'l' > $1.mid
# add an end of record for sed to see bottom of file
echo '' >> $1.mid
# Delete 'ul' (concatenate), restore the linefeeds and put the _ back where they should be
sed 's/ul//g' $1.mid | tr 'l' '\012' | tr 'u' '_' > $1.dat
...

The problem with this is that I am doing I/O like a madman. The script writes three files to produce one usable output file. Also, the input data may contain l's and u's of its own, causing unexpected results.
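To make the second problem concrete, here is a minimal sketch of the same substitution chain, run as one pipeline on made-up input instead of temp files. The function name mangle and the sample text are illustrative only.

```shell
# Same substitutions as the model script, chained in one pipeline.
# "mangle" is a hypothetical name; the sample input is invented.
mangle() {
    # '_' -> 'u', linefeed -> 'l', then a final linefeed so sed sees a full line
    { sed 's/_/u/g' | tr '\012' 'l'; echo; } |
        # delete 'ul', then turn every 'l' and 'u' back into linefeed and '_'
        sed 's/ul//g' | tr 'l' '\012' | tr 'u' '_'
}

# One record split over two lines; the data itself contains an 'l' and a 'u'.
printf 'blue_\nsky\n' | mangle
```

The intended result is the single record bluesky. Instead, the l and u inside "blue" are treated as a line feed and an underscore, so the output is three lines: b, _esky and an empty line.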
This is a question for all of you sed, awk and perl masters.
Is there a way to get a flat ASCII file with the fewest intermediate steps? Maybe a one-liner (or two)?

Appreciate any ideas. Included is a tiny sample input file.

curt larson_1 · ‎02-16-2004

maybe this will work for you

fold -b -w756 "$1" |
while IFS= read -r line
do
    # everything but the last char
    x=${line%?}
    lastChar=${line#"$x"}
    if [[ $lastChar = "_" ]]; then
        # print the line without the ending underscore
        # and without a newline
        print -nr -- "$x"
    else
        print -r -- "$line"
    fi
done > "$1.dat"

curt larson_1 · ‎02-16-2004

same thing using awk

fold -b -w756 "$1" |
awk '{
    x = length($0);
    lastChar = substr($0, x, 1);
    b = substr($0, 1, x - 1);
    if ( lastChar == "_" )
        printf("%s", b);
    else
        printf("%s\n", $0);
}'
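Since the question also asked about sed, the same join can be sketched there as well. This is only a sketch, not something posted in the thread: the function name join_continued and the sample input are made up, and it assumes that any line ending in an underscore is a continuation.

```shell
# Join every line that ends in "_" with the line that follows it.
# "join_continued" is a hypothetical helper name.
join_continued() {
    # :a       loop label
    # /_$/     only act when the pattern space ends in an underscore
    # $!N      append the next input line (skipped on the last line)
    # s/_\n//  drop the underscore + linefeed pair
    # ta       if something was dropped, re-check the joined line
    sed -e ':a' -e '/_$/{' -e '$!N' -e 's/_\n//' -e 'ta' -e '}'
}

# Made-up sample: the first three lines form one record.
printf 'AAA_\nBBB_\nCCC\nDDD\n' | join_continued
```

With real data it would slot in after the fold, for example: fold -b -w756 "$1" | join_continued > "$1.dat". One difference from the awk version: if the very last line of the input ends in an underscore, this sketch leaves that underscore in place, because there is no following line to join.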

Victor Pavon · ‎02-18-2004

Thank you Curt, both solutions are winners. My personal preference is for awk, but as the great master said: "There is more than one way to skin a cat."
Thanks again.
Victor
