
sed, ed, (hurts my head)

 
Ron Bombard
Frequent Advisor

sed, ed, (hurts my head)

Greetings!

I have a large text file and was wondering what the easiest way would be to replace "whitespace" between fields (words) with "tabs". Is this possible? Each field in the file starts at a specific character position, but I was hoping there was an easy way of just replacing whitespace with a tab.

Ideas? Suggestions? Flames?
Thanks.
Meddle not in the affairs of dragons... for you are crunchy and taste like chicken.
14 REPLIES
Jean-Luc Oudart
Honored Contributor

Re: sed, ed, (hurts my head)

Well not sed but tr may do the trick :
cat | tr " " "\t" >

Et voilà
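
As a rough illustration of what the tr approach does (the sample record and the function name `spaces_to_tabs` are made up for this sketch, not taken from Ron's file): tr translates character for character, so every single space becomes its own tab, and a two-space gap becomes two tabs.

```shell
#!/bin/sh
# Hypothetical sample record; the fields are padded with spaces.
sample='0915  0012 A'

# tr maps each space to one tab, one character at a time,
# so the two-space gap between the first two fields becomes two tabs.
spaces_to_tabs() {
    tr ' ' '\t'
}

# od -c makes the tabs visible as \t in the dump.
printf '%s\n' "$sample" | spaces_to_tabs | od -c
```

That one-for-one behaviour is worth keeping in mind when the fields are padded to fixed widths.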

Jean-Luc
fiat lux
Pete Randall
Outstanding Contributor

Re: sed, ed, (hurts my head)

Ron,

Here's a couple of handy references to sed and awk commands, originally (to me, at least) courtesy of Princess Paula.

Pete

Pete
Pete Randall
Outstanding Contributor

Re: sed, ed, (hurts my head)

And here's the awk one.

Pete

Pete
Thierry Poels_1
Honored Contributor

Re: sed, ed, (hurts my head)

Hi,

how about "unexpand" ???
see "man unexpand"

good luck,
Thierry.
All unix flavours are exactly the same . . . . . . . . . . for end users anyway.
Ron Bombard
Frequent Advisor

Re: sed, ed, (hurts my head)

I think I should have thought this through before posting... The "unexpand" program works well, and the "tr" suggestion also works. But.... all it takes is a blank field to screw it up. If it runs into a blank field, instead of 2 tabs, there will be one.

I'm trying to load a text file into MySQL, and it has to be tab delimited. And the original text file's fields are delimited by ASCII character position.

Methinks this is going to be complicated ;)
Meddle not in the affairs of dragons... for you are crunchy and taste like chicken.
Hai Nguyen_1
Honored Contributor

Re: sed, ed, (hurts my head)

Ron,

Give us one line as it looks before and after the formatting. Then we may be able to help you better.

Hai
Tom Maloy
Respected Contributor

Re: sed, ed, (hurts my head)

If I understand you correctly, you want to replace each single blank with one tab character.

cat file | sed -e "s/ /\t/g"

Note that \t should be replaced by a literal tab character.

The "g" does the global replace.
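
One way to get a literal tab into the sed command without typing it by hand is to let printf produce it. This is only a sketch; the variable `TAB` and the function `space_to_tab` are names chosen here for illustration.

```shell
#!/bin/sh
# Store a literal tab character in a variable; printf '\t' is portable.
TAB=$(printf '\t')

# Replace every single space with one tab, as in the sed command above.
space_to_tab() {
    sed "s/ /${TAB}/g"
}

printf 'a b c\n' | space_to_tab
```

Some sed versions also accept \t directly in the replacement, but that is not something to rely on everywhere.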

Tom
Carpe diem!
Ron Bombard
Frequent Advisor

Re: sed, ed, (hurts my head)

Ok, here's the before file. I have 2 lines in it to show what happens when a field is missing. The first line is a complete line.
Meddle not in the affairs of dragons... for you are crunchy and taste like chicken.
Ron Bombard
Frequent Advisor

Re: sed, ed, (hurts my head)

Ok, here's the after unexpand file. I have 2 lines in it to show what happens when a field is missing. The first line is a complete line.
Meddle not in the affairs of dragons... for you are crunchy and taste like chicken.
Ron Bombard
Frequent Advisor

Re: sed, ed, (hurts my head)

Problem with "unexpand" is... there have to be more than 2 spaces together to convert to a tab. Some of my fields are only 2 spaces apart. So, I ran the "unexpand-ed" file through a "tr" filter and here's what I got.
Meddle not in the affairs of dragons... for you are crunchy and taste like chicken.
Vincent Fleming
Honored Contributor

Re: sed, ed, (hurts my head)

I think it might be easier for you with "cut", which is good at getting things at character positions...

while read LINE
do
firstfield=`echo $LINE | cut -c 2-10`
secondfield=`echo $LINE | cut -c 11-20`

printf( "%s\t%s\n", $firstfield, $secondfield)

done < inputfile


Make a "firstfield" definition for each field in the record.

Personally, I think this would probably be easier to write in C (certainly much faster to execute), but I'm guessing you don't know C.
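
A possible middle ground between the shell loop and a C program is awk, which reads the whole file in one process instead of starting a cut for every field of every line. A minimal sketch, assuming the same two column ranges as the loop above (2-10 and 11-20); `fixed_to_tab` is just a name picked for illustration.

```shell
#!/bin/sh
# Slice two fixed-width fields out of each line and join them with a tab.
# substr(s, start, length) counts from 1, so columns 2-10 are
# start 2, length 9, and columns 11-20 are start 11, length 10.
fixed_to_tab() {
    awk '{ printf "%s\t%s\n", substr($0, 2, 9), substr($0, 11, 10) }'
}
```

It would be used as `fixed_to_tab < inputfile`. A blank field comes out as a run of spaces between two tabs, so the number of tabs per line stays the same; the padding itself is not trimmed.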

Good luck!
No matter where you go, there you are.
Ron Bombard
Frequent Advisor

Re: sed, ed, (hurts my head)

Hey thanks for the "cut" idea. I figured I have to do something like that. Now... the "printf", I've never played with.

Here's my script:
=======================

while read LINE
do
time=`echo $LINE | cut -c 3-6`
duration=`echo $LINE | cut -c 8-11`
conncode=`echo $LINE | cut -c 12-13`
acccode=`echo $LINE | cut -c 15-17`
acccode2=`echo $LINE | cut -c 19-21`
dialed=`echo $LINE | cut -c 23-37`
originate=`echo $LINE | cut -c 39-43`

printf( "%s\t%s\n", $time, $duration, $conncode, $acccode, $acccode2, $dialed, $originate)

done < inputfile
=========================

I piped the text file through this:

cat txtfile | converterprg

I get this error:

./converter: line 11: syntax error near unexpected token `"%s\t%s\n",'
./converter: line 11: `printf( "%s\t%s\n", $time, $duration, $conncode, $acccode, $acccode2, $dialed, $originate)'
I'm not sure what this means...
Thanks!
Meddle not in the affairs of dragons... for you are crunchy and taste like chicken.
Chris Lonergan
Advisor

Re: sed, ed, (hurts my head)

Use sed or ed; it doesn't matter. They both have tagged regular expressions and can split a line thus:

s/\(...\) *\(....\) *\(.....\)/\1\t\2\t\3\t/

This will take three fields of 3, 4 and 5 characters respectively, delimited by any number of spaces (including none), and output the same three fields separated by a 'tab' character.

i.e. \(...\) takes the first 3 characters and \1 places these in the output.

Change the ... to denote the characters that you require (for many characters, use the notation .\{m\}, where m is the number of characters). For whitespace, use [space tab]* (a bracket holding a literal space and a literal tab) instead of space*. Both of those match zero or more characters, whereas spacespace* must match at least one space.
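
Here is a small sketch of the same substitution written with the interval notation. It differs from the expression above in three ways that are my own choices: the pattern is anchored with ^, the tab comes from a variable because not every sed understands \t, and no tab is added after the last field. `TAB` and `split_fields` are illustrative names.

```shell
#!/bin/sh
# Literal tab for the replacement text.
TAB=$(printf '\t')

# Three fields of 3, 4 and 5 characters, with optional spaces between them.
# .\{4\} is the basic-regex way of writing "...." (exactly four characters).
split_fields() {
    sed "s/^\(.\{3\}\) *\(.\{4\}\) *\(.\{5\}\)/\1${TAB}\2${TAB}\3/"
}

printf 'abc defg  hijkl\n' | split_fields
```

Because the spaces between fields are optional and greedy, a record with a blank field may not line up the way a strictly positional cut would, so this is best tried on a few real lines first.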
Vincent Fleming
Honored Contributor

Re: sed, ed, (hurts my head)

Sorry - my syntax for the printf was incorrect. It's been a while. I gave the syntax for doing that in awk.

Anyway, with these variables you defined:

time=`echo $LINE | cut -c 3-6`
duration=`echo $LINE | cut -c 8-11`
conncode=`echo $LINE | cut -c 12-13`
acccode=`echo $LINE | cut -c 15-17`
acccode2=`echo $LINE | cut -c 19-21`
dialed=`echo $LINE | cut -c 23-37`
originate=`echo $LINE | cut -c 39-43`

You should use this syntax (in ksh):

printf "%s\t%s\t%s\t%s\t%s\t%s\t%s\n" "$time" "$duration" "$conncode" "$acccode" "$acccode2" "$dialed" "$originate"
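
Putting the pieces together, a complete converter might look like the sketch below. Two details matter for fixed-width data: an unquoted `echo $LINE` lets the shell squeeze runs of blanks into one, which shifts every column, and a plain `read` strips leading blanks. The sketch quotes the variable and uses `IFS= read -r` to avoid both. The column ranges are the ones from Ron's script; the helper names `field` and `convert` are made up here.

```shell
#!/bin/sh
# Convert fixed-width records on stdin to tab-separated records on stdout.

# Print one column range of a record.
# $1 = the whole record, $2 = a range for cut -c (for example 3-6).
field() {
    printf '%s\n' "$1" | cut -c "$2"
}

convert() {
    # IFS= keeps leading blanks; -r keeps backslashes as they are.
    while IFS= read -r LINE
    do
        # Quoting "$LINE" preserves the spacing, so columns stay in place.
        printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
            "$(field "$LINE" 3-6)" \
            "$(field "$LINE" 8-11)" \
            "$(field "$LINE" 12-13)" \
            "$(field "$LINE" 15-17)" \
            "$(field "$LINE" 19-21)" \
            "$(field "$LINE" 23-37)" \
            "$(field "$LINE" 39-43)"
    done
}
```

With these definitions in a script, adding a final line `convert < inputfile > outputfile` would do the conversion. Every output line carries exactly six tabs, even when a field is blank. The padding spaces inside each field are left as they are, so they may still need trimming before or after the MySQL load.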

Good luck!
No matter where you go, there you are.