Operating System - HP-UX

Re: char counting script help...

 
SOLVED
Manuel Contreras
Regular Advisor

char counting script help...


I have a file, testFILE, with about 100 lines.

How can I get the number of characters or words for each line in the file?

Something like wc, but I am open to awk, sed, etc.

number of words:
$ head -1 testFILE | tail -1 | wc -w
5

number of characters:
$ head -1 testFILE | tail -1 | wc -c
90
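
The head | tail pipeline above can be generalized to any single line. A minimal sketch, assuming a POSIX sed and wc; nth_line_counts is a hypothetical helper name, not something from this thread:

```shell
# nth_line_counts FILE N
# Prints "words bytes" for line N of FILE.
# wc counts the trailing newline as one byte, just as in the examples above.
nth_line_counts() {
    # sed -n "Np" prints only line N; awk normalizes the padding wc may add
    sed -n "${2}p" "$1" | wc -wc | awk '{ print $1, $2 }'
}
```

For example, nth_line_counts testFILE 1 would report the same two numbers as the two commands above, on one line.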

any assistance is greatly appreciated,
manuel contreras
8 REPLIES
Mel Burslan
Honored Contributor
Solution

Re: char counting script help...

cat file | while read line
do
chars=`echo $line | wc -c`
words=`echo $line | wc -w`
printf $chars:$words:; echo $line
done

This counts the characters (including the newline that echo appends) and the words on each line, and prints them like this:

41:8:quick brown fox jumped over the lazy dog
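
One caveat with the loop above: because $line is unquoted, the shell collapses runs of blanks before echo sees them, so lines with multiple spaces report fewer characters than they really contain. A possible variant that keeps the whitespace intact is sketched below; count_lines is a hypothetical name, and the sketch assumes a POSIX shell:

```shell
# count_lines FILE
# Prints "chars:words:line" for every line of FILE, whitespace preserved.
count_lines() {
    # IFS= keeps leading/trailing blanks; -r keeps backslashes literal
    while IFS= read -r line
    do
        # printf '%s\n' emits the line unchanged plus one newline, like echo
        chars=$(printf '%s\n' "$line" | wc -c)
        words=$(printf '%s\n' "$line" | wc -w)
        # $(( ... )) strips any padding wc puts around the number
        printf '%s:%s:%s\n' "$(($chars))" "$(($words))" "$line"
    done < "$1"
}
```

Called as count_lines testFILE, it should print the same format as the example line above, with the character count still including the newline.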


hope this is what you are looking for
________________________________
UNIX because I majored in cryptology...
curt larson_1
Honored Contributor

Re: char counting script help...

awk '{
printf("%s:%s %s\n",length($0)+1,NF,$0);
}' yourfile
Hein van den Heuvel
Honored Contributor

Re: char counting script help...



perl -pe 'print split.":".length.":"' /etc/hosts

awk '{print NF ":" length ":" $0}' /etc/hosts
H.Merijn Brand (procura
Honored Contributor

Re: char counting script help...

Hein, that perl snippet is a nice thought, but not OK. You use split in list context, so it returns all characters in the line.

What counts as a `word' is debatable. Where do you want to split? Is "user:xhdffw2:203:200" 1 word, 7 words, or 4?

If it is 7, use

# perl -nle 'print"$.:",scalar(split/\b/),":".length' /etc/hosts

which will split on word boundaries and print the line number, the number of `words', and the number of characters (including white space) for each line.

If `words' are split on white space, as in text files ("which," being ONE word, including the comma), you could use

# perl -nle 'print"$.:",scalar(split/\s+/),":".length' /etc/hosts

If you don't want to count the white space in the character count,

# perl -nle '$"="";@x=split/\s+/;print"$.:",scalar@x,":".length"@x"' /etc/hosts

As you can see, the possibilities are endless. It all depends on /your/ definition of the truth ...
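
The same point about word definitions can be illustrated with awk, where the field separator decides what a word is. This is only a sketch; count_fields is a hypothetical helper, and the separator passed to it is assumed to be a single character:

```shell
# count_fields [SEP]
# Reads lines on stdin and prints the number of fields in each one.
# With no SEP, awk's default whitespace splitting is used.
count_fields() {
    if [ -n "$1" ]; then
        awk -F"$1" '{ print NF }'
    else
        awk '{ print NF }'
    fi
}

line='user:xhdffw2:203:200'
printf '%s\n' "$line" | count_fields      # split on white space: 1
printf '%s\n' "$line" | count_fields :    # split on colons: 4
```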

Enjoy, Have FUN! H.Merijn
H.Merijn Brand (procura
Honored Contributor

Re: char counting script help...

Sorry, Hein, I misread the "." for a "," (/me blames too small font)

No points for this one please.

Enjoy, Have FUN! H.Merijn
Muthukumar_5
Honored Contributor

Re: char counting script help...

Hai,

Do you want a command like wc? We can do it with awk like this:

cat filename | awk '{ print "line: " NR " words: " NR " chars: "length($0) }'

It will print line number, number of words and length of the line.

Regards,
Muthukumar.
Easy to suggest when don't know about the problem!
Manuel Contreras
Regular Advisor

Re: char counting script help...

this may be a dumb question, but why do some of the commands recognize/display different
number of chars in a line (there is no space after the last char)?

(this command does not count blank spaces)
$ cat man99 | while read line
> do
> chars=`echo $line | wc -c`
> words=`echo $line | wc -w`
> printf $chars:$words:; echo $line
> done | head -5
90:1:12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
66:5:A20040601FWL1085 000000000.095940 000306275.000000FG1085 A0406 EA
66:5:A20040601FWS1085 000000000.077005 000308043.000000FG1085 A0406 EA
67:5:A20040601FWV05600 000000000.008854 000308043.000000FG1085 A0406 EA
67:5:A20040601FWV07019 000000000.679378 000004611.172617FG1085 A0406 LB

(this command seems to count 90 char. on first line)
$ awk '{
> printf("%s:%s %s\n",length($0)+1,NF,$0);
> }' man99 | head -5
90:1 12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
90:5 A20040601FWL1085 000000000.095940 000306275.000000FG1085 A0406 EA
90:5 A20040601FWS1085 000000000.077005 000308043.000000FG1085 A0406 EA
90:5 A20040601FWV05600 000000000.008854 000308043.000000FG1085 A0406 EA
90:5 A20040601FWV07019 000000000.679378 000004611.172617FG1085 A0406 LB

(this one also recognizes 90 char. on first line)
$ perl -pe 'print split.":".length,":"' man99 | head -5
1:90:12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
5:90:A20040601FWL1085 000000000.095940 000306275.000000FG1085 A0406 EA
5:90:A20040601FWS1085 000000000.077005 000308043.000000FG1085 A0406 EA
5:90:A20040601FWV05600 000000000.008854 000308043.000000FG1085 A0406 EA
5:90:A20040601FWV07019 000000000.679378 000004611.172617FG1085 A0406 LB

(this one works correctly)
$ awk '{print NF ":" length ":" $0}' man99 | head -5
1:89:12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
5:89:A20040601FWL1085 000000000.095940 000306275.000000FG1085 A0406 EA
5:89:A20040601FWS1085 000000000.077005 000308043.000000FG1085 A0406 EA
5:89:A20040601FWV05600 000000000.008854 000308043.000000FG1085 A0406 EA
5:89:A20040601FWV07019 000000000.679378 000004611.172617FG1085 A0406 LB

(as well as this one)
$ cat man99 | awk '{ print "line: " NR " words: " NR " chars: "length($0) }' | head -5
line: 1 words: 1 chars: 89
line: 2 words: 2 chars: 89
line: 3 words: 3 chars: 89
line: 4 words: 4 chars: 89
line: 5 words: 5 chars: 89





thank you all...I appreciate the education,

manuel contreras
Muthukumar_5
Honored Contributor

Re: char counting script help...

Hai,

Use the format of
cat filename | awk '{ print "line: " NR " words: " NF " chars: "length($0) }'

That is, use NF instead of NR for the word count.

You will see a difference between wc and awk when counting characters.

echo "test" | wc -c gives 5 as the character count.

You can inspect the bytes of a string in more detail with the od command, for example

echo "hai" | od -dc
0000000 24936 2665
h a i \n
0000004

which shows four bytes: the three letters plus the trailing \n.

awk splits its input into records at newline characters (the record separator, RS) and splits each record into fields at spaces and tabs (the field separator, FS).

If you do,

echo "test" | awk '{ print length($0) }'
it will give only 4 for the string length.

The \n is not counted, because awk uses it to separate records and strips it from $0.
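
The differences seen in the man99 output can be reproduced side by side. The sketch below uses three hypothetical helper functions (none are from this thread) and assumes a POSIX shell. The 66 reported by the shell loop earlier in the thread is consistent with the third effect: five words joined by single spaces is 65 characters, plus the newline.

```shell
# Bytes as wc sees them: the line plus its trailing newline.
bytes_with_newline() {
    printf '%s\n' "$1" | wc -c | tr -d ' '
}

# Length as awk sees it: the newline ends the record and is not part of $0.
awk_length() {
    printf '%s\n' "$1" | awk '{ print length($0) }'
}

# Bytes after unquoted expansion: runs of blanks collapse to one space.
collapsed_bytes() {
    line=$1
    # $line is deliberately left unquoted to reproduce the word splitting
    echo $line | wc -c | tr -d ' '
}

bytes_with_newline 'test'     # 5
awk_length 'test'             # 4
bytes_with_newline 'a    b'   # 7
collapsed_bytes 'a    b'      # 4
```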

Regards,
Muthukumar.
Easy to suggest when don't know about the problem!