Belinda Dermody
Super Advisor

counting the number of times a word appears in a file

I have an extracted log file in which the word smtp or SMTP appears once or several times per line. Is there a way to count how many times smtp appears in the log? I can't just count the lines, because it appears more than once on some of them...
Pete Randall
Outstanding Contributor

Re: counting the number of times a word appears in a file

grep -i smtp file | wc -w

ought to do it.


Pete

Belinda Dermody
Super Advisor

Re: counting the number of times a word appears in a file

Sorry Pete, I tried that earlier, it grabs the lines with smtp and counts all the words in the line...
baiju_3
Esteemed Contributor

Re: counting the number of times a word appears in a file

If smtp or SMTP is always set off by some delimiter, you can write a script that splits each line into words, compares each word against smtp, and increments a counter on every match.
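A minimal sketch of that counter loop, assuming every non-alphanumeric character counts as a delimiter and that only the exact word smtp (in any case) should match. The sample log lines and the variable names `log` and `count` are made up for illustration.

```shell
# Hypothetical sample log; point "log" at the real extracted file instead.
log=$(mktemp)
printf '%s\n' \
  'Received: from mx1 by SMTP; relay=smtp%smtp' \
  'no match here' \
  'status=sent (smtp) ESMTP ok' > "$log"

count=0
# Turn every run of non-alphanumeric characters into a newline (one word
# per line), lowercase the words, then compare each one against "smtp".
for word in $(tr -cs '[:alnum:]' '\n' < "$log" | tr '[:upper:]' '[:lower:]'); do
  if [ "$word" = smtp ]; then
    count=$((count + 1))
  fi
done

echo "$count"   # 4 for the sample above; ESMTP is a different word
rm -f "$log"
```

A shell loop like this is slow on large logs; it is only meant to show the idea.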


Thanks,
BL.

Good things Just Got better (Plz,not stolen from advertisement -:) )
Pete Randall
Outstanding Contributor

Re: counting the number of times a word appears in a file

James,

ARGHH! You're right of course. I remember a similar question a month or so ago but I can't find it at the moment and don't remember what the answer was.


Pete

Jeff Schussele
Honored Contributor

Re: counting the number of times a word appears in a file

Hi James,

You can still use grep & wc -w
Save the grepped lines in a tmp file
Then just read through that file in a loop and increment a counter from the wc -w output.
Should work.

Rgds,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Belinda Dermody
Super Advisor

Re: counting the number of times a word appears in a file

There is no special delimiter and the fields are not fixed; there could be a %, a ; or even whitespace before the smtp or SMTP.

Jeff, I do not understand your response. Even if I throw all the lines with either smtp or SMTP into an output file, I will still have all the other words (mail stuff) that would come up in the wc command.
curt larson_1
Honored Contributor

Re: counting the number of times a word appears in a file

a quick awk script

awk '{
    # count every whitespace-separated field
    for (i = 1; i <= NF; i++)
        num[$i]++
}
END {
    for (word in num)
        print word, num[word]
}' file | grep -i smtp
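The script above counts whole whitespace-separated fields, so a field such as smtp; or relay=smtp gets its own row and the rows still have to be added up by hand. A possible variant, sketched here under the assumption that every case-insensitive occurrence of the substring smtp should count, lets awk's gsub() do the counting, since gsub() returns the number of substitutions it made. The sample log and the names `log` and `total` are hypothetical.

```shell
# Hypothetical sample log; point "log" at the real extracted file instead.
log=$(mktemp)
printf '%s\n' \
  'Received: from mx1 by SMTP; relay=smtp%smtp' \
  'no match here' \
  'status=sent (smtp) ESMTP ok' > "$log"

# Lowercase a copy of each line, then add up how many times gsub()
# replaced "smtp" in that copy. "n + 0" prints 0 if nothing matched.
total=$(awk '{ line = tolower($0); n += gsub(/smtp/, "", line) }
             END { print n + 0 }' "$log")

echo "$total"   # 5 for the sample above; the smtp inside ESMTP counts too
rm -f "$log"
```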
curt larson_1
Honored Contributor

Re: counting the number of times a word appears in a file

another method

tr "[:upper:]" "[:lower:]" < file |
tr -cs "a-z0-9'" "\012" | sort |
uniq -c | sort -k1,1nr -k2d

convert all uppercase to lowercase
replace every character that is not a-z, 0-9 or ' with a newline, which leaves one word per line
sort, because uniq expects sorted input
uniq -c counts how many times each word appears; the final sort orders the words from most to least frequent, then alphabetically
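If only the smtp total is wanted, the same one-word-per-line idea can feed grep -c instead of sort and uniq. This is a sketch rather than a tested recipe for any particular log format; the sample log and the names `log` and `words` are made up, and it counts smtp only when it stands as a whole word.

```shell
# Hypothetical sample log; point "log" at the real extracted file instead.
log=$(mktemp)
printf '%s\n' \
  'Received: from mx1 by SMTP; relay=smtp%smtp' \
  'no match here' \
  'status=sent (smtp) ESMTP ok' > "$log"

# Lowercase, split into one word per line, then count the lines that
# consist of exactly "smtp".
words=$(tr '[:upper:]' '[:lower:]' < "$log" |
        tr -cs "a-z0-9'" '\n' |
        grep -c '^smtp$')

echo "$words"   # 4 for the sample above
rm -f "$log"
```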
Stephen Keane
Honored Contributor

Re: counting the number of times a word appears in a file

One way (though you'll have to change it to cope with uppercase SMTP too).

Enter it exactly as typed, including the '\' !!


# sed -e 's/smtp/smtp\
/g' your_file | grep -i "smtp" | wc -l

maybe use tr in there to convert SMTP to smtp?

changes "smtp" into "smtp\n" so each smtp is on a separate line, then uses grep/wc to count them. Just a thought.
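On a system with GNU grep (an assumption; the -o option is not available in every grep), the same "one match per line, then count lines" idea can be sketched without sed. The -o option prints each match on its own line and -i ignores case, so uppercase SMTP needs no extra step. The sample log and the names `log` and `hits` are hypothetical.

```shell
# Hypothetical sample log; point "log" at the real extracted file instead.
log=$(mktemp)
printf '%s\n' \
  'Received: from mx1 by SMTP; relay=smtp%smtp' \
  'no match here' \
  'status=sent (smtp) ESMTP ok' > "$log"

# -o emits every match on a separate line, -i ignores case;
# wc -l then counts the matches rather than the matching lines.
hits=$(grep -o -i 'smtp' "$log" | wc -l)

echo "$hits"   # 5 for the sample above; the smtp inside ESMTP counts too
rm -f "$log"
```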