Operating System - HP-UX

script to count the no. of a particular word

 
devhariprasad
Advisor

script to count the no. of a particular word

I am trying to write a script to count the number of occurrences of a word (say "swinstall") in the files in a particular directory.

I tried doing this using "wc" and "grep" commands but was unsuccessful.

Please help me out.
15 REPLIES
Steven E. Protter
Exalted Contributor

Re: script to count the no. of a particular word

Shalom,

Post the code you are using please.

grep swinstall * | wc -l

That will get you close.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Yogeeraj_1
Honored Contributor

Re: script to count the no. of a particular word

hi.

if you have one occurrence per line,
you can also try:

awk '/swinstall/{n++}; END {print n+0}' *.txt

hope this helps too!
kind regards
yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)
devhariprasad
Advisor

Re: script to count the no. of a particular word

grep swinstall * | wc -l

This will not serve the purpose. If a line contains the word swinstall 3 times, it is counted only once. I want those 3 occurrences on a single line to be counted as 3.

Below is the code that I used

for i in *
do
    count=`grep -c swinstall "$i"`
    echo "$i has $count swinstalls"
done

This code has the same problem as the code you gave: it counts matching lines, not individual occurrences.
devhariprasad
Advisor

Re: script to count the no. of a particular word

Hi Yogeeraj,

Suppose the following are the contents of the file
---------------------------------------------
swinstall
swinstall use
i am called swinstall
swinstall is called swinstall everywhere else
---------------------------------------------

Your awk script will count the number of occurrences of swinstall as 4, whereas the answer I am expecting is 5. The last line has 2 occurrences of swinstall, but it is counted as just one.
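For reference, here is a minimal sketch of counting every match instead of every matching line, assuming a POSIX awk. gsub() returns the number of substitutions it makes on a line, so adding that up gives the total. The function name count_matches and the temporary file name are made up for illustration. Note that this is a plain substring match, so it would also count "swinstalled".

```shell
#!/bin/sh
# Hypothetical helper: count every occurrence of a pattern in the given files
# (or in standard input when no file is given).
# gsub() returns how many replacements it made on the current line;
# replacing each match with itself ("&") leaves the text unchanged.
count_matches() {
    pattern=$1
    shift
    awk -v pat="$pattern" '{ n += gsub(pat, "&") } END { print n + 0 }' "$@"
}

# Sample contents from the post above; the file location is an assumption.
sample=${TMPDIR:-/tmp}/swinstall_sample.$$
cat > "$sample" <<'EOF'
swinstall
swinstall use
i am called swinstall
swinstall is called swinstall everywhere else
EOF

count_matches swinstall "$sample"   # prints 5
rm -f "$sample"
```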
Wouter Jagers
Honored Contributor

Re: script to count the no. of a particular word

I would try this:

# sed -n 's/\(\<swinstall\>\)/\n\1\n/gp' myfile | grep '^swinstall$' | wc -l

Cheers,
Wout
an engineer's aim in a discussion is not to persuade, but to clarify.
Wouter Jagers
Honored Contributor

Re: script to count the no. of a particular word

Sorry, seeing that line I think I should clarify :-)

The sed command will just look for any occurrence of 'swinstall' and put each occurrence on a line of its own (adding linebreaks before and after). Then, it becomes possible to grep for lines which hold only this word and count them.
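In case the local sed does not accept \n in the replacement, the same idea (one word per line, then count the exact matches) can be sketched with tr and grep. This is only an illustration, assuming POSIX tr and grep. count_word is a made-up name, and a "word" here means a run of letters, digits and underscores.

```shell
#!/bin/sh
# Hypothetical helper: count whole-word occurrences of $1 in the given files
# (or in standard input when no file is given).
count_word() {
    word=$1
    shift
    # tr -cs turns every run of non-word characters into a single newline,
    # which leaves one word per line. grep -x matches whole lines only,
    # -F treats the word as a fixed string, -c prints the number of matches.
    # grep exits non-zero when the count is 0; that is not an error here.
    cat -- "$@" | tr -cs '[:alnum:]_' '[\n*]' | grep -cxF -e "$word" || :
}

printf 'swinstall is called swinstall everywhere else\n' | count_word swinstall   # prints 2
```

Because only exact lines are counted, "swinstalled" is ignored while "swinstall," and "(swinstall)" still count.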

Cheers,
Wout
an engineer's aim in a discussion is not to persuade, but to clarify.
Wouter Jagers
Honored Contributor

Re: script to count the no. of a particular word

Poured it into a little script to check a complete directory, as per your original question. Note that the directory should -only- contain text files ;-)

I guess it's a start, whatever your ultimate plan is.



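#!/bin/sh
# Counts the number of occurrences of MYSTRING in each file within MYDIR.
# Relies on a sed that understands \< \> and \n in the replacement (GNU sed does).
#
MYDIR="/root/tmp"
MYSTRING="swinstall"

# Loop over full paths so the script works from any current directory.
for i in "$MYDIR"/*
do
    n=`sed -n "s/\(\<${MYSTRING}\>\)/\n\1\n/gp" "$i" | grep -c "^${MYSTRING}$"`
    echo "$i has $n occurrences of $MYSTRING"
done

exit 0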
an engineer's aim in a discussion is not to persuade, but to clarify.
James R. Ferguson
Acclaimed Contributor

Re: script to count the no. of a particular word

Hi:

To count the number of occurrences of a pattern ("word") you can use:

# perl -lne 'BEGIN{$count=0};$count++ while (m/\bswinstall\b/ig);END{print $count}' file

This counts the total number of occurrences of the word "swinstall" in 'file', ignoring case.

You can total the count for any number of files by passing multiple file arguments.
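If Perl is not at hand, a rough shell sketch of a per-file count plus a grand total for a directory could look like the following. It assumes a POSIX shell, tr and grep. The name count_dir and the demo directory are made up for illustration, and unlike the Perl one-liner this match is case-sensitive.

```shell
#!/bin/sh
# Hypothetical helper: print "file: count" for every regular file in a
# directory, followed by a grand total of whole-word matches.
count_dir() {
    word=$1
    dir=$2
    total=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue   # skip subdirectories and an unmatched glob
        # One word per line, then count the lines that are exactly the word.
        n=$(tr -cs '[:alnum:]_' '[\n*]' < "$f" | grep -cxF -e "$word")
        echo "$f: $n"
        total=$((total + n))
    done
    echo "total: $total"
}

# Small demonstration in a throwaway directory (location is an assumption).
demo=${TMPDIR:-/tmp}/count_demo.$$
mkdir -p "$demo"
printf 'swinstall and swinstall\n' > "$demo/a.txt"
printf 'no match here\n' > "$demo/b.txt"
count_dir swinstall "$demo"   # a.txt has 2, b.txt has 0, total is 2
rm -rf "$demo"
```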

Regards!

...JRF...
john korterman
Honored Contributor

Re: script to count the no. of a particular word

Hi,

try the attached script for a start: $1="string to search for", $2="file to search".
The count is faked, but it may produce the correct result.

regards,
John K.
it would be nice if you always got a second chance