shell script problem

Landen · ‎07-20-2001

I was wondering how to write a script that accepts a pattern and a filename as arguments and then counts the number of occurrences of the pattern in the file. (A pattern may occur more than once in a line and consists only of alphanumeric characters and underscores.)

Any assistance would be appreciated!!

Arrivederci ...

Kevin Wright · ‎07-20-2001

You could do something like:
grep "$1" "$2" | wc -l

Charles McCary · ‎07-20-2001

grep -c pattern filename
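
Note that grep -c, like piping grep into wc -l, counts matching lines, so a line that contains the pattern twice is still counted once. A minimal sketch of one way to count every occurrence follows. It assumes a grep that supports the -o option (GNU and BSD grep do, but it is not in POSIX), and the function name count_occurrences is made up for illustration.

```shell
#!/bin/sh
# count_occurrences PATTERN FILE   (hypothetical helper name)
# Prints the total number of matches of PATTERN in FILE, counting
# repeated matches on the same line separately.
# Assumption: grep supports -o (non-POSIX; present in GNU and BSD grep).
count_occurrences() {
    # -o prints each match on its own line, so wc -l counts matches.
    # tr strips the padding some wc implementations put around the number.
    grep -o -e "$1" -- "$2" | wc -l | tr -d ' '
}
```

For a file containing the two lines "foo bar foo" and "foo_foo", count_occurrences foo file prints 4, where grep -c foo file would print 2.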

Sachin Patel · ‎07-20-2001

Hi
Do you want to write it in Perl?

#!/usr/local/bin/perl
$filename=shift; #first argument is filename
$pattern=shift; #second argument is pattern
$count=0;

open (FILE,"$filename") || die "can't open $filename: $!";

while (<FILE>)
{
    if(/$pattern/)
    {
        $count++; #counts matching lines; put your logic here
    }
}
close(FILE);
print "$count\n";

This will get you started.

Sachin

Is photography a hobby or another way to spend $

A. Clay Stephenson · ‎07-20-2001

Hi,

This is a fairly interesting one in that you have to examine each line for possible multiple matches. My solution is to call awk as a 'here doc' and use the gsub function to substitute the target string with a nonsense string. gsub returns the number of substitutions it made, and hence the number of pattern matches on that line. We don't save the altered line, so no harm is done.

Usage: count.sh target file1 file2 ...
will list each file and the count of matches
or count.sh target < stdin will read stdin and count the number of pattern matches.
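
The attached script did not survive in this copy of the thread, so what follows is only a rough sketch of the gsub idea under the usage described above, not the original count.sh. The function names count_stream and count_matches are invented for illustration, awk is called inline rather than as a here doc, and the replacement string is "&" (the matched text itself) instead of a nonsense string, which leaves each line unchanged.

```shell
#!/bin/sh
# Hypothetical sketch of the gsub-based counter; not the original attachment.

# count_stream TARGET : count matches of TARGET on standard input.
count_stream() {
    # gsub() returns how many substitutions it made on the current line;
    # replacing each match with "&" (itself) keeps the line intact.
    awk -v pat="$1" '{ n += gsub(pat, "&") } END { print n + 0 }'
}

# count_matches TARGET [FILE ...]
# With files: print "FILE: COUNT" for each one. Without: count stdin.
count_matches() {
    target=$1
    shift
    if [ "$#" -eq 0 ]; then
        count_stream "$target"
    else
        for f in "$@"; do
            printf '%s: %s\n' "$f" "$(count_stream "$target" < "$f")"
        done
    fi
}
```

Saved as a script that ends with count_matches "$@", this would be called the same way as described above. The target is treated as an awk regular expression, which is harmless for patterns made only of letters, digits and underscores.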

Enjoy, Clay

If it ain't broke, I can fix that.

Curtis Larson_1 · ‎07-20-2001

Here is a script from Unix Power Tools (O'Reilly). I'm sure you can adapt it to your situation. Basically, use tr to put each word on a separate line, then count the lines that match. Something like:

cat your_file | tr -cs "[:alnum:]_" "[\012*]" | grep -c your_pattern

#! /bin/sh
### wordfreq - count number of occurrences of each word in input
### Usage: wordfreq [-i] [files]
#
# ** CONFIGURATION NOTE **: See comments above second "tr" command below
#
## wordfreq counts the number of occurrences of each word in its input.
## If you give it files, it reads from them; otherwise it reads stdin.
## The -i option folds upper case into lower case (capitalized letters
## will count the same as lower-case).
#
# Adapted from "concordance", which Carl Brandauer posted to USENET.

# Different versions are a pain... :-(
case "$1" in
-i) shift
    tr1="[a-z]"
    tr2bsd="a-z'" tr2sys5="[a-z]'"
    ;;
*)  # no case conversion
    tr1="[A-Z]"
    tr2bsd="A-Za-z'" tr2sys5="[A-Z][a-z]'"
    ;;
esac

cat ${1+"$@"} |                  # Work around problem with "$@" in some shells
tr "[A-Z]" "$tr1" |              # Convert upper case to lower if -i option
#
# NOTE: If you use Berkeley tr(1), comment out the second tr command and
# uncomment the first tr command:
#
#tr -cs "$tr2bsd" "\012" |
tr -cs "$tr2sys5" "[\012*]" |    # Replace all characters not a-z or ' with
                                 # a new line. i.e. one word per line
sort |                           # uniq expects sorted input
uniq -c |                        # Count the number of times each word appears
sort -k1,1nr -k2d                # Sort first from most to least frequent,
                                 # then alphabetically
                                 # (the book used the obsolete form: sort +0nr +1d)
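
To make the behaviour of the one-line tr pipeline at the top of this post concrete, here is a small sketch that wraps it in a function. The name count_words_matching is invented for illustration. Because grep -c counts lines and each line now holds one word, the result is the number of words that contain the pattern, not the number of occurrences: a word such as foo_foo counts once for the pattern foo.

```shell
#!/bin/sh
# count_words_matching PATTERN FILE   (hypothetical helper name)
# Splits FILE into one word per line (a word being a run of
# alphanumerics and underscores) and counts the words containing PATTERN.
count_words_matching() {
    # -c complements the set, -s squeezes runs of the replacement newline,
    # so every run of non-word characters becomes a single newline.
    tr -cs '[:alnum:]_' '[\012*]' < "$2" | grep -c -e "$1"
}
```

For a file containing the two lines "foo bar foo" and "foo_foo", count_words_matching foo file prints 3.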

James R. Ferguson · ‎07-20-2001

Hi:

Very interesting problem! Here's a small, unembellished script which returns the count of the matched pattern.

#!/usr/bin/sh
typeset P="$1"
typeset F="$2"
# Collect each distinct whitespace-separated word, then count the words matching P
R=`awk -v P="$P" '{for (i=1;i<=NF;i++) ary[$i]=1}
END{ for (S in ary) if (S~P) {k=k+1}; print k+0}' "$F"`
echo $R
#_end.

Call the script "my.sh" and execute this:

# ./my.sh local /etc/hosts

...This will print "1" for having found the string "localhost" in /etc/hosts.

# ./my.sh lo /etc/hosts

...will print "3" since "lo" matched "loghost" on one line, and both "localhost" and "loopback" together on another line.

Regards!

...JRF...

A. Clay Stephenson · ‎07-20-2001

Hi James,

Nice solution. We should probably suggest that both of our scripts feed awk from a grep command. Grep is generally faster, and awk would then only need to examine the lines that already matched for multiple occurrences.
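
A minimal sketch of that combination, assuming the gsub-style counting described earlier in the thread; the function name count_prefiltered is made up for illustration. grep treats the pattern as a basic regular expression and awk as an extended one, which makes no difference for patterns of letters, digits and underscores.

```shell
#!/bin/sh
# count_prefiltered PATTERN FILE   (hypothetical helper name)
# grep throws away the lines that cannot contribute to the count;
# awk then counts every match on the lines that are left.
count_prefiltered() {
    grep -e "$1" -- "$2" |
    awk -v pat="$1" '{ n += gsub(pat, "&") } END { print n + 0 }'
}
```

Whether the prefilter actually pays off depends on how many lines of the file match, so it is worth timing on real data before relying on it.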

Regards, Clay

If it ain't broke, I can fix that.

James R. Ferguson · ‎07-21-2001

Hi Clay:

Thanks. Your suggestion to use 'grep' to do the initial filtering is a nice touch. In my script's case, the array built for subsequent evaluation would be greatly reduced in size. I find that intrinsically appealing having grown up in environments where it was cheaper to invest programming time to gain performance than to "throw more hardware" at the problem.

Regards!

...JRF...
