count occurrence of a regex in a very long line

Mike_Ca Li · ‎11-21-2005

I have a very long line of about 10 thousand words, many of which match "regexpression". How can I use a script command to get the count for "regexpression"? Note that one way is to break the long line into lots of shorter lines, using space as the separator, but that is not efficient. Thank you.

eg: quick brown fox jumps on the lazy dog quick brown fox jumps on the lazy dog quick brown fox jumps on the lazy dog quick brown fox jumps on the lazy dog quick brown fox jumps on the lazy dog quick brown fox jumps on the lazy dog quick brown fox jumps on the lazy dog quick brown fox jumps on the lazy dog quick brown fox jumps on the lazy dog
I need the count for "jumps" for instance.

Steven E. Protter · ‎11-21-2005

Shalom Mike,

Perhaps process the string with awk?

echo $string | awk -F '{print $1 ... }'

Maybe something with xargs, not sure.

SEP

Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com

Steven E. Protter · ‎11-21-2005

Mike,

Ah, forget that stuff in my first post, unless it is unexpectedly helpful.

store the data in a file:

grep jumps filename | wc -l

Gets you count of jumps.

SEP


Sandman! · ‎11-21-2005

How about a short one-line awk construct like...

# awk -F"jumps" '{cnt+=(NF-1)} END{print cnt}'

regards!

Mike_Ca Li · ‎11-21-2005

Thanks for reply SEP.
I tried
awk -F" " '{print NF}' filename
awk: record ... too long. Any other ideas?

Vincent Fleming · ‎11-21-2005

Steven,

Having a bad day? (this isn't like you!)

grep jumps filename | wc -l

will tell you how many lines "jumps" appears on, not how many times it appears in one line.

I can't think of an elegant way of doing this, given the regular expression requirement... UNIX shell tools are all pretty much line-based.

You could use 'sed' to change spaces to \r (i.e., so that each word is on its own line), then pipe it through grep|wc or awk... or something similar, but that'll work only if the regex you're looking for does not contain spaces.
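
To make the difference between counting lines and counting words concrete, here is a minimal sketch. The sample line is made up for illustration, and the split uses tr with a newline as the separator, which is an assumption about what is wanted here.

```shell
# Made-up sample: one long line containing "jumps" three times.
line='quick brown fox jumps on the lazy dog quick brown fox jumps on the lazy dog quick brown fox jumps'

# grep selects matching LINES, so a single line yields 1 however many hits it holds.
lines_matched=$(printf '%s\n' "$line" | grep jumps | wc -l | tr -d ' ')

# Putting each word on its own line first turns the line count into a word count.
words_matched=$(printf '%s\n' "$line" | tr ' ' '\n' | grep -c jumps)

echo "lines: $lines_matched  words: $words_matched"
```

The first number stays at 1 no matter how long the line is; only the second one reflects how often the word appears.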

Of course, there's always a C program - that's ALWAYS elegant!

Regards,

Vince

No matter where you go, there you are.

James R. Ferguson · ‎11-21-2005

Hi Mike:

# perl -lne '$i++ while m/jumps/g;END{print $i}'

Regards!

...JRF...

Vincent Fleming · ‎11-21-2005

Better idea, use "tr" instead of "sed"...

oh - and I also meant \n, not \r...

echo "whatever" | tr ' ' '\n' | grep jumps | wc -l

"tr" is smaller and more efficient than sed since it does so little...
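
One caveat with the word-per-line approach: it counts words that contain a match, so a word holding the pattern twice is counted once. The sketch below compares it with grep -o, which prints each match on its own line. Note that -o is a GNU/BSD extension rather than POSIX, so assuming it is available is exactly that, an assumption; older vendor greps may lack it. The sample line is hypothetical.

```shell
# Hypothetical sample: the second word contains the pattern twice.
line='jumps jumpsjumps fox'

# Word-per-line approach: counts WORDS that contain at least one match.
by_word=$(printf '%s\n' "$line" | tr ' ' '\n' | grep -c jumps)

# grep -o emits every match on its own line (assumes a grep that supports -o).
by_match=$(printf '%s\n' "$line" | grep -o jumps | wc -l | tr -d ' ')

echo "words containing a match: $by_word  total matches: $by_match"
```

For whitespace-separated words where the pattern can match at most once per word, both numbers agree.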

-Vince

James R. Ferguson · ‎11-21-2005

Hi (again) Mike:

I might add (of course) that you can pipe input to the perl code or specify the filename as an argument:

# perl -lne '$i++ while m/jumps/g;END{print $i}' filename

# echo ... | perl -lne '$i++ while m/jumps/g;END{print $i}'

Regards!

...JRF...

Hein van den Heuvel · ‎11-21-2005

Another perl way, readily allowing for a regexp:

perl -ne 'print scalar split (/jumps/)."\n"' filename

Of course this prints one too many.

So fix to:

perl -ne '$x=scalar split(/jumps/) -1; print "$x\n"' filename

Or, for a single total across many words on many lines:

perl -ne '$x+=scalar split(/jumps/) -1;END{print "$x\n"}' filename

Hein.

RAC_1 · ‎11-21-2005

No perl, no awk, just plain OS built-in commands.

Are the words on the line separated by a space? If yes, do as follows.

tr " " "\n" < your_file | grep -ic "your_reg_exp"

There is no substitute to HARDWORK

Mike_Ca Li · ‎11-22-2005

Thanks a lot for all the commands and suggestions.
I tested all the perl and OS commands. The plain OS commands run faster.
