PERL pattern matching

Pat Tom
Occasional Contributor

PERL pattern matching

I have a flat file which contains hundreds of patterns. I need to write a PERL script that parses this file and returns each unique pattern along with the count of its occurrences.

I have been trying for quite some time but have not been able to code it.

Can anyone please help?

Thanks
Pat
3 REPLIES
H.Merijn Brand (procura
Honored Contributor

Re: PERL pattern matching

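#!/opt/perl/bin/perl

use strict;
use warnings;

# Count how many times each distinct line occurs.
my %pat;
while (<>) {
    chomp;    # so a last line without a newline still matches its duplicates
    $pat{$_}++;
}
# Print each unique line, sorted, preceded by its count.
foreach my $pat (sort keys %pat) {
    printf "%6d %s\n", $pat{$pat}, $pat;
}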

Enjoy, Have FUN! H.Merijn
Dennis Handly
Acclaimed Contributor

Re: PERL pattern matching

What do you mean by patterns? A file can contain tokens or words, but regular expression (RE) patterns are something only a human can identify. After all, a single pattern can be written that matches every English word.

If you want to search for space-delimited alphabetic tokens, you can use:
$ tr -cs "[A-Z][a-z]" "[\012*]" < file | sort | uniq -c
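
In that pipeline, tr -cs turns every run of non-letters into a single newline, so each token lands on its own line before sort and uniq -c count them. Since the question asked for Perl, the same idea might be sketched roughly as follows. This is only an approximation that assumes ASCII letters; the helper name count_alpha_tokens and the sample input are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count runs of ASCII letters, treating every
# other character as a separator (as tr -cs does in the pipeline).
sub count_alpha_tokens {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        $count{$_}++ for $line =~ m{[A-Za-z]+}g;
    }
    return \%count;
}

# Made-up sample input.
my $counts = count_alpha_tokens("foo bar42 foo\n", "Bar,baz\n");

# Same layout as uniq -c: the count first, then the token.
printf "%4d %s\n", $counts->{$_}, $_ for sort keys %$counts;
```

Like the tr version, this is case-sensitive, so "Bar" and "bar" are counted separately, and the digits in "bar42" are dropped.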
James R. Ferguson
Acclaimed Contributor

Re: PERL pattern matching

Hi Pat:

Merijn's script gives you the solution based on the contents of a *line*. If your lines contain multiple "words", this variation counts each unique word instead.

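#!/usr/bin/perl
use strict;
use warnings;

my %pat;
while (<>) {
    my @a = m{\w+}g;      # every run of word characters on this line
    $pat{$_}++ for @a;    # count each word
}
# Print each unique word, sorted, preceded by its count.
foreach my $pat (sort keys %pat) {
    printf "%6d %s\n", $pat{$pat}, $pat;
}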

The \w regular expression matches an alphanumeric character or an underscore, but not a hyphen, quote, comma, semicolon, colon, etc.
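
To see what that means for the counts, here is a minimal sketch. The helper name count_words and the sample string are invented for illustration and are not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count each \w+ token in a string.
sub count_words {
    my ($text) = @_;
    my %count;
    $count{$_}++ for $text =~ m{\w+}g;
    return \%count;
}

# Made-up sample line: note the hyphen, the apostrophe and the underscore.
my $counts = count_words("well-known don't my_var well");

foreach my $word (sort keys %$counts) {
    printf "%6d %s\n", $counts->{$word}, $word;
}
```

The hyphen and the apostrophe split "well-known" and "don't" into separate tokens ("well", "known", "don", "t"), while the underscore keeps "my_var" whole. If your patterns contain such characters, the character class would need to be adjusted.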

As for writing the word "PERL" -- don't -- we are speaking of the "Perl" language:

http://www.perl.org/about/style-guide.html

You have previously posted another question about pattern matching for which solutions were provided. You forgot to score those solutions. It would be appreciated:

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1100015

Regards!

...JRF...