Re: Matching first line with a specific string

OFC_EDM · ‎03-10-2006

Mar 1
Mar 1
Mar 2
Mar 2

Above is a simplified version of the text I'm processing.

What I want to do is surround the month and day with the tags "<h1>" and "</h1>", but only for the first occurrence.

Below is the command I've tried, but it's putting the tags around ALL occurrences of "Mar 1".

How do I make it match just the first occurrence?

cat test|sed -e '1,$s/Mar 1/\<h1\>&\<\/h1\> /'

========== START OUTPUT======
<h1>Mar 1</h1>
<h1>Mar 1</h1>
Mar 2
Mar 2
========== END OUTPUT======

The Devil is in the detail.

James R. Ferguson · ‎03-10-2006

Hi Kevin:

# sed -e '1,1s/Mar 1/\<h1\>&\<\/h1\> /' file

Note that you don't have to spawn a separate process to first read the file (i.e. the 'cat' piped to 'sed').

Regards!

...JRF...

Carlos Roberto Schimidt · ‎03-10-2006

Don't use "$"; see above.

$ cat test | sed -e '1s/Mar 1/\<h1\>&\<\/h1\> /'
<h1>Mar 1</h1>
Mar 1
Mar 2
Mar 2
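
For anyone following along: a numeric address such as `1` limits the command to that input line, whereas `1,$` covers every line, and `s` without the `g` flag replaces the first match on each addressed line rather than the first match in the file. Here is a small sketch of the difference, assuming a POSIX sed and made-up input; the function names `tag_every_line` and `tag_line_one` are illustrative only.

```shell
# Illustrative helpers, not taken from the original posts.

# 1,$ addresses every line, so each line containing "Mar 1" is tagged.
tag_every_line() { sed -e '1,$s/Mar 1/<h1>&<\/h1>/'; }

# 1 addresses only input line 1; later lines are never touched.
tag_line_one() { sed -e '1s/Mar 1/<h1>&<\/h1>/'; }

printf 'Mar 1\nMar 1\nMar 2\n' | tag_every_line
printf 'Mar 1\nMar 1\nMar 2\n' | tag_line_one
printf 'Feb 28\nMar 1\nMar 1\n' | tag_line_one   # line 1 has no match
```

The last call tags nothing, because the first "Mar 1" is not on line 1.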

Victor Fridyev · ‎03-10-2006

Hi,

For the exact input as in your msg, I'd try the following:

awk 'BEGIN {flag=""}
{if(flag!=$0){print "<h1>",$0,"</h1>";flag=$0}
else print $0}' inputfile

HTH

Entities are not to be multiplied beyond necessity - RTFM

Sandman! · ‎03-10-2006

Kevin,

How about sorting the input first and then piping the result to awk, which remembers the previous month and day and tags only the first line of each new date?

# sort -k1,2 inp | awk '{if(prev!=$1$2){print b$1,$2e;prev=$1$2}else print $0}' b="<h1>" e="</h1>"

cheers!

James R. Ferguson · ‎03-10-2006

Hi (again) Kevin:

It occurs to me that I took your post a bit too literally.

Should you want only the first matching line of *any* block, for example to tag the first "Mar 2":

Mar 1
Mar 1
<h1>Mar 2</h1>
Mar 2
Mar 3

...then:

# sed -e 's/Mar 2/\<h1\>&\<\/h1\>/;n' file

This works for the original case too, but is appropriately general to the span of the entire file.
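
A note on the mechanics, as I understand them: `n` prints the current pattern space and loads the next input line, which then falls off the end of the script untouched. So the substitution is only ever attempted on every other line. A minimal sketch, assuming a POSIX sed and sample input I made up (the function name `tag_alternate` is hypothetical):

```shell
# Illustrative helper, not part of the original post.
tag_alternate() {
    # s/// runs on the current line; n prints it and reads the next line,
    # which reaches the end of the script unchanged and is printed as-is.
    sed -e 's/Mar 2/<h1>&<\/h1>/;n'
}

printf 'Mar 1\nMar 1\nMar 2\nMar 2\nMar 3\nMar 3\n' | tag_alternate
```

This relies on each date starting on an odd-numbered line; if the blocks are not aligned that way, a different line ends up tagged.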

Regards!

...JRF...

OFC_EDM · ‎03-10-2006

I wasn't clear enough.

Basically I'm putting in Headers for each Day of a month from a log.

So using my previous example the output would be:

<h1>Mar 1</h1>
Mar 1
<h1>Mar 2</h1>
Mar 2
etc...for the rest of the days.

So really my pattern to match is
"Mar [ 0-9][0-9]" which would match Mar 1 to Mar 31.

Preceding the substitution with 1 (1s/Mar 1/) just matches the first line of input and not the first occurrence, so that's not doing what I need.

Any more thoughts?

The Devil is in the detail.

OFC_EDM · ‎03-10-2006

Thanks James. I was surprised to find your answer posted prior to my attempted clarification.

The Devil is in the detail.

James R. Ferguson · ‎03-10-2006

Hi (again) Kevin:

Well, the 'sed' solution I posted worked for the particular repetition of lines you posted, but it *fails* for more generalized matching. I should have gone straight for Perl.

Consider a file that looks like:

Mar 1
Mar 1
Mar 1
Mar 2
Mar 2
Mar 2
Mar 2
Mar 3
Mar 4
the end at Mar 4 the end!

With the 'sed' solution:

# sed -e 's/Mar [0-9][0-9]*/\<h1\>&\<\/h1\>/;n' file

...you get:

<h1>Mar 1</h1>
Mar 1
<h1>Mar 1</h1>
Mar 2
<h1>Mar 2</h1>
Mar 2
<h1>Mar 2</h1>
Mar 3
<h1>Mar 4</h1>
the end at Mar 4 the end!

...which is not much use.

*Instead* use:

# perl -ple chomp;unless ($prev=~$_) {s%(.*)(Mar \d+)(.*)%$1<h1>$2</h1>$3%};$prev=$_' file

...and you can have:

<h1>Mar 1</h1>
Mar 1
Mar 1
<h1>Mar 2</h1>
Mar 2
Mar 2
Mar 2
<h1>Mar 3</h1>
<h1>Mar 4</h1>
the end at <h1>Mar 4</h1> the end!

Regards!

...JRF...

OFC_EDM · ‎03-13-2006

Hi JRF

I get a `(' unexpected error when running that line of perl.

Is there a ( missing?

The Devil is in the detail.

OFC_EDM · ‎03-13-2006

JRF

I have text after the "Mar 1" or "Mar 22" etc.

When I run the perl against it, the <h1> and </h1> are put around the Date but for EVERY line. I just want the header for the first occurrence of each Date.

I'm stumped. Your previous sed example worked on my test data but not when I plug it into my script.

I'll just keep plugging away until this works.

Example:

<h1>Mar 13</h1> 06:53:28 hpkju gconfd (root-28106): starting (version 1.0.9), pid 28106 user 'root'
<h1>Mar 13</h1> 06:53:27 hpkju gnome-name-server[28098]: starting

The Devil is in the detail.

James R. Ferguson · ‎03-13-2006

Hi Kevin:

It looks like the paste failed. Try this:

#!/usr/bin/perl
while (<>) {
    chomp;
    unless ($prev =~ $_) {
        s%(Mar \d+)%<h1>$1</h1>%;
    }
    $prev = $_;
    print "$_\n";
}

Regards!

...JRF...

James R. Ferguson · ‎03-13-2006

Hi (again) Kevin:

# perl -ple 'chomp;unless ($prev=~$_) {s%(.*)(Mar\s+\d+)(.*)%$1<h1>$2</h1>$3%};$prev=$_' file

...the first single quote was simply dropped.

I have added, both for clarity and for better matching, one or more whitespace characters (\s+) after the "Mar" in the regular expression. You could change the prior post too.
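
To illustrate why the flexible whitespace matters: syslog-style timestamps usually pad single-digit days with an extra space ("Mar  1"), and a single literal space will not match that. A rough sketch using `grep -E`, with a plain space followed by `+` standing in for Perl's `\s+`; the function names and the sample lines are my own assumptions:

```shell
# Illustrative helpers, not from the original post.

# Exactly one space between the month and the day.
match_single_space() { grep -E '^Mar [0-9]+'; }

# One or more spaces between the month and the day.
match_any_space() { grep -E '^Mar +[0-9]+'; }

printf 'Mar  1 boot\nMar 13 login\n' | match_single_space
printf 'Mar  1 boot\nMar 13 login\n' | match_any_space
```

The first call should report only the "Mar 13" line, while the second reports both.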

Regards!

...JRF...

Hein van den Heuvel · ‎03-13-2006

This 'one liner' works for me under a Windows CMD shell.
It does not look just for "Mar" but defines a date as a line starting with an uppercase letter, two lowercase letters, some whitespace and some decimal digits:

# type tmp.txt
Mar 1 aap
Mar 1 noot
Mar 1 mies
Mar 2 teun
Mar 2 vuur
Apr 2
Apr 2
Apr 3
Apr 4

# perl -pe "if (/^([A-Z][a-z][a-z]\s+\d+)/ && $1 ne $last) { $last=$1; $x='<h1>'.$1.'</h1>'; s/$1/$x/}" tmp.txt

<h1>Mar 1</h1> aap
Mar 1 noot
Mar 1 mies
<h1>Mar 2</h1> teun
Mar 2 vuur
<h1>Apr 2</h1>
Apr 2
<h1>Apr 3</h1>
<h1>Apr 4</h1>

You'll have to play with the quotes to make it work for hpux

Or stick it in a script:

--------- tmp.pl --------------
if (/^([A-Z][a-z][a-z]\s+\d+)/ && $1 ne $last)
{
    $last=$1;
    $x='<h1>'.$1.'</h1>';
    s/$1/$x/;
}

----------------
perl -p tmp.pl tmp.txt
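
If the quoting is a nuisance, the same idea can be sketched in awk wrapped in a shell function. This is only an approximation of the logic above, assuming a POSIX awk; the function name `tag_first_of_each_date` is made up, and the date pattern uses plain spaces instead of `\s`:

```shell
# Illustrative helper, not from the original post: wrap the leading date in
# <h1>...</h1> on the first line of each run of identical dates.
tag_first_of_each_date() {
    awk '
    # A date here is one uppercase letter, two lowercase letters,
    # one or more spaces, then digits, anchored at the start of the line.
    match($0, /^[A-Z][a-z][a-z] +[0-9]+/) {
        date = substr($0, RSTART, RLENGTH)
        if (date != last) {
            last = date
            $0 = "<h1>" date "</h1>" substr($0, RLENGTH + 1)
        }
    }
    { print }
    '
}

printf 'Mar 1 aap\nMar 1 noot\nMar 2 teun\nApr 2\nApr 2\n' | tag_first_of_each_date
```

Like the Perl version, it compares each date only with the one on the previous dated line, so the input is assumed to be grouped by date already.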

Enjoy...

Hein.

James R. Ferguson · ‎03-13-2006

Hi Kevin:

Maybe I finally understand your requirement. Try this with your data.

# cat perl.pl
#!/usr/bin/perl
use strict;
use warnings;
my $month = "Mar";
my ($curr, $prev) = (".", ".");
while (<>) {
    chomp;
    $curr = $2 if $_ =~ m%(.*)(\b$month\s+\d+\b)(.*)%;
    unless ($curr eq $prev) {
        s%(.*)(\b$month\s+\d+\b)(.*)%<h1>$1$2$3</h1>%;
        $prev = $2;
    }
    print "$_\n";
}

...run as:

# ./perl.pl file

Regards!

...JRF...

James R. Ferguson · ‎03-13-2006

Hi Kevin:

Here's a more robust variation for your use.

# cat perl.pl
#!/usr/bin/perl
use strict;
use warnings;
my $mon = shift;
my $file = shift or die "Usage: $0 monthname file\n";
open(FH, "<", "$file") or die "Can't open $file: $!\n";
my $patt = qr"(.*)(\b(?i)$mon\s+\d+\b)(.*)";
my ($curr, $prev) = (".", ".");
while (<FH>) {
    chomp;
    $curr = $2 if m%$patt%;
    s%$patt%<h1>$1$2$3</h1>% unless ($curr eq $prev);
    $prev = $2 if defined($2);
    print "$_\n";
}
1;

Run as:

# ./perl.pl monthname file

...for example:

# ./perl.pl mar mylog

Note that the abbreviated month name will be matched case-insensitively.
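
For comparison, here is roughly how the case-insensitive match could look in awk, for anyone without Perl to hand. It is a sketch under simplifying assumptions: the month is taken to be the first field and the day the second, unlike the Perl pattern, which matches anywhere in the line. The function name `tag_month` is hypothetical.

```shell
# Illustrative helper, not from the original post.
# Usage: tag_month MONTHNAME   (reads the log on standard input)
tag_month() {
    awk -v mon="$1" '
    BEGIN { mon = tolower(mon) }
    {
        # Lower-case the first field before comparing, so that
        # "MAR", "Mar" and "mar" are all treated as the same month.
        if (tolower($1) == mon && $2 ~ /^[0-9]+$/) {
            date = mon " " $2
            if (date != last) {
                last = date
                $0 = "<h1>" $0 "</h1>"
            }
        }
        print
    }'
}

printf 'MAR 1 a\nmar 1 b\nMar 2 c\nApr 2 d\n' | tag_month mar
```

As in the script above, the whole line is wrapped on the first occurrence of each date.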

Regards!

...JRF...

Sandman! · ‎03-13-2006

Hi Kevin,

I don't know if it'll help at this stage since you may be pursuing a different solution altogether but this awk construct does what you are looking for i.e. encapsulate the header date in the input file.

=================myawk.cmd===================
{
if (prev!=$1$2) {
printf("%s %s",b$1,($2<10?" "$2e:$2e))
for (i=3;i<=NF;++i)
printf(" %s",$i)
printf("\n")
prev=$1$2
} else
print $0
}
=============================================

Create a file named "myawk.cmd" with the commands above and execute as:

# awk -f myawk.cmd b="<h1>" e="</h1>" infile

where "b" and "e" are variables containing the characters you want to surround the header date with.
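
One detail worth knowing about this style of assignment: variables given as operands are set only when awk reaches them in the argument list, so they are not yet available in a BEGIN block, whereas `-v` assigns before BEGIN runs. A tiny sketch to show the difference, assuming a POSIX awk; the function name `show_begin_visibility` is made up, and the `-` operand simply means standard input:

```shell
# Illustrative helper, not from the original post.
show_begin_visibility() {
    # b is set with -v, so BEGIN can see it; e is an operand assignment,
    # so it is still empty in BEGIN but set before the input is read.
    awk -v b="<h1>" 'BEGIN { printf "%s|%s\n", b, e } { print b $0 e }' e="</h1>" -
}

printf 'Mar 1\n' | show_begin_visibility
```

For the script above this makes no difference, since "b" and "e" are only used while reading lines.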

cheers!
