OFC_EDM
Respected Contributor

Matching first line with a specific string

Mar 1
Mar 1
Mar 2
Mar 2

Above is a simplified version of the text I'm processing.

What I want to do is surround the month and day with the tags "<h1>" and "</h1>", but only for the first occurrence.

Below is the command I've tried, but it's putting the tags around ALL occurrences of "Mar 1".

How do I make it match just the first occurrence?

cat test|sed -e '1,$s/Mar 1/\<h1\>&\<\/h1\> /'

========== START OUTPUT======
<h1>Mar 1</h1>
<h1>Mar 1</h1>
Mar 2
Mar 2
========== END OUTPUT======
The Devil is in the detail.
James R. Ferguson
Acclaimed Contributor

Hi Kevin:

# sed -e '1,1s/Mar 1/\<h1\>&\<\/h1\> /' file

Note that you don't have to spawn a separate process to first read the file (i.e. the 'cat' piped to 'sed').

Regards!

...JRF...
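
A note on the "1,1" address: it limits the substitution to line 1 of the file, so it only helps when the first match happens to be on the first line. If the goal is the first matching line wherever it falls, one possible approach is a small awk flag. This is only a sketch: the function name tag_first_match and the sample input are made up for illustration.

```shell
# Hypothetical helper: tag only the first input line that contains "Mar 1".
tag_first_match() {
    awk '
        # sub() returns 1 when it replaced something; "&" stands for the matched text.
        !done && sub(/Mar 1/, "<h1>&</h1>") { done = 1 }
        { print }
    ' "$@"
}

printf 'Feb 28\nMar 1\nMar 1\nMar 2\n' | tag_first_match
```

Once the flag is set, sub() is no longer called, so later "Mar 1" lines pass through unchanged.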

Carlos Roberto Schimidt
Regular Advisor

Don't use "$"; see below:

$ cat test | sed -e '1s/Mar 1/\<h1\>&\<\/h1\> /'
<h1>Mar 1</h1>
Mar 1
Mar 2
Mar 2
Victor Fridyev
Honored Contributor

Hi,

For the exact input as in your msg, I'd try the following:

awk 'BEGIN {flag=""}
{if(flag!=$0){print "<h1>",$0,"</h1>";flag=$0}
else print $0}' inputfile

HTH

Entities are not to be multiplied beyond necessity - RTFM
Sandman!
Honored Contributor

Kevin,

How about sorting the input first and then piping the result to awk, which remembers the month and day of the previous line and tags a line only when they change?

# sort -k1,2 inp | awk '{if(prev!=$1$2){print b$1,$2e;prev=$1$2}else print $0}' b="<h1>" e="</h1>"

cheers!
James R. Ferguson
Acclaimed Contributor
Solution

Hi (again) Kevin:

It occurs to me that I took your post a bit too literally.

Should you want only the first matching line of *any* block (for example, to tag the first "Mar 2"):

Mar 1
Mar 1
<h1>Mar 2</h1>
Mar 2
Mar 3

...then:

# sed -e 's/Mar 2/\<h1\>&\<\/h1\>/;n' file

This works for the original case too, but is appropriately general to the span of the entire file.

Regards!

...JRF...

OFC_EDM
Respected Contributor

I wasn't clear enough.

Basically I'm putting in Headers for each Day of a month from a log.

So using my previous example the output would be:

<h1>Mar 1</h1>
Mar 1
<h1>Mar 2</h1>
Mar 2
etc...for the rest of the days.

So really my pattern to match is
"Mar [ 0-9][0-9]" which would match Mar 1 to Mar 31.

Preceding the substitution with 1 (1s/Mar 1/) only applies it to the first line of input, not to the first occurrence, so that's not doing what I need.

Any more thoughts?
The Devil is in the detail.
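
For the clarified requirement (a header on the first line of each new day), one option is to compare the date at the start of each line with the date seen on the previous line. Below is a minimal awk sketch, assuming the log uses syslog-style dates where single-digit days are padded with a space (which is what the pattern "Mar [ 0-9][0-9]" implies). The function name tag_first_of_each_day and the sample lines are hypothetical.

```shell
tag_first_of_each_day() {
    awk '
        # Only lines that start with a date such as "Mar  1" or "Mar 13" are candidates.
        match($0, /^Mar [ 0-9][0-9]/) {
            day = substr($0, RSTART, RLENGTH)
            if (day != prev) {
                # First line of a new day: wrap the date, keep the rest of the line.
                $0 = "<h1>" day "</h1>" substr($0, RSTART + RLENGTH)
                prev = day
            }
        }
        { print }
    ' "$@"
}

printf 'Mar  1 boot\nMar  1 login\nMar  2 cron\nMar 13 sshd\nMar 13 su\n' | tag_first_of_each_day
```

Because the previous date is kept in a variable, it does not matter how many lines each day has, and the text after the date is left untouched.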
OFC_EDM
Respected Contributor

Thanks James. I was surprised to find your answer posted prior to my attempted clarification.
The Devil is in the detail.
James R. Ferguson
Acclaimed Contributor

Hi (again) Kevin:

Well, the 'sed' solution I posted worked for the particular repetition of lines you posted, but it *fails* for more generalized matching. I should have gone straight for Perl.

Consider a file that looks like:

Mar 1
Mar 1
Mar 1
Mar 2
Mar 2
Mar 2
Mar 2
Mar 3
Mar 4
the end at Mar 4 the end!

With the 'sed' solution:

# sed -e 's/Mar [0-9[0-9]/\<h1\>&\<\/h1\>/;n' file

...you get:

<h1>Mar 1</h1>
Mar 1
<h1>Mar 1</h1>
Mar 2
<h1>Mar 2</h1>
Mar 2
<h1>Mar 2</h1>
Mar 3
<h1>Mar 4</h1>
the end at Mar 4 the end!

...which is not much use.

*Instead* use:

# perl -ple chomp;unless ($prev=~$_) {s%(.*)(Mar \d+)(.*)%$1<h1>$2</h1>$3%};$prev=$_' file

...and you can have:

<h1>Mar 1</h1>
Mar 1
Mar 1
<h1>Mar 2</h1>
Mar 2
Mar 2
Mar 2
<h1>Mar 3</h1>
<h1>Mar 4</h1>
the end at <h1>Mar 4</h1> the end!

Regards!

...JRF...
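
The reason the sed version misfires is that "n" prints the current line and loads the next one, and the script then ends without running "s" on that new line. In effect the substitution is applied to every other line, whatever the dates are. A short demonstration, assuming single-space dates as in the sample file in this post (the function name tag_alternate_lines is made up):

```shell
tag_alternate_lines() {
    # s runs on the line that starts each cycle; n then prints it and pulls in
    # the following line, which is printed untouched when the cycle ends.
    sed -e 's/Mar [0-9][0-9]*/<h1>&<\/h1>/;n' "$@"
}

printf 'Mar 1\nMar 1\nMar 1\nMar 2\n' | tag_alternate_lines
```

Lines 1 and 3 come out tagged and lines 2 and 4 do not, even though "Mar 2" on line 4 is the first line of a new day.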
OFC_EDM
Respected Contributor

Hi JRF

I get a `(' unexpected error when running that line of perl.

Is there a ( missing?

The Devil is in the detail.
OFC_EDM
Respected Contributor

JRF

I have text after the "Mar 1" or "Mar 22" etc.

When I run the perl against it, the <h1> and </h1> are put around the date, but for EVERY line. I just want the header for the first occurrence of each date.

I'm stumped. Your previous sed example worked on my test data but not when I plug it into my script.

I'll just keep plugging away until this works.

Example:

<h1>Mar 13</h1> 06:53:28 hpkju gconfd (root-28106): starting (version 1.0.9), pid 28106 user 'root'
<h1>Mar 13</h1> 06:53:27 hpkju gnome-name-server[28098]: starting

The Devil is in the detail.
James R. Ferguson
Acclaimed Contributor

Hi Kevin:

It looks like the paste failed. Try this:

#!/usr/bin/perl
while (<>) {
    chomp;
    unless ($prev =~ $_) {
        s%(Mar \d+)%<h1>$1</h1>%;
    }
    $prev = $_;
    print "$_\n";
}

Regards!

...JRF...
James R. Ferguson
Acclaimed Contributor

Hi (again) Kevin:

# perl -ple 'chomp;unless ($prev=~$_) {s%(.*)(Mar\s+\d+)(.*)%$1<h1>$2</h1>$3%};$prev=$_' file

...the first single quote was simply dropped.

I have added, both for clarity and for better matching, one or more whitespace characters (\s+) after the "Mar" in the regular expression. You could change the prior post too.

Regards!

...JRF...
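
To see why matching one or more whitespace characters helps: logs that pad single-digit days (for example "Mar  1", with two spaces) are missed by a pattern that expects exactly one space. The sketch below counts matching lines with grep -E, using "[[:space:]]+" as the POSIX counterpart of Perl's "\s+". The function name count_date_lines and the sample lines are illustrative only.

```shell
count_date_lines() {
    # $1 is an extended regular expression; lines are read from stdin.
    grep -cE -- "$1"
}

sample='Mar  1 boot
Mar 13 login'

# Exactly one space after the month: the padded "Mar  1" line is missed.
printf '%s\n' "$sample" | count_date_lines 'Mar [0-9]+'
# One or more whitespace characters: both lines match.
printf '%s\n' "$sample" | count_date_lines 'Mar[[:space:]]+[0-9]+'
```

The first command prints 1 and the second prints 2.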
Hein van den Heuvel
Honored Contributor

This 'one liner' works for me under a Windows CMD shell.
It does not look just for "Mar" but defines a date as a line starting with an uppercase letter, two lowercase letters, some whitespace and some decimal digits:

# type tmp.txt
Mar 1 aap
Mar 1 noot
Mar 1 mies
Mar 2 teun
Mar 2 vuur
Apr 2
Apr 2
Apr 3
Apr 4

# perl -pe "if (/^([A-Z][a-z][a-z]\s+\d+)/ && $1 ne $last) { $last=$1; $x='<h1>'.$1.'</h1>'; s/$1/$x/}" tmp.txt

<h1>Mar 1</h1> aap
Mar 1 noot
Mar 1 mies
<h1>Mar 2</h1> teun
Mar 2 vuur
<h1>Apr 2</h1>
Apr 2
<h1>Apr 3</h1>
<h1>Apr 4</h1>

You'll have to play with the quotes to make it work for HP-UX.

Or stick it in a script:

--------- tmp.pl --------------
if (/^([A-Z][a-z][a-z]\s+\d+)/ && $1 ne $last)
{
    $last=$1;
    $x='<h1>'.$1.'</h1>';
    s/$1/$x/;
}
----------------
perl -p tmp.pl tmp.txt


Enjoy...

Hein.

James R. Ferguson
Acclaimed Contributor

Hi Kevin:

Maybe I finally understand your requirement. Try this with your data.

# cat perl.pl
#!/usr/bin/perl
use strict;
use warnings;
my $month = "Mar";
my ($curr, $prev) = (".", ".");
while (<>) {
    chomp;
    $curr = $2 if $_ =~ m%(.*)(\b$month\s+\d+\b)(.*)%;
    unless ($curr eq $prev) {
        s%(.*)(\b$month\s+\d+\b)(.*)%<h1>$1$2$3</h1>%;
        $prev = $2;
    }
    print "$_\n";
}

...run as:

# ./perl.pl file

Regards!

...JRF...
James R. Ferguson
Acclaimed Contributor

Hi Kevin:

Here's a more robust variation for your use.

# cat perl.pl
#!/usr/bin/perl
use strict;
use warnings;
my $mon = shift;
my $file = shift or die "Usage: $0 monthname file\n";
open(FH, "<", "$file") or die "Can't open $file: $!\n";
# (?i) makes the month name match case-insensitively.
my $patt = qr"(.*)(\b(?i)$mon\s+\d+\b)(.*)";
my ($curr, $prev) = (".", ".");
while (<FH>) {
    chomp;
    $curr = $2 if m%$patt%;
    # Tag the line only when its date differs from the last tagged date.
    s%$patt%<h1>$1$2$3</h1>% unless ($curr eq $prev);
    $prev = $2 if defined($2);
    print "$_\n";
}
1;

Run as:

# ./perl.pl monthname file

...for example:

# ./perl.pl mar mylog

Note that the abbreviated month name will be matched case-insensitively.

Regards!

...JRF...
Sandman!
Honored Contributor

Hi Kevin,

I don't know if it'll help at this stage since you may be pursuing a different solution altogether but this awk construct does what you are looking for i.e. encapsulate the header date in the input file.

=================myawk.cmd===================
{
    if (prev!=$1$2) {
        printf("%s %s",b$1,($2<10?" "$2e:$2e))
        for (i=3;i<=NF;++i)
            printf(" %s",$i)
        printf("\n")
        prev=$1$2
    } else
        print $0
}
=============================================

Create a file named "myawk.cmd" with the commands above and execute as:

# awk -f myawk.cmd b="<h1>" e="</h1>" infile

where "b" and "e" are variables containing the characters you want to surround the header date with.

cheers!
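
On the b=... and e=... arguments: awk treats an operand of the form name=value as a variable assignment that takes effect when awk reaches it in the argument list, before the next file is read (and after any BEGIN block has run). A small sketch of the mechanism, with a made-up function name wrap_lines and "-" standing for standard input:

```shell
wrap_lines() {
    # b and e are assigned from the operands before awk starts reading "-" (stdin).
    awk '{ print b $0 e }' b="<h1>" e="</h1>" -
}

printf 'Mar 1\nMar 2\n' | wrap_lines
```

If the values are needed inside a BEGIN block, the -v option is the usual alternative, since -v assignments are made before BEGIN runs.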