Belinda Dermody
Super Advisor

Awk or Perl advice

I have a daily message file that is about 800 MB in size, and I occasionally have to search it for certain users. The three lines I am looking for are HeaderFrom, HeaderTo and HeaderSubject; they always appear in that sequence. When HeaderTo='jmarrion@bcharrispub.com', I need the previous line (HeaderFrom) and the next line (HeaderSubject). I could make it easier by sorting the file in reverse, but that takes a long time at this size, and I usually have to search three weeks of files. So what I need is some type of logic that, when HeaderTo matches, gets the previous line and the next line.
Pete Randall
Outstanding Contributor

Re: Awk or Perl advice

James,

From "Handy One-Liners for Sed" (attached):

# print 1 line of context before and after regexp, with line number
# indicating where the regexp occurred (similar to "grep -A1 -B1")
sed -n -e '/regexp/{=;x;1!p;g;$!N;p;D;}' -e h


Pete

Graham Cameron_1
Honored Contributor

Re: Awk or Perl advice

You'll have to be careful with the ' and @ characters in your search string.

The following will do it:

awk '
/HeaderTo/ && /jmarrion@bcharrispub.com/ {
print lastline
print
getline
print
}
{lastline=$0}
' yourfile

-- Graham

Computers make it easier to do a lot of things, but most of the things they make it easier to do don't need to be done.
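
A minimal sketch of one way around the ' and @ concern, assuming a POSIX awk: pass the address in with -v and test it with index(), which is a plain substring comparison, so none of the characters in the address are treated as regex operators and none of them appear inside the awk program text. The function name find_msgs is made up for illustration and is not from any post in this thread.

```shell
#!/bin/sh
# find_msgs ADDRESS [FILE...]   (hypothetical helper name)
# Prints the line before a matching HeaderTo line, the line itself,
# the line after it, and then a blank separator line.
find_msgs() {
    addr=$1
    shift
    awk -v addr="$addr" '
        # index() does a literal substring test, so "." and "@" need no escaping
        /HeaderTo/ && index($0, addr) > 0 {
            print lastline              # the remembered previous line
            print                       # the HeaderTo line itself
            if ((getline) > 0) print    # the following line, if there is one
            print ""
        }
        { lastline = $0 }
    ' "$@"
}
```

It would be called as find_msgs 'jmarrion@bcharrispub.com' yourfile, or with no file name to read standard input. Because it uses getline like the awk above, a match on two consecutive lines reports only the first one.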
A. Clay Stephenson
Acclaimed Contributor

Re: Awk or Perl advice

Here's an intentionally verbose Perl method in case you need to add more logic:

#!/usr/bin/perl -w
use strict;
use constant TRUE => 1;
use constant FALSE => 0;

# $s is the current line, $prev the line before it
my ($s,$prev,$cont) = ('','',TRUE);
while (defined($s = <>) && $cont)
{
    if ($s =~ "^HeaderTo=jmarrion\@bcharrispub.com")
    {
        print $prev,$s;
        # read and print the line that follows the match
        if (defined($s = <>))
        {
            print $s;
        }
        $cont = FALSE;   # stop after the first match
    }
    else
    {
        $prev = $s;
    }
} # while
exit(0);

scan.pl < myfile
If it ain't broke, I can fix that.
Belinda Dermody
Super Advisor

Re: Awk or Perl advice


Thanks Graham, I am working with yours right now, but I have a question. I just looked at the log, and after HeaderTo= there could be an 'address' or '' or 'Full Name' 'address'. I tried your awk and it didn't retrieve anything, although grep returned about 20 HeaderTo results.
Graham Cameron_1
Honored Contributor

Re: Awk or Perl advice

James

The line

/HeaderTo/ && /jmarrion@bcharrispub.com/ {

will execute whenever it finds BOTH strings on a line.

Change the strings within the // delimiters to match what you need.
Or add more.
etc.

-- Graham
Computers make it easier to do a lot of things, but most of the things they make it easier to do don't need to be done.
H.Merijn Brand (procura
Honored Contributor
Solution

Re: Awk or Perl advice

Here's an intentionally short perl one-liner :)

lt09:/tmp 114 > cat file
a
b
c
d HeaderTo=jmarrion@bcharrispub.com
e
f
g HeaderTo=jmarrion@bcharrispub.com
h HeaderTo='jmarrion@bcharrispub.com'
i
j
k
l HeaderTo=jmarrion@bcharrispub.com
lt09:/tmp 115 > perl -ne'BEGIN{sub p{@x>1&&$x[-2]=~m{HeaderTo=[\x27]*jmarrion\@bcharrispub.com}and print@x}}END{push@x,"";p}push@x,$_;p;@x==3&&shift@x' file
c
d HeaderTo=jmarrion@bcharrispub.com
e
f
g HeaderTo=jmarrion@bcharrispub.com
h HeaderTo='jmarrion@bcharrispub.com'
g HeaderTo=jmarrion@bcharrispub.com
h HeaderTo='jmarrion@bcharrispub.com'
i
k
l HeaderTo=jmarrion@bcharrispub.com
lt09:/tmp 116 >

This prints the surrounding lines even if the pattern occurs in two consecutive lines, with or without quotes.

Enjoy, Have FUN! H.Merijn
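
For anyone who would rather stay in awk, here is a rough sketch of the same three-line sliding-window idea, assuming a POSIX awk with user-defined functions. It is not a translation of the one-liner above: the names context3 and hit are invented for this example, the address is compared literally with index(), and a blank line is printed after each group.

```shell
#!/bin/sh
# context3 ADDRESS [FILE...]   (hypothetical helper name)
# Keeps the two most recent lines; when the middle line of the
# prev/cur/current window matches, the whole window is printed.
context3() {
    addr=$1
    shift
    awk -v addr="$addr" '
        function hit(s) { return (s ~ /HeaderTo=/) && (index(s, addr) > 0) }
        {
            # cur holds the previous input line, prev the one before that
            if (NR > 1 && hit(cur)) {
                if (NR > 2) print prev
                print cur
                print $0
                print ""
            }
            prev = cur
            cur = $0
        }
        END {
            # the last line has no successor, so it is checked here
            if (NR > 0 && hit(cur)) {
                if (NR > 1) print prev
                print cur
                print ""
            }
        }
    ' "$@"
}
```

Since the window moves one line at a time and nothing is consumed with getline, a match on two consecutive lines produces two overlapping groups, as in the output above.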

Hein van den Heuvel
Honored Contributor

Re: Awk or Perl advice

Another awk 'one-liner'. It only remembers the last From line, not all lines. It also tolerates extra lines between the targeted lines, and may prevent false positives by anchoring at the start of the line.
Minimally tested.

awk '/^HeaderFrom/{from=$0}
/^HeaderTo.*jmarrion@bcharrispub.com/{print from "\n" $0;subj=1;from=""}
(subj)&&/^HeaderSubj/{print $0 "\n";subj=0}' < message_file

Hein.
Kent Ostby
Honored Contributor

Re: Awk or Perl advice

Sorry to post this late. The forums were being weird yesterday when I tried to post.

For my solution, you would create three files.

Your list of names that you are looking for would go into a file called targets with this format:

TARG yourname@yourcompany.com

You would also create a file called hfht.awk:

BEGIN{count=0;}
/^TARG / {count++;searchit[count]=$2;next}
count==0 {print "Error. No Targets specified";exit}
/HeaderFrom/ {saveline=$0;next;}
/HeaderTo/ {for (idx1 in searchit)
{if (index($0,searchit[idx1])>0) {hit=1; print saveline; print $0;} } }
/HeaderSubject/{if (hit==1) {print $0;print "----";hit=0;next}}

You would also create an executable file called hfht.sh:

cp targets hfht.use
cat searchfile >> hfht.use
awk -f hfht.awk < hfht.use


This will give you output like:

HeaderFrom bob
HeaderTo kmo@atl.hp.com
HeaderSubject Test
----
HeaderFrom bob2
HeaderTo kmo@atl.hp.com
HeaderSubject Test 2


Best regards,
Kent M. Ostby
"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"
Belinda Dermody
Super Advisor

Re: Awk or Perl advice

Thanks for all the replies; getting back to this site has been rough for the past 20 hours. Procura's was the only one that gave me the results I wanted. All the others either had some kind of problem or hit the good old awk bail-out errors (yes, I checked the coding very closely). But I see I have one more to test out. I also had to figure out how to do it without unzipping the files: a combination of shell and perl.
Belinda Dermody
Super Advisor

Re: Awk or Perl advice

Procura, could you show me where I could put a print "\n" command in the string so that I get a blank line after each group of 3?

Thanks in advance.
H.Merijn Brand (procura
Honored Contributor

Re: Awk or Perl advice

# perl -ne'BEGIN{sub p{@x>1&&$x[-2]=~m{HeaderTo=[\x27]*jmarrion\@bcharrispub.com}and print@x,"\n"}}END{push@x,"";p}push@x,$_;p;@x==3&&shift@x' file

Enjoy, Have FUN! H.Merijn
Belinda Dermody
Super Advisor

Re: Awk or Perl advice

Thanks, I was putting it there, but I was using a ; instead. That's my limited knowledge of perl. Thanks, it is working perfectly.
Belinda Dermody
Super Advisor

Re: Awk or Perl advice

Thanks Kent, it looks good, but the first problem I came across was that you wanted me to cat the search file. These are message files that average 700 MB each, and I usually have to search three weeks of them, so the cat would duplicate each file, take up extra space and add time. It takes about 12-16 minutes per log now with the perl, because I have to zcat the file and pipe it in.
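
One possible way to avoid the copy, sketched under a few assumptions: the logs are gzip-compressed, gzip -dc is available, and the awk in use accepts - as a file operand meaning standard input (POSIX awk does). The targets file is then read first as an ordinary file and the decompressed log is streamed in after it, so nothing is written to disk. The function name scan_logs is invented here, and the awk program is a condensed variant of the hfht.awk logic rather than the original file.

```shell
#!/bin/sh
# scan_logs TARGETS_FILE LOG.gz...   (hypothetical helper name)
# TARGETS_FILE uses the "TARG address" format described earlier.
scan_logs() {
    targets=$1
    shift
    for log in "$@"; do
        # gzip -dc streams the decompressed log to stdout;
        # awk reads the targets file first, then "-" (standard input)
        gzip -dc "$log" | awk '
            /^TARG /     { want[++n] = $2; next }   # addresses to look for
            /HeaderFrom/ { from = $0; next }        # remember the latest From line
            /HeaderTo/   {
                hit = 0
                for (i = 1; i <= n; i++)
                    if (index($0, want[i]) > 0) hit = 1
                if (hit) { print from; print }
                next
            }
            /HeaderSubject/ && hit { print; print "----"; hit = 0 }
        ' "$targets" -
    done
}
```

A call would look like scan_logs targets messages.1.gz messages.2.gz, where the log names are placeholders. The decompression time is still paid for every file, so this saves the extra disk space and the copy, not the zcat cost.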