Operating System - HP-UX

Awk Experts need assistance.

Belinda Dermody
Super Advisor

After reading and testing for a couple of days, figure I need to call in the experts again.

I have this monthly file of 3 million+ lines, and I need to break it down into two files, let's say good.file and bad.file.

The data looks like this:
request:
response: 1wwwww@lafn.org

request:
response: 2 xxxxxxxxx@aya.yale.edu not
found

request:
response: 1 zzzzzz@dol.net

Every entry has two lines: the first line requests a forwarding address, and the second is the reply from the database, either a forwarding address (response: 1) or a not-found record (response: 2). I am writing a monthly report of totals, but I need to separate the two types to do further information gathering.

So I would like to split the main file into two files, each holding the request record together with its matching response record: one file for request and response 1 records, and the other for request and response 2 records.
8 REPLIES
Tim D Fulford
Honored Contributor

Re: Awk Experts need assistance.

Perl would be more efficient at this, as it could do it in one pass...

awk '/response/ && $2==2 {print $3}' infile > bad

awk '/response/ && $2==1 {print $3}' infile > good

or, if you just want totals:

awk 'BEGIN{good=0; bad=0}; /response/ && $2==1 {good=good+1}; /response/ && $2==2 {bad=bad+1}; END {print "good", good, "bad", bad}' infile

My Perl is not good enough to just write a script without testing it first.
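
For what it's worth, the totals can probably also be had from awk in a single pass. Below is a rough sketch, not a tested production script. It assumes the digit right after "response:" is the status code, with or without a space after it (the first sample record reads "1wwwww@lafn.org"), so it matches on the line text rather than on $2. The temporary file and the use of mktemp are only there to make the example self-contained.

```shell
#!/bin/sh
# Hypothetical sample input, copied from the data shown in the question.
input=$(mktemp)
cat > "$input" <<'EOF'
request:
response: 1wwwww@lafn.org

request:
response: 2 xxxxxxxxx@aya.yale.edu not
found

request:
response: 1 zzzzzz@dol.net
EOF

# Count both response types in one pass over the file.
# "good + 0" forces a numeric 0 if a type never appears.
totals=$(awk '
/^response: *1/ { good++ }
/^response: *2/ { bad++ }
END { print "good", good + 0, "bad", bad + 0 }
' "$input")

echo "$totals"
rm -f "$input"
```

On the three sample records this should report two good entries and one bad one.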

Tim
-
James R. Ferguson
Acclaimed Contributor

Re: Awk Experts need assistance.

Hi:

Try this:

#!/usr/bin/sh
awk '/request/ {X=$0}
{if (/response: 1/) {print X"\n"$0 >> "/tmp/type1"}}
{if (/response: 2/) {print X"\n"$0 >> "/tmp/type2"}}
' myinput
exit 0

Regards!

...JRF...
James R. Ferguson
Acclaimed Contributor
Solution

Re: Awk Experts need assistance.

Hi (again) James:

For consistency of style, I would amend my first post to this:

#!/usr/bin/sh
awk '/request/ {X=$0}
/response: 1/ {print X"\n"$0 >> "/tmp/type1"}
/response: 2/ {print X"\n"$0 >> "/tmp/type2"}
' myinput
exit 0

Regards!

...JRF...
Belinda Dermody
Super Advisor

Re: Awk Experts need assistance.

Thanks Tim. My Perl is really basic, so I wouldn't know how to put the other parts in. I can get it to work using while read line, but it takes many hours to process the 3+ million lines.

Thanks James, (got to be in the name). Your first post had me with the good old awk errors of bail out and syntax errors. But your second post worked like a charm...
Leslie Chaim
Regular Advisor

Re: Awk Experts need assistance.

James,

Here is a quick way to do it in Perl, but I am using shell to split the lanes.

perl -n00e '
if ( /response: 1/ )
{
print STDOUT;
}
elsif ( /response: 2/ )
{
print STDERR
}
' myinut 1> good 2> bad

This reads your file in "paragraph mode" (each blank-line-separated block is one record) and prints each record to the desired filehandle.
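
If you would rather stay in awk, the same paragraph-mode idea can be sketched there too: setting RS to the empty string makes awk treat each blank-line-separated block as one record. This is only an illustration under assumptions. The output names good and bad follow the Perl example above, while the temporary directory, the mktemp call, and the sample file are made up to keep the example self-contained.

```shell
#!/bin/sh
# Hypothetical working directory and sample input (data from the question).
workdir=$(mktemp -d)
cat > "$workdir/myinput" <<'EOF'
request:
response: 1wwwww@lafn.org

request:
response: 2 xxxxxxxxx@aya.yale.edu not
found

request:
response: 1 zzzzzz@dol.net
EOF

# RS = "" switches awk to paragraph mode; ORS puts a blank line
# back after every record written out.
awk -v good="$workdir/good" -v bad="$workdir/bad" '
BEGIN { RS = ""; ORS = "\n\n" }
/response: 1/ { print > good; next }
/response: 2/ { print > bad }
' "$workdir/myinput"

# Count the records that landed in each file.
good_count=$(grep -c '^request:' "$workdir/good")
bad_count=$(grep -c '^request:' "$workdir/bad")
wrapped=$(grep -c '^found$' "$workdir/bad")
echo "good records: $good_count, bad records: $bad_count"
rm -rf "$workdir"
```

Because the whole block is one record, a wrapped "found" line stays attached to its response: 2 entry.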
If life serves you lemons, make lemonade
Belinda Dermody
Super Advisor

Re: Awk Experts need assistance.

Leslie, pretty neat, I didn't know you could do that within sh scripting. But I ran into a problem: it put all the data in the good file and none went into the bad file (response: 2). I noticed that the ; was missing after STDERR, but adding it didn't make a difference. Is white space a big thing?
Judy Traynor
Valued Contributor

Re: Awk Experts need assistance.

Here is a sed script I got from HP:
/^[^ ]*$/{
N
s/\n */ /
}

Save this and run sed against your file:

cat file | sed -f scriptname > newfilename

This will take your original file and join each two-line entry into one line.

Then you should be able to use grep to get what you want
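
As a rough sketch of the join-then-grep approach on the sample data: note that this is an adaptation, not the HP script itself. It keys on lines starting with "request:" instead of lines without spaces, since an empty separator line also matches ^[^ ]*$. The file names and the temporary directory are hypothetical.

```shell
#!/bin/sh
# Hypothetical working directory and sample input (data from the question).
workdir=$(mktemp -d)
cat > "$workdir/myinput" <<'EOF'
request:
response: 1wwwww@lafn.org

request:
response: 2 xxxxxxxxx@aya.yale.edu not
found

request:
response: 1 zzzzzz@dol.net
EOF

# Join each "request:" line with the response line that follows it.
sed -e '/^request:/{' -e 'N' -e 's/\n */ /' -e '}' \
    "$workdir/myinput" > "$workdir/joined"

# One grep per response type.
grep 'response: 1' "$workdir/joined" > "$workdir/good"
grep 'response: 2' "$workdir/joined" > "$workdir/bad"

good_lines=$(grep -c '' "$workdir/good")
bad_lines=$(grep -c '' "$workdir/bad")
good_first=$(head -n 1 "$workdir/good")
echo "good lines: $good_lines, bad lines: $bad_lines"
rm -rf "$workdir"
```

One caveat: a wrapped continuation line such as "found" is not joined, so grep leaves it out of the bad file.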

good luck
Sail With the Wind
Leslie Chaim
Regular Advisor

Re: Awk Experts need assistance.

James,

I did a cut 'n' paste from your original post, and ran it:
----------------------------------------------------
$ cat myinut
request:
response: 1wwwww@lafn.org

request:
response: 2 xxxxxxxxx@aya.yale.edu not
found

request:
response: 1 zzzzzz@dol.net

$ head -1000 good bad
good: No such file or directory
bad: No such file or directory
$ perl -n00e '
> if ( /response: 1/ )
> {
> print STDOUT;
> }
> elsif ( /response: 2/ )
> {
> print STDERR
> }
> ' myinut 1> good 2> bad
$ head -1000 good bad
==> good <==
request:
response: 1wwwww@lafn.org

request:
response: 1 zzzzzz@dol.net


==> bad <==
request:
response: 2 xxxxxxxxx@aya.yale.edu not
found

$
----------------------------------------------------

Perl doesn't require a semicolon if it's the last statement of a BLOCK.
If life serves you lemons, make lemonade