
SOLVED
vas  bolpali
Advisor

shell script

we have a data file as follows, and bad records start with #@#

$cat data.txt
ABCD 2044440A0023C637
ABCD 2044440B0023C638
#@# 23C63A - 3rd record is bad
ABCD 63744440X023C63B
ABCD 22044440Y023C63C
ABCD 2044740B0023C638
#@# 23C63A - 7th record is bad
ABCD 2033440A0023C637

->| |<-

I need to write a script to log the info in a file (error.log):

1. It should tell/display the position of each bad record (the ones starting with #@#).

2. I also need the records before and after each bad record.

3. I need only characters 8 to 12 of those before and after records (as shown by the arrows).

Basically, I have to generate the error log for the above data file as follows.

$cat error.log
position before after
3 2044 6374
7 2044 2033


Check the attachment for the data file.
Thanks in advance,
Vasu
keeping you ahead of the learning curve
23 REPLIES
harry d brown jr
Honored Contributor

Re: shell script

how about something like this:

#!/usr/bin/ksh
awk 'BEGIN { prevhit=0; prevline=""; lineno=0 }
{
    lineno += 1;
    if (match($0, /^\#\@\#/)) {
        print lineno-1, prevline;
        print lineno, $0;
        prevhit = 1;
    } else {
        if (prevhit) {
            print lineno, $0;
            prevhit = 0;
        }
    }
    prevline = $0;
}
END { if (prevhit) { print "endofdatafile"; } }'

live free or die
harry
Live Free or Die
harry d brown jr
Honored Contributor

Re: shell script

Damn, this site is very slow today! Anyway, I forgot to add the output sample:

# cat data.txt | ./data.awk
2 ABCD 2044440B0023C638
3 #@# 23C63A - 3rd record is bad
4 ABCD 63744440X023C63B
6 ABCD 2044740B0023C638
7 #@# 23C63A - 7th record is bad
8 ABCD 2033440A0023C637


live free or die
harry
Live Free or Die
vas  bolpali
Advisor

Re: shell script

Thanks for the reply.
You are almost close.

DATA=data.txt
ERR=error.log

The error.log must have 3 columns:

column 1 - bad record position in the data file
column 2 - before record (may be full or a substring)
column 3 - after record (may be full or a substring)

The error.log output should be like this:

3 2044 6374
7 2044 2033

Vasu.
See the exact data file in the attachment.
keeping you ahead of the learning curve
Sridhar Bhaskarla
Honored Contributor

Re: shell script

Hi Vasu,

Try this one. Replace datafile with your file and logfile with the output file. Also, change the cut statement based on how many characters you want.

-Sri

#!/usr/bin/ksh
DATA=datafile
LOG=logfile
LINE=1
printf "%-10.10s %-10.10s %-10.10s \n" POSITION BEFORE AFTER > $LOG
printf "%-10.10s %-10.10s %-10.10s \n" -------- ------ ----- >> $LOG
printf "\n" >> $LOG
while read ENTRY
do
    PREV=0
    NEXT=0
    echo $ENTRY | grep "#@#"
    if [ $? = 0 ]
    then
        (( PREV = $LINE - 1 ))
        (( NEXT = $LINE + 1 ))
        POSITION="$LINE"
        BEFORE=$(sed -n ''$PREV'p' $DATA | cut -c 6-9)
        AFTER=$(sed -n ''$NEXT'p' $DATA | cut -c 6-9)
        printf "%-10d %-10d %-10d \n" $LINE $BEFORE $AFTER >> $LOG
    fi
    (( LINE = $LINE + 1 ))
done < $DATA
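
Assuming you save it as, say, badrec.sh (the name is just an example), it would be run along these lines:

$ chmod +x badrec.sh
$ ./badrec.sh
$ cat logfile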

You may be disappointed if you fail, but you are doomed if you don't try
Rodney Hills
Honored Contributor

Re: shell script

Create a perl script-

$good = " *BOF";
while (<>) {
    chomp;
    if (/^#\@#/) { push(@bad, $.); }
    dump();
    $curr = substr($_, 7, 5);
    $good = $curr;
}
$curr = " *EOF";
dump();

sub dump {
    if (scalar @bad) {
        foreach $badline (@bad) {
            print "$badline $good $curr\n";
        }
        undef @bad;
    }
    return;
}

----------------
Then run-
perl yourscript datafile >error.log

This will handle the case where there are multiple bad lines in a row.
It will also handle the case where the first or last line is bad.

HTH

-- Rod Hills
There be dragons...
Sridhar Bhaskarla
Honored Contributor

Re: shell script

A couple of minor changes:

Replace

echo $ENTRY |grep "#@#"

with

echo $ENTRY |grep "#@#" > /dev/null 2>&1

Remove

POSITION="$LINE"

I didn't use it.

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Rodney Hills
Honored Contributor

Re: shell script

Ignore previous perl script, this is the correct script I meant to enter-

$good = " *BOF";
while (<>) {
    chomp;
    if (/^#\@#/) { push(@bad, $.); }
    else {
        $curr = substr($_, 7, 5);
        dump();
        $good = $curr;
    }
}
$curr = " *EOF";
dump();

sub dump {
    if (scalar @bad) {
        foreach $badline (@bad) {
            print "$badline $good $curr\n";
        }
        undef @bad;
    }
    return;
}

-- Rod Hills
There be dragons...
john korterman
Honored Contributor
Solution

Re: shell script

Hi Vasu,
try the attached script, using your input file as $1.
You can modify the STRING variable if the requested chars are not correct.

regards,
John K.

it would be nice if you always got a second chance
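
The attachment itself is not reproduced in this thread; a minimal sketch of a script along the lines described (the input file as $1, a STRING variable holding the character range) could look like this. The 6-9 range is only an assumption:

#!/usr/bin/ksh
# sketch only - not the original attachment; the 6-9 range is an assumption
STRING="6-9"                  # character range to keep from the neighbouring records
awk -v range="$STRING" '
    BEGIN { split(range, r, "-"); s = r[1]; l = r[2] - r[1] + 1 }
    /^#@#/ { pos = NR; before = substr(prev, s, l); flag = 1; next }
    flag   { print pos, before, substr($0, s, l); flag = 0 }
    { prev = $0 }
' "$1"

On the sample data this prints "3 2044 6374" and "7 2044 2033".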
Jean-Louis Phelix
Honored Contributor

Re: shell script

hi,

Perhaps a shorter one ...

#!/usr/bin/sh
# ^V below denotes a literal control character, used as a separator that
# will not appear in the data (nl -s sets it, awk -F splits on it)
nl -s^V data.txt | awk -F^V '{
    if (match($2, /#@#/))
    {
        printf "%d %s", $1, substr(prev, 8, 5)
        getline
        printf " %s\n", substr($2, 8, 5)
    }
    prev = $2
}' > error.log
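
Saved as, for example, nlerr.sh (the name is arbitrary), it is simply run and the result checked with:

$ sh nlerr.sh
$ cat error.log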

Regards.
It works for me (© Bill McNAMARA ...)
harry d brown jr
Honored Contributor

Re: shell script

Vasu,

Try this. I modified the data a little in case the FIRST or LAST record is bad:

# cat data.txt
#@# 23C63A - 1 st record is bad
ABCD 2044440B0023C638
#@# 23C63A - 3rd record is bad
ABCD 63744440X023C63B
ABCD 22044440Y023C63C
ABCD 2044740B0023C638
#@# 23C63A - 7th record is bad
ABCD 2033440A0023C637
ABCD 2044740B0023C638
#@# 23C63A - 10th record is bad



# cat data.awk
#!/usr/bin/ksh
awk 'BEGIN {
    prevline=""; prevdata=""; nextdata=""; lineno=0; badlineno=0;
}
{
    lineno += 1;
    if (match($0, /^\#\@\#/)) {
        badlineno = lineno;
        prevdata = substr(prevline, 6, 4);
        if (prevdata == "") prevdata = "none";
    } else {
        if (badlineno) {
            nextdata = substr($0, 6, 4);
            print badlineno, prevdata, nextdata;
            badlineno = 0;
            prevdata = "";
            nextdata = "";
        }
    }
    prevline = $0;
}
END { if (badlineno) { print badlineno, prevdata, "none"; } }'


# cat data.txt | ./data.awk
1 none 2044
3 2044 6374
7 2044 2033
10 2044 none
#



live free or die
harry
Live Free or Die
vas  bolpali
Advisor

Re: shell script

I gave only the basic info,
but john korterman's script does exactly what I need, in full.

Sridhar gave me exactly what I asked for,
but in reality it got me only about 80% of the way.

Thanks to all of you.

Vasu.

keeping you ahead of the learning curve
vas  bolpali
Advisor

Re: shell script

Hi John,
the shell script is not handling the case where all the records are bad.
Check the attachment for the data file.
keeping you ahead of the learning curve
harry d brown jr
Honored Contributor

Re: shell script

Try this:

#!/usr/bin/ksh
awk 'BEGIN {
    prevline=""; prevdata=""; nextdata=""; lineno=0; badlineno=0; goodlineno=0;
}
{
    lineno += 1;
    if (match($0, /^\#\@\#/)) {
        badlineno = lineno;
        prevdata = substr(prevline, 6, 4);
        if (prevdata == "") prevdata = "none";
    } else {
        goodlineno = lineno;
        if (badlineno) {
            nextdata = substr($0, 6, 4);
            print badlineno, prevdata, nextdata;
            badlineno = 0;
            prevdata = "";
            nextdata = "";
        }
    }
    prevline = $0;
}
END {
    if (badlineno) {
        if (goodlineno) {
            print badlineno, prevdata, "none";
        } else {
            print "All lines are bad";
        }
    }
}'

live free or die
harry
Live Free or Die
vas  bolpali
Advisor

Re: shell script

No.
If all records are bad, I still need the error report/output like this
(assuming 50 records in total, and all are bad):

1 null null
2 null null
3 null null
......
50 null null


keeping you ahead of the learning curve
Rodney Hills
Honored Contributor

Re: shell script

If you go back to my perl script above, it will handle when all lines are bad.

1 *BOF *EOF
2 *BOF *EOF
...
50 *BOF *EOF

The only difference is that I used *BOF and *EOF to represent the beginning of the file and the end of the file.

-- Rod Hills
There be dragons...
vas  bolpali
Advisor

Re: shell script

Hi Rod,
your perl script (perlx) is giving errors:

$ perl perlx data.txt > error.log

syntax error in file perlx at line 8, next 2 tokens "dump("

syntax error in file perlx at line 13, next 2 tokens "dump("

Execution of perlx aborted due to compilation errors.

Vasu
keeping you ahead of the learning curve
Rodney Hills
Honored Contributor

Re: shell script

Whoops, I think "dump" is a reserved name in perl. Try renaming "dump" to "dispit".
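
Applied to the corrected script above, that just means renaming the sub and its two call sites:

sub dispit {
    if (scalar @bad) {
        foreach $badline (@bad) {
            print "$badline $good $curr\n";
        }
        undef @bad;
    }
    return;
}

and calling dispit(); in the two places where dump(); was called.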

-- Rod Hills
There be dragons...
harry d brown jr
Honored Contributor

Re: shell script

Try this attached awk script:


live free or die
harry
Live Free or Die
harry d brown jr
Honored Contributor

Re: shell script

Of course I'd prefer perl. I just used a2p (the awk-to-perl translator), stripped out the cruft, and modified the substr's (perl strings start at position 0, whereas awk starts at 1).

If your input files are huge then you should use perl for speed!

#!/opt/perl/bin/perl

$prevline = '';
$prevdata = '';
$nextdata = '';
$lineno = 0;
$badlineno = 0;
$goodlineno = 0;
$badlines = '';

while (<>) {
    chomp;          # strip record separator
    #
    # increment line number
    #
    $lineno += 1;
    #
    # does it match our pattern?
    #
    if ($_ =~ /^\#\@\#/) {
        #
        # save the bad line number
        $badlineno = $lineno;
        #
        # save the previous line's data
        $prevdata = substr($prevline, 5, 4);
        #
        if ($goodlineno) {
            $badlines = '';
        } else {
            # in case we have a long run of bad lines - like
            # the entire file - save this special string for
            # printing later; hopefully the input file isn't
            # huge or this program will blow chunkies
            $badlines = $badlines . $lineno . " null null\n";
        }
        #
        # if the previous data is null then change the word to "none"
        if ($prevdata eq '') {
            $prevdata = 'none';
        }
    } else {
        #
        # save the good line number
        $goodlineno = $lineno;
        #
        # if we previously had a bad line then report the data
        if ($badlineno) {
            #
            # save the current line's data for potential use
            # the next time we encounter a bad record
            $nextdata = substr($_, 5, 4);
            print $badlineno, " ", $prevdata, " ", $nextdata, "\n";
            $badlineno = 0;
            $prevdata = '';
            $nextdata = '';
        }
    }
    $prevline = $_;
}

if ($badlineno) {
    if ($goodlineno) {
        print $badlineno, " ", $prevdata, " none\n";
    } else {
        print $badlines;
    }
}
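
Assuming it is saved as, say, badrec.pl (the name is just an example), it is run the same way as the awk version:

$ perl badrec.pl data.txt > error.log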


live free or die
harry
Live Free or Die
john korterman
Honored Contributor

Re: shell script

Hi again Vasu,

The attached script can hopefully do what is requested.
However, two variables, TEMP_OUT1 and TEMP_OUT2, must first be configured. TEMP_OUT1 should define a path to a file whose size will be a little bigger than your input file; it holds all lines, each line starting with its line number. TEMP_OUT2 should define a path to a file containing only the error lines. That should also give you an idea of the amount of space required.
The idea of the script is to read through the error file (TEMP_OUT2) and, for each error, look up the requested values in TEMP_OUT1 by looking at the lines in the vicinity of the error line.
The script requires your input file as $1. Because of the temporary files it is rather slow, but it worked on the last input sample you attached to this thread. If it does not work, please attach the input you use.

regards,
John K.
it would be nice if you always got a second chance
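
Since the attachment itself is not shown in the thread, here is a rough sketch of the two-temp-file approach described above. The file names and the 6-9 character range are placeholders, not necessarily what the attachment uses:

#!/usr/bin/ksh
# sketch only - file names and the 6-9 character range are assumptions
INFILE=$1
TEMP_OUT1=/var/tmp/numbered.$$      # every input line, prefixed with its line number
TEMP_OUT2=/var/tmp/badonly.$$       # only the numbered bad lines

awk '{ print NR, $0 }' "$INFILE" > "$TEMP_OUT1"
awk '$2 == "#@#"' "$TEMP_OUT1" > "$TEMP_OUT2"

# for each bad line, look up its neighbours in the numbered file;
# a missing or bad neighbour is reported as "null"
while read POS REST
do
    BEFORE=$(awk -v n="$POS" '$1 == n - 1 && $2 != "#@#" { sub(/^[0-9]* /, ""); print substr($0, 6, 4) }' "$TEMP_OUT1")
    AFTER=$(awk -v n="$POS" '$1 == n + 1 && $2 != "#@#" { sub(/^[0-9]* /, ""); print substr($0, 6, 4) }' "$TEMP_OUT1")
    printf "%s\t%s\t%s\n" "$POS" "${BEFORE:-null}" "${AFTER:-null}"
done < "$TEMP_OUT2"

rm -f "$TEMP_OUT1" "$TEMP_OUT2"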
vas  bolpali
Advisor

Re: shell script

john,
the script is 99% working.

I don't need the beginning 2 lines in the output,
which come from the "write header" part.
By commenting out echo "pos.\tbefore\tafter"
I can eliminate one line,
but I am unable to eliminate
the other line, which is nothing but a space.

For example,
your output is like this:
----------------------
pos. before after

1 null 464
3 838 null

I want the output to be like this:
-------------------------
1 null 464
3 838 null

The output should start with the data (no spaces, no blank lines), because I have to use this output to update an Oracle table by passing these output values as parameters.

thank you.
Vasu
keeping you ahead of the learning curve
john korterman
Honored Contributor

Re: shell script

Hi Vasu,
please try the revised attached script.
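If stripping them afterwards is acceptable, the two leading lines (the header and the blank line) can also be removed from the previous version's output with something like:

$ sed '1,2d' error.log > error.clean

(error.clean is just an example name for the cleaned-up file.)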

regards,
John K.
it would be nice if you always got a second chance