Oracle Alert Log Monitoring

Ralph Grothe
Honored Contributor

Hello Oracle enchanted|plagued,

On one of our Oracle clusters I was asked to monitor the alert.logs of the instances running on it.
It sounds to me as if there are some buggy DBs there,
but I lack the DBA knowledge to judge them.

Being a bit of a Perl nerd I of course had to look for a Perl solution and stumbled over Swatch.

http://swatch.sourceforge.net/

It only took a wee wrapper script to get all 14 alert logs of the cluster's instances monitored successfully and to have the DBAs bombarded with notifications about the skeletons in their cupboards ;-)

However, I wasn't savvy enough to make sure that the Swatch daemon processes were started and stopped by the MC/SG package control scripts.

This negligence came back to haunt me last Sunday when our network admins flashed their switches with new firmware.
They even warned us that the procedure required a full stack reboot, including the failover ports (so much for HA ;-)

Unfortunately, my SMS alert notification was broken by this as well %-{
So this morning I was given a nice reception of accumulated user anger, because the cluster couldn't fail over or re-form correctly after the incident yesterday, owing to its inability to unmount the packages' filesystems on which my Swatch daemons were dutifully monitoring the alert logs.
It looks as if even the "fuser -ku" in the unmount_fs() function from the control scripts couldn't shoot them.

Of course I'm convinced that this won't happen again if I start and stop (or emergency shoot) the daemons from within the control scripts.
Nevertheless, I also have a sneaking suspicion that the Oracle Data Dictionary already provides dynamic views through which all Oracle alert events that usually get logged to the alert.logs could just as well be queried with SQL.
If that's true, which view(s) would I have to look up?
This would avoid the need to access the packages' filesystems altogether.
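
The start/stop handling from within the control scripts mentioned above could be sketched roughly as follows. This is only an illustration under assumptions: the function names start_watcher and stop_watcher, the PID_DIR location and the watcher command line are hypothetical and are neither Swatch nor MC/SG defaults. The watcher is assumed to stay in the foreground, so that $! really is its PID.

```shell
#!/bin/sh
# Hypothetical helpers meant to be called from a package control script's
# run and halt sections. PID_DIR is an assumed location on a local
# (non-package) filesystem.
PID_DIR=${PID_DIR:-/var/run/alertwatch}

# start_watcher NAME COMMAND [ARGS...]
# Launch COMMAND in the background and record its PID under NAME.
start_watcher() {
    name=$1; shift
    mkdir -p "$PID_DIR" || return 1
    "$@" >/dev/null 2>&1 &
    echo $! > "$PID_DIR/$name.pid"
}

# stop_watcher NAME [GRACE_SECONDS]
# Send TERM, wait up to GRACE_SECONDS, then fall back to KILL so that the
# process no longer keeps the package filesystem busy.
stop_watcher() {
    name=$1; grace=${2:-5}
    pidfile="$PID_DIR/$name.pid"
    [ -f "$pidfile" ] || return 0
    pid=`cat "$pidfile"`
    # If TERM cannot be delivered the process is already gone: just clean up.
    kill -TERM "$pid" 2>/dev/null || { rm -f "$pidfile"; return 0; }
    while [ "$grace" -gt 0 ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        grace=$((grace - 1))
    done
    # Still alive after the grace period: emergency shoot.
    kill -0 "$pid" 2>/dev/null && kill -KILL "$pid" 2>/dev/null
    rm -f "$pidfile"
    return 0
}
```

In such a setup stop_watcher would have to run in the halt path before the control script tries to unmount the package filesystems.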

While searching the web for such a thing I also came across this interesting link.

http://www.oracleadvice.com/Tips/external.htm

Do you think I can disregard the explicit warning there not to use the above hack in a production environment?

Regards
Ralph
Madness, thy name is system administration
7 REPLIES

Re: Oracle Alert Log Monitoring

Hi,

You can find some of the information written to the alert log in the data dictionary (log switches etc.), but no "ORA-" type errors are written to the data dictionary. So to catch them you need to read the alert log.

hth
/M
Alexander M. Ermes
Honored Contributor

Re: Oracle Alert Log Monitoring

Hi Ralph.
If you need a checker for alert logs,
please have a look:

#!/bin/ksh
# ==========================================================================
# check_alertlog EURCOD-A
#
# Description:
# This script will check all alert log files of all databases
# as specified in the /etc/oratab. For use primarily
# when the computer has been booted.....
#
# Parameters:
# 1. Database name.
#
# History:
#
# ==========================================================================
# Allow verification/trace to be turned on and off.
#
# set -E
# set -x
#
case "$ORACLE_TRACE" in
# T) set -x ;;
esac
# setting the proper PATH.....
LBIN=/usr/local/bin; export LBIN
PATH=$PATH:$LBIN; export PATH
DB_ADMIN=/home/dba/oracle/admin; export DB_ADMIN
#
Db_Name=`echo ${1:-"none"}`; export Db_Name
#
#
echo "#\n\tDb_Name set to: $Db_Name\n#\n"
#
echo "Time is: `date`" > /var/tmp/check_alert_log.log

cat /etc/oratab | while read LINE
do
case $LINE in
\#*) ;; #comment-line in oratab
*)
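# Proceed only if third field is 'Y' and either no database name was given
# or the given name matches this SID.
ORACLE_SID=`echo $LINE | awk -F: '{print $1}' -`
if [ "`echo $LINE | awk -F: '{print $3}' -`" = "Y" ] && { [ "${Db_Name}" = "none" ] || [ "${Db_Name}" = "${ORACLE_SID}" ]; }
then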
oracle_sid=`echo $ORACLE_SID|tr "[A-Z]" "[a-z]"`
echo ${ORACLE_SID} >> /var/tmp/check_alert_log.log
echo " " >> /var/tmp/check_alert_log.log
cat ${DB_ADMIN}/${ORACLE_SID}/bdump/alert_${ORACLE_SID}.log | grep 'ORA-' >> /var/tmp/check_alert_log.log
echo " " >> /var/tmp/check_alert_log.log
fi
esac
done
echo "Time is: `date`" >> /var/tmp/check_alert_log.log
########################################################
# Mail the report that was collected above.
cat /var/tmp/check_alert_log.log | sendmail .......


Repeated every 60 minutes by cron.
Rgds
Alexander M. Ermes
.. and all these memories are going to vanish like tears in the rain! final words from Rutger Hauer in "Blade Runner"
Indira Aramandla
Honored Contributor

Re: Oracle Alert Log Monitoring

Hi Ralph,

You can monitor the alert logs with a script that reads them and searches for errors, and schedule this monitoring script to run every xx minutes.

You can have the script check a single database, with the ORACLE_SID as a parameter, or all the databases flagged "Y" (active databases) in the oratab file.

Attached is a script that looks for any errors in the alert log; each time it runs, it scans from the line after the previously reported error line. The script uses mailx to mail the error message and the database name to the DBA. It does this for one database, accepting it as a parameter; you can do the same thing in a loop for all the active databases in the oratab file.
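
Since the attachment is not reproduced here, the idea of resuming the scan where the previous run stopped can be illustrated with a minimal sketch. It is not the attached script but a simplified variant that remembers the last line scanned (rather than the last reported error line). The function name scan_new_errors and the state file are hypothetical.

```shell
#!/bin/sh
# scan_new_errors LOGFILE STATEFILE
# Print the "ORA-" lines that were appended to LOGFILE since the previous
# call. The number of the last line already scanned is kept in STATEFILE.
scan_new_errors() {
    log=$1; state=$2
    last=0
    if [ -f "$state" ]; then
        last=`cat "$state"`
    fi
    total=`wc -l < "$log" | tr -d ' '`
    # A shorter file means the log was rotated or truncated: start over.
    if [ "$total" -lt "$last" ]; then
        last=0
    fi
    # Only look at lines last+1 .. total, so that lines appended while the
    # scan is running are picked up by the next call instead of being lost.
    awk -v start="$last" -v end="$total" \
        'NR > start && NR <= end && /ORA-/' "$log"
    echo "$total" > "$state"
}
```

A cron job could capture the output of scan_new_errors for each alert log and pipe it to mailx only when it is not empty.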


Indira A
Never give up, Keep Trying
Yogeeraj_1
Honored Contributor

Re: Oracle Alert Log Monitoring

hi Ralph,

Another script for monitoring my alert.log is attached.

The information on the site that you mentioned above is also good.

Don't consider it a negative "hack", because it allows you to query your alert.log from anywhere, and you can also delegate this task to somebody else. More importantly, you can use it to analyze which types of errors you get most frequently, etc.

I think I will adopt this procedure in the near future...

regards
yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)
Yogeeraj_1
Honored Contributor

Re: Oracle Alert Log Monitoring

attachment
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)

Re: Oracle Alert Log Monitoring

You have one more option, and that's using Oracle Enterprise Manager (it comes with the server install). You would have to configure a Management Server somewhere and then start one more process (the Intelligent Agent) on each node. From within OEM you can then set up events that monitor the alert log (and pretty much anything else you would like to know about). Should an event be triggered, for instance by an "ORA-something" error in the alert log, you can have a mail/page/SMS sent to your DBAs.

/M
renarios
Trusted Contributor

Re: Oracle Alert Log Monitoring

If you install OEM, please install the repository and the management server on a different node, because if your production environment is down, OEM won't work either.
OEM is also very handy for ad-hoc Oracle stuff.

Cheers,

Renarios
Nothing is more successful than failure