Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

 
Cory Gaetz(1)
Advisor

Hi, and thanks for taking the time to try and help me out.

I have a two node HP-UX 11.23 cluster running Serviceguard 11.17, with SGeRAC installed and the Enterprise Cluster Toolkit.

There is a single RAC database running with an instance on each node.

My question is: how do I get Serviceguard to monitor the listeners on the two nodes (LISTENER_8A and LISTENER_8B) and restart them if they fail?

All I've seen referenced is having a package for each listener, but I am uncertain how to set this up.

Would I be better off letting CRS monitor the listeners as Oracle suggests?

Thanks again, and let me know if I've left out any information.
Oviwan
Honored Contributor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

hey

Yes, let Clusterware monitor them. Oracle provides a listener script for Clusterware that restarts the listener if it fails.

See the script act_listener.pl on page 22:
http://www.oracle.com/technology/products/database/clusterware/pdf/SI_DB_Failover_11g.pdf

Regards
Cory Gaetz(1)
Advisor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

Some more information would probably be helpful. In the package configuration (.conf) file I have the following:

SERVICE_NAME LISTENER
SERVICE_FAIL_FAST_ENABLED NO
SERVICE_HALT_TIMEOUT 240

In the package script I have the following:

SERVICE_NAME[1]="LISTENER"
SERVICE_CMD[1]="/etc/cmcluster/odsp-cfs/toolkit.sh monitor_listener"
SERVICE_RESTART[1]="-r 2"

When I start up the package, the package log file records this:

########### Node "sodb8a": Starting package at Wed Aug 20 14:02:11 MDT 2008 ###########
Aug 20 14:02:12 - Node "sodb8a" *** /etc/cmcluster/odsp-cfs/toolkit_dbi.sh called with start argument. ***
Aug 20 14:02:12 - Node "sodb8a" : Starting Oracle RAC Instance at Wed Aug 20 14:02:11 MDT 2008
Aug 20 14:02:12 - Node "sodb8a" srvctl start instance -d odsitlab -i odsitlab1 -o open -c "/ as sysdba"
Aug 20 14:02:30 - Node "sodb8a" Successfully started instance.
Aug 20 14:02:30 - Node "sodb8a": Starting service ORACLE_ODS_RAC using
"/etc/cmcluster/odsp-cfs/toolkit_dbi.sh check"
Aug 20 14:02:30 - Node "sodb8a": Starting service LISTENER using
"/etc/cmcluster/odsp-cfs/toolkit.sh monitor_listener"

########### Node "sodb8a": Package start completed at Wed Aug 20 14:02:30 MDT 2008 ###########
Aug 20 14:02:30 - Node "sodb8a" *** /etc/cmcluster/odsp-cfs/toolkit.sh called with monitor_listener argument. ***
Aug 20 14:02:30 - Node "sodb8a" Oracle Listener LISTENER_SODB8A Restart Failed!
Aug 20 14:02:30 - Node "sodb8a" *** /etc/cmcluster/odsp-cfs/toolkit.sh called with monitor_listener argument. ***
Aug 20 14:02:31 - Node "sodb8a" Oracle Listener LISTENER_SODB8A Restart Failed!
Aug 20 14:02:31 - Node "sodb8a" *** /etc/cmcluster/odsp-cfs/toolkit.sh called with monitor_listener argument. ***
Aug 20 14:02:31 - Node "sodb8a" *** /etc/cmcluster/odsp-cfs/toolkit_dbi.sh called with check argument. ***
Aug 20 14:02:31 - Node "sodb8a" Oracle Listener LISTENER_SODB8A Restart Failed!

########### Node "sodb8a": Halting package at Wed Aug 20 14:02:31 MDT 2008 ###########
Aug 20 14:02:31 - Node "sodb8a": Halting service ORACLE_ODS_RAC
Aug 20 14:02:31 - Node "sodb8a": Halting service LISTENER
cmhaltserv: Service name LISTENER is not running.
Aug 20 14:02:32 - Node "sodb8a" *** /etc/cmcluster/odsp-cfs/toolkit_dbi.sh called with stop argument. ***
Aug 20 14:02:32 - Node "sodb8a" : Stopping Oracle RAC Instance at Wed Aug 20 14:02:31 MDT 2008
Aug 20 14:02:32 - Node "sodb8a" srvctl stop instance -d odsitlab -i odsitlab1 -o immediate -c "/ as sysdba"
CSS appears healthy
Aug 20 14:03:03 - Node "sodb8a" Successfully stopped instance.

It appears that the monitor thinks the listener isn't running and attempts to restart it multiple times before halting the package.
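
For what it's worth, three "Restart Failed!" lines are what SERVICE_RESTART[1]="-r 2" would be expected to produce: one initial run of the service command plus the two permitted restarts, after which the package is halted. A minimal sketch for counting those failures in a package log, assuming the log is fed on stdin (the helper name and the log file name in the usage line are hypothetical):

```sh
# Count listener restart failures reported in a Serviceguard package log.
# Reads the log on stdin and prints the number of matching lines.
# With SERVICE_RESTART "-r 2", a count of 3 means the initial run and
# both restarts all failed.
count_restart_failures() {
    grep -c 'Oracle Listener .* Restart Failed!'
}
```

Usage would be something like: count_restart_failures < ./package.log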


Oviwan
Honored Contributor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

So check the toolkit.sh script. Maybe the command line with the "ps" command is wrong; I have already seen scripts with errors in that command.

Try to debug the toolkit.sh script: add "set -x" to the script to get debug information, and start it with the "monitor_listener" parameter.
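
One way to capture that trace in a file rather than on the terminal is sketched below. The helper name trace_run and the log path are made up for illustration; the only assumption is that the script can be run under sh with the -x option instead of editing it.

```sh
# Run a script under "sh -x" and collect its normal output together with
# the xtrace lines in one log file.
#   $1       log file to write
#   $2...    script to run, followed by its arguments
trace_run() {
    log=$1
    shift
    sh -x "$@" >"$log" 2>&1
}
```

For the case in this thread that would be: trace_run /tmp/toolkit_trace.log ./toolkit.sh monitor_listener
Depending on the shell, functions inside the script may still need their own "set -x" to show up in the trace.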

HTH
Deepak Kr
Respected Contributor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

>>>Aug 20 14:02:30 - Node "sodb8a" *** /etc/cmcluster/odsp-cfs/toolkit.sh called with monitor_listener argument. ***
Aug 20 14:02:30 - Node "sodb8a" Oracle Listener LISTENER_SODB8A Restart Failed!

The package config is OK.

All you need to do is debug the error in the toolkit.sh script.

If possible, can you attach it here?
"There is always some scope for improvement"
Cory Gaetz(1)
Advisor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

I have put set -x in the three scripts that seem to be called for this: toolkit.sh, haoracle.sh, and halistener.mon.

Here is the output when I run ./toolkit.sh monitor_listener with the above set.

Thanks

[root@sodb8a]# ./toolkit.sh monitor_listener
+ ACT=monitor_listener
+ PKG_DIR=.
+ HA_SERVER=ORACLE
+ HA_SCRIPT=./haoracle.sh
+ invoke_app
+ PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/lbin
+ ORA_ver=9i
+ + hostname
HOST=sodb8a
+ + date
DATE=Wed Aug 20 15:56:52 MDT 2008
+ PKG_DIR=.
+ MAINTENANCE_FILE=./oracle.debug
+ exit_code=0
+ [[ -f ./haoracle.conf ]]
+ . ./haoracle.conf
+ ORACLE_HOME=/u00/oracle/product/10.2.0/db
+ ORACLE_ADMIN=oracle
+ SID_NAME=odsitlab1
+ LISTENER=yes
+ LISTENER_NAME=LISTENER_SODB8A
+ LISTENER_PASS=
+ PFILE=/u00/oracle/product/10.2.0/db/dbs/initodsitlab1.ora
+ set -A MONITOR_PROCESSES
+ MONITOR_PROCESSES[0]=ora_pmon_odsitlab1
+ MONITOR_PROCESSES[1]=ora_dbw0_odsitlab1
+ MONITOR_PROCESSES[2]=ora_ckpt_odsitlab1
+ MONITOR_PROCESSES[3]=ora_smon_odsitlab1
+ MONITOR_PROCESSES[4]=ora_lgwr_odsitlab1
+ MONITOR_PROCESSES[5]=ora_reco_odsitlab1
+ MAINTENANCE_FLAG=yes
+ MONITOR_INTERVAL=30
+ TIME_OUT=30
+ [ -z /u00/oracle/product/10.2.0/db ]
+ [ ! -d /u00/oracle/product/10.2.0/db ]
+ PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/lbin:/u00/oracle/product/10.2.0/db/bin:
+ [ -z oracle ]
+ [ -z odsitlab1 ]
+ + check_yes_no LISTENER yes yes
LISTENER=yes
+ check_status=0
+ [ 0 -ne 0 ]
+ [ yes = yes -a -z LISTENER_SODB8A ]
+ [ -z /u00/oracle/product/10.2.0/db/dbs/initodsitlab1.ora ]
+ [ ! -f /u00/oracle/product/10.2.0/db/dbs/initodsitlab1.ora ]
+ + check_yes_no MAINTENANCE_FLAG yes ./oracle.debug
MAINTENANCE_FILE=./oracle.debug
+ check_status=0
+ [ 0 -ne 0 ]
+ [ -z 30 ]
+ [ -z 30 ]
+ log_message *** ./toolkit.sh called with monitor_listener argument. ***
Aug 20 15:56:52 - Node "sodb8a" *** ./toolkit.sh called with monitor_listener argument. ***
+ . ./halistener.mon
+ set -x
+ [ yes = yes ]
+ listener_monitor
Aug 20 15:56:53 - Node "sodb8a" Oracle Listener LISTENER_SODB8A Restart Failed!
Oviwan
Honored Contributor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

take a look at these lines:
+ [ -z odsitlab1 ]
+ + check_yes_no LISTENER yes yes
LISTENER=yes
+ check_status=0

I think "check_yes_no" is a function. Please have a look at this function; it should not return 0.

attach the whole script...
Cory Gaetz(1)
Advisor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

Unfortunately, scripting is not my strongest point, but these scripts are provided straight from HP. Silly me, I guess I just expected them to work. That function is called in haoracle.sh.

Here are the 3 scripts that I referenced above.
Oviwan
Honored Contributor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

change this line
su - ${ORACLE_ADMIN} -c /usr/bin/sh </dev/null 2>&1
to
su - ${ORACLE_ADMIN} -c /usr/bin/sh </tmp/listener_debug.log 2>&1
in halistener.mon
and after starting the package again provide the /tmp/listener_debug.log file.

Also check whether your listener.ora has a listener configured with the name LISTENER_SODB8A.

can you manually start the database and listener without starting the package?
Cory Gaetz(1)
Advisor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

The database and listener were starting fine within the package until I tried to start monitoring the listener.

After that when the package would fail, the listener would still be running:

oracle 4561 1 0 15:55:32 ? 0:00 /u00/oracle/product/10.2.0/db/bin/tnslsnr LISTENER_SODB8A -inherit
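
A ps line like the one above can also be checked for mechanically, which is handy when comparing what the monitor script sees with what is really running. This is only a sketch: the function name is made up, and it assumes the listener shows up as a tnslsnr command followed by the listener name, as in the line above. It is not the check that halistener.mon itself performs.

```sh
# Succeeds if "ps -ef" style output on stdin contains a tnslsnr process
# whose first argument is the given listener name.
#   $1   listener name, e.g. LISTENER_SODB8A
listener_in_ps() {
    awk -v name="$1" '
        {
            # look for a field ending in /tnslsnr followed by the name
            for (i = 1; i < NF; i++)
                if ($i ~ /\/tnslsnr$/ && $(i + 1) == name)
                    found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

Usage would be along the lines of: ps -ef | listener_in_ps LISTENER_SODB8A && echo "listener is up"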


I can start and stop the database and listener manually outside of the package without a problem.

Attached is the requested .log file.

Thanks for all your help to this point.
Oviwan
Honored Contributor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

Please add "set -x" after this line:
su - ${ORACLE_ADMIN} -c /usr/bin/sh </tmp/listener_debug.log 2>&1
in halistener.mon to get more debug info, and post the log file again.

Cory Gaetz(1)
Advisor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

Thanks, here is the new log file. Not much more information in it, but hopefully something useful.
Oviwan
Honored Contributor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

This time it was running? But I think you haven't put "set -x" in the right place:
function listener_monitor
{
set -x #<--here
local maintenance=0;

if [ -z "$MAINTENANCE_FILE" ] || [ ! -f "$MAINTENANCE_FILE" ]; then

su - ${ORACLE_ADMIN} -c /usr/bin/sh </tmp/listener_debug.log 2>&1
set -x #<--here
exit_code=0
export ORACLE_HOME=${ORACLE_HOME}
export ORACLE_SID=${SID_NAME}

Then, if the error occurs, provide listener_debug.log and the output of ./toolkit.sh monitor_listener.
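
In ksh-derived shells, tracing switched on at the top of a script does not necessarily carry into functions declared with the function keyword, which may be why the "set -x" has to sit inside listener_monitor itself. A minimal sketch of the placement, using a made-up function (demo_monitor is not part of the HP toolkit):

```sh
# Hypothetical stand-in for a monitor function, showing where the
# tracing switch goes.
demo_monitor() {
    set -x                          # tracing starts here, inside the function body
    status=ok
    echo "listener check: $status"
    set +x                          # stop tracing before returning
}
```

Calling it as demo_monitor 2>/tmp/demo_trace.log sends the trace lines to a file while the normal output stays on the terminal.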
Cory Gaetz(1)
Advisor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

Thanks Oviwan. No, it still isn't working; I hope I've placed the "set -x" properly now. The attachment contains the listener_debug.log as well as the output from toolkit.sh monitor_listener.

Thanks again for your time and effort on this.
Deepak Kr
Respected Contributor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

Hi Cory,

Can you paste the scripts into text files and attach them here? I am not able to open the tar file.

I want to debug it at my place...

regds...Kumar
"There is always some scope for improvement"
Deepak Kr
Respected Contributor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

Can you try this:

SERVICE_NAME[1]="LISTENER"
SERVICE_CMD[1]="/etc/cmcluster/odsp-cfs/haoracle.sh monitor_listener"
SERVICE_RESTART[1]="-r 2"
"There is always some scope for improvement"
Cory Gaetz(1)
Advisor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

Hi Kumar,

Sorry I didn't get back earlier, but I had a long weekend. I'm assuming that you were able to open up the .tar; if not, let me know. I tried changing it to the haoracle.sh script, but the same thing happened.

I opened a case with HP early last week and they are finally getting back to me; I'll let you know what I find out. In the meantime, any other help is greatly appreciated.
Deepak Kr
Respected Contributor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

Hi,

have you heard anything from support yet?
"There is always some scope for improvement"
Cory Gaetz(1)
Advisor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

I've been working with support, and one of the issues was a mixture of scripts from versions 1.01 and 4.01. I've resolved that and am running only the 4.01 scripts, but am still having the same problem. Once I know the resolution I will definitely post it here.
Cory Gaetz(1)
Advisor

Re: Serviceguard to monitor Oracle Listener in RAC Configuration (HP-UX 11.23)

As it turns out, Oracle CRS was already monitoring the listener and may have been interfering with the ECMT scripts and setup. As we needed to get into production, we simply left CRS monitoring the listener and went from there. HP Support said that everything was configured properly and suggested that CRS was not playing well with the ECMT scripts. Thanks for your help.
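
For anyone hitting the same thing, it may be worth checking whether CRS already manages a listener before layering Serviceguard monitoring on top. The sketch below scans the CRS resource list for listener entries. It assumes the NAME=/STATE= attribute layout that crs_stat prints on 10.2 and that listener resource names end in .lsnr; the function name is made up.

```sh
# Print "resource: state" for every CRS resource whose name ends in .lsnr.
# Reads crs_stat style "ATTRIBUTE=value" output on stdin.
crs_listener_resources() {
    awk -F= '
        $1 == "NAME"                      { name = $2 }
        $1 == "STATE" && name ~ /\.lsnr$/ { print name ": " $2 }
    '
}
```

Usage would be: crs_stat | crs_listener_resources
Any line it prints means CRS has a listener resource registered and will try to keep that listener running on its own.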