
Monitored process for oracle in SG

 
James M. Dunn
Frequent Advisor

Does anyone know where SG stores the monitored PIDs of a monitored package, in this case an Oracle database? Are they kept in memory, in a file, in an SG binary, or somewhere more creative?

Note below that the database was failed because a monitored process went away or was killed. But as you can see, PID 11636 is not in the list of monitored PIDs. It was a monitored PID in a previous run of the same database.

Monitored process = ora_pmon_SDP4, pid = 15945
Monitored process = ora_dbw0_SDP4, pid = 15948
Monitored process = ora_ckpt_SDP4, pid = 15962
Monitored process = ora_smon_SDP4, pid = 15964
Monitored process = ora_lgwr_SDP4, pid = 15960
Monitored process = ora_reco_SDP4, pid = 15966
Monitored process = ora_arc0_SDP4, pid = 15968
kill: 11636: The specified process does not exist.

Thanks in advance,

Jim D
3 REPLIES
John Bigg
Esteemed Contributor
Solution

Re: Monitored process for oracle in SG

What version of the toolkit are you running?

The PIDs of the processes are held in a shell script array, MONITOR_PROCESSES_PID[], which is derived from the list of monitored processes, MONITOR_PROCESSES[].
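As a rough illustration of how one array can be derived from the other, here is a minimal sketch. It is not the toolkit's actual code: the helpers pid_of and build_pid_list, the variable sample_ps, and the canned ps lines are all made up for the example, and it assumes a shell with arrays (ksh or bash) and single-word command names in the last column of the ps listing.

```shell
# Illustrative sketch only; NOT the toolkit's actual haoracle.sh code.
# Needs a shell with arrays (ksh or bash).

MONITOR_PROCESSES[0]=ora_pmon_SDP4
MONITOR_PROCESSES[1]=ora_dbw0_SDP4

# Hypothetical helper: read "ps -ef"-style lines on stdin and print the PID
# (field 2) of the first line whose last field is exactly the name in $1.
pid_of() {
    awk -v name="$1" '$NF == name { print $2; exit }'
}

# Hypothetical helper: fill MONITOR_PROCESSES_PID[] from the ps listing in $1.
build_pid_list() {
    i=0
    while [ "$i" -lt "${#MONITOR_PROCESSES[@]}" ]; do
        MONITOR_PROCESSES_PID[$i]=$(printf '%s\n' "$1" | pid_of "${MONITOR_PROCESSES[$i]}")
        echo "Monitored process = ${MONITOR_PROCESSES[$i]}, pid = ${MONITOR_PROCESSES_PID[$i]}"
        i=$((i + 1))
    done
}

# Canned listing so the sketch runs without a database; on a live system
# the argument would be "$(ps -ef)".
sample_ps='  oracle 15945     1  0 10:00:00 ?         0:00 ora_pmon_SDP4
  oracle 15948     1  0 10:00:00 ?         0:00 ora_dbw0_SDP4'

build_pid_list "$sample_ps"
```

Arrays like these are ordinary shell variables, so if the toolkit works along these lines, the PIDs would exist only in the memory of the running script and would be rebuilt each time the script starts.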

Looking at the monitor_processes function, I'm not sure I understand how it could possibly kill a PID from a previous session. Was there any further output after what you show? If a monitored process had failed like this, I would have expected more output directly afterwards, with a full ps -ef and then a message like:

*** XXX has failed. Aborting Oracle. ***

Did you get this?

It is possible that the message:

kill: 11636: The specified process does not exist.

did not actually come from the monitor_processes function at all, but from a completely different area or even a different script, since the output from all processes gets merged into the same log file.

Maybe the oracle shutdown (or some other script) caused by the failure issued a kill of an old process which was stored in a file somewhere?

How was the database that was using PID 11636 shut down and restarted? Was the Serviceguard package, or Serviceguard itself, restarted at that time? Maybe you could attach the whole package log file and give us a little more information about exactly what was going on here.
Stephen Doud
Honored Contributor

Re: Monitored process for oracle in SG

The monitor_process function in haoracle.sh does not perform a kill. Secondly, note the process ID: it is several thousand PIDs away from all the other processes that are being monitored. Do you have another instance, controlled by a different package, that had a problem at the same time?
Depending on the version of the ECM toolkit used, a patch exists to fix problems that haoracle.sh had with locating the correct PIDs to monitor.

The patch text states:
"Oracle package may fail if there are multiple instances of Oracle running with one of the SID names being a subset of another."

See
PHSS_31084 s700_800 11.23 Enterprise Cluster Master Toolkit B.02.21
PHSS_31083 s700_800 11.11 Enterprise Cluster Master Toolkit B.02.20
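
The patch text does not say how the script matched process names, so the following is only a guess at the kind of ambiguity it describes. The second SID, SDP41, is hypothetical and is chosen so that SDP4 is a subset of it. The sketch shows that a substring search for one instance's pmon also hits the other instance, while a whole-line match does not:

```shell
# Illustrative sketch only; not taken from haoracle.sh.
# Sample process names: SID "SDP4" is a substring of the hypothetical SID "SDP41".
ps_names='ora_pmon_SDP41
ora_pmon_SDP4'

# Loose match: a substring search also counts the other instance's pmon.
loose=$(printf '%s\n' "$ps_names" | grep -c 'ora_pmon_SDP4')

# Strict match: -x requires the whole line to equal the pattern.
strict=$(printf '%s\n' "$ps_names" | grep -c -x 'ora_pmon_SDP4')

echo "loose=$loose strict=$strict"   # prints: loose=2 strict=1
```

With a loose match, a monitor could pick up or lose track of a PID that belongs to a different instance, which would fit a failure being reported for a PID that is not in the monitored list.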
James M. Dunn
Frequent Advisor

Re: Monitored process for oracle in SG

John, Stephen,

You are both correct, 10 points each. I opened a ticket with HP and yes, these two scripts in the toolkit have to be used.

This patch only adds these two scripts to
/opt/cmcluster/toolkit/oracle:

PHSS_31083
haoracle.mon
haoracle.sh

Thanks for the help.

Jim D