Operating System - OpenVMS

RDMS_MONITOR abnormal terminated

 
RomanceSu
Occasional Advisor

RDMS_MONITOR abnormal terminated

Since 14 May, the detached system process RDMS_MONITOR has been terminating abnormally, but I cannot find any error message in RDMMON.LOG.
My OpenVMS version is V7.2-1 and my Rdb version is V7.0-3. There are 384 single-file databases on the system.
I have tried increasing the process quotas (WSextent: 64000 -> 128000, Pgflquo: 250000 -> 500000), but it did not help.
How can I capture a status or message when RDMS_MONITOR terminates abnormally, and what can I do to resolve the problem?
13 REPLIES
marsh_1
Honored Contributor

Re: RDMS_MONITOR abnormal terminated

hi,

How do you know it has terminated abnormally? What message did you see that gave you this impression?
If you have no clear indication at present of why the process is dying, arbitrary changes to process quotas may only serve to muddy the waters. Have you monitored the process's resource usage, and does it support the idea that quotas are the cause?
Is RDMMON.LOG the log file specified when the detached process is run?

hth
RomanceSu
Occasional Advisor

Re: RDMS_MONITOR abnormal terminated

mark hunt~
Thanks for your reply~
There are no abnormal messages in RDMMON.LOG.
The first time RDMS_MONITOR terminated, I guessed that system resources were insufficient.
I wrote a DCL batch job to watch the RDMS_MONITOR process and record its resource usage with $ show process/acc/id=xxxx once every 10 seconds.

But after several rounds of monitoring I was confused: the recorded results show that the process was not at its highest resource usage when it terminated.

So I do not really know how to find the real cause of the problem. I only know that it always happens when a large number of users are running end-of-day batch jobs.
marsh_1
Honored Contributor

Re: RDMS_MONITOR abnormal terminated

hi,

Are you low on disk space for the log file? Have you checked OPERATOR.LOG for related messages?
Some other things to check are detailed at this link:

http://fafner.dyndns.org/cgi-bin/conan.com?key=RMU70~Monitor&explode=yes&title=VMS%20Help&referer=

hth


John Gillings
Honored Contributor

Re: RDMS_MONITOR abnormal terminated

Check ACCOUNTING at about the time the process vanished to see if there's a termination status.

$ ACCOUNT/FULL/SINCE=date-time/BEFORE=date-time SYS$MANAGER:ACCOUNTNG.DAT /OUTPUT=accounting.lis

Search the listing file for any clues to your process.

Also look around for any dump files.
A crucible of informative mistakes
Thomas Ritter
Respected Contributor

Re: RDMS_MONITOR abnormal terminated

Try to be really generous with quota.

"Pgflquo: 250,000 -> 500,000, but it did not help."

Try Pgflquo 5,000,000.
OPERATOR.LOG should have some information.
RomanceSu
Occasional Advisor

Re: RDMS_MONITOR abnormal terminated

Hi mark hunt, John Gillings & Thomas Ritter~

Thank you guys!!

1. There is no shortage of space on the disk that holds OPERATOR.LOG and RDMMON.LOG.
2. The only message I found in OPERATOR.LOG around the time RDMS_MONITOR terminated is the following:

%%%%%%%%%%% OPCOM 1-JUN-2009 18:02:03.99 %%%%%%%%%%%
Message from user SYSTEM on E099
Event: Too Few Servers Detected from: Node LOCAL:.E099 DTSS,
at: 2009-06-01-18:02:03.998+08:00Iinf
Number Detected=0,
Number Required=1
eventUid 500218F9-4ED6-11DE-B7B7-453039392020
entityUid 38F4C6B2-47E6-11DE-8439-AA0004006304
streamUid 3851EAB9-47E6-11DE-841D-AA0004006304

3. These are the records in ACCOUNTNG.DAT around the time of the event:

1-JUN-2009 18:01:58 PROCESS INTERACTIVE BR213    000624C2 TNA2588 10000001
1-JUN-2009 18:01:58 PROCESS INTERACTIVE BR006    00079B2A TNA1597 00000001
1-JUN-2009 18:01:58 PROCESS DETACHED    SYSTEM   0006AA8A         00000001
1-JUN-2009 18:02:00 PROCESS INTERACTIVE BR006    0007CE88 TNA2714 10000001
1-JUN-2009 18:02:03 PROCESS BATCH       BR227    0005D289         10010001
1-JUN-2009 18:02:05 PROCESS INTERACTIVE BR035    0006B5D1 TNA2121 00000001
1-JUN-2009 18:02:12 LOGFAIL                      00074A8C TNA2716 00D38064
1-JUN-2009 18:02:12 PROCESS DETACHED    SYSTEM   00050713         00000000
1-JUN-2009 18:02:13 PROCESS INTERACTIVE BR036    0007E931 TNA2391 10000001
1-JUN-2009 18:02:13 PROCESS INTERACTIVE BR202    00056E39 TNA2687 10000001
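
As an illustration only, the status column of a brief listing like the one above can be screened with a short script. This is a minimal Python sketch, assuming the final status is the last whitespace-separated field on each line and is written in hexadecimal; the names `SAMPLE` and `flag_unsuccessful` are made up for the example. It relies on the OpenVMS convention that a status with bit 0 set is a success or informational status, and one with bit 0 clear is a warning, error, or severe error.

```python
# Screen a brief ACCOUNTING listing for records whose final status
# is not a success. Assumption: the status is the last field on each
# line, in hexadecimal, as in the excerpt posted in this thread.

SAMPLE = """\
1-JUN-2009 18:01:58 PROCESS INTERACTIVE BR213 000624C2 TNA2588 10000001
1-JUN-2009 18:01:58 PROCESS INTERACTIVE BR006 00079B2A TNA1597 00000001
1-JUN-2009 18:01:58 PROCESS DETACHED SYSTEM 0006AA8A 00000001
1-JUN-2009 18:02:00 PROCESS INTERACTIVE BR006 0007CE88 TNA2714 10000001
1-JUN-2009 18:02:03 PROCESS BATCH BR227 0005D289 10010001
1-JUN-2009 18:02:05 PROCESS INTERACTIVE BR035 0006B5D1 TNA2121 00000001
1-JUN-2009 18:02:12 LOGFAIL 00074A8C TNA2716 00D38064
1-JUN-2009 18:02:12 PROCESS DETACHED SYSTEM 00050713 00000000
1-JUN-2009 18:02:13 PROCESS INTERACTIVE BR036 0007E931 TNA2391 10000001
1-JUN-2009 18:02:13 PROCESS INTERACTIVE BR202 00056E39 TNA2687 10000001
"""


def flag_unsuccessful(listing):
    """Return (date, time, record type, status) for lines whose status has bit 0 clear."""
    flagged = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or incomplete line
        try:
            status = int(fields[-1], 16)
        except ValueError:
            continue  # last field is not a hexadecimal status
        if status & 1 == 0:  # bit 0 clear: warning, error, or severe error
            flagged.append((fields[0], fields[1], fields[2], fields[-1]))
    return flagged


if __name__ == "__main__":
    for record in flag_unsuccessful(SAMPLE):
        print(*record)
```

On this excerpt the sketch flags two records: the LOGFAIL entry (00D38064) and the detached SYSTEM process 00050713 (00000000). Whether that detached process was RDMS_MONITOR cannot be told from the listing alone; its PID would have to be matched against the monitor's.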
Thomas Ritter
Respected Contributor

Re: RDMS_MONITOR abnormal terminated

The OPCOM message you posted (DTSS, "Too Few Servers Detected") is from your time service. It probably has nothing to do with the abnormal terminations.
Thomas Ritter
Respected Contributor

Re: RDMS_MONITOR abnormal terminated

The most likely reason for RDMS_MONITOR crashing is quotas. Go big and work back.
Try $ rmu/verify/log 'database_root' and see how that goes. Run the verify at a quiet time because some locking is involved.

RDMS_MONITOR must have reported why it died, either in OPERATOR.LOG or in the Rdb log.
RomanceSu
Occasional Advisor

Re: RDMS_MONITOR abnormal terminated

Thomas Ritter~

Verifying the database root is difficult because we have 384 database files. I do not really know which database root I should monitor or verify.
RomanceSu
Occasional Advisor

Re: RDMS_MONITOR abnormal terminated

To everyone~

Does anyone know of a way to debug RDMS_MONITOR?
Thomas Ritter
Respected Contributor

Re: RDMS_MONITOR abnormal terminated

Try $ rmu/show system
This will list all the current databases.
Do you see 384 databases ?
RomanceSu
Occasional Advisor

Re: RDMS_MONITOR abnormal terminated

no~ about 136
Brad McCusker
Respected Contributor

Re: RDMS_MONITOR abnormal terminated

Hi RomanceSu,

There are basically 3 places we look for clues in these situations:

+ The Rdb monitor log files (confirm that we are looking at the right ones)
+ Rdb monitor bugcheck files (could be in various locations)
+ The VMS accounting files.

The name and location of the RDMS monitor log files may be different, depending on the version of Rdb and whether Rdb was installed with the "multiversion" option or if the monitor log file was redirected when the monitor was started.

First, let's confirm that we are looking at the right RDMS_MONITOR log files:

$ SET PROCESS/PRIV=WORLD
$ RMU/SHOW SYSTEM
Oracle Rdb V7.1-401 on node MYNODE 2-JUN-2009 08:40:29.62
- monitor started 19-APR-2009 01:42:57.55 (uptime 44 06:57:32)
- monitor log filename is "SYS$SYSROOT:[SYSEXE]RDMMON71.LOG;51"
- no databases are accessed by this node

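In this case, the CURRENT monitor log file is SYS$SYSROOT:[SYSEXE]RDMMON71.LOG;51. You would also want to check the prior versions (SYS$SYSROOT:[SYSEXE]RDMMON71.LOG;-1). Please post the "tail" of this log file, substituting the log file name that RMU/SHOW SYSTEM reports on your system for monitor_log_file:

$ type/tail/out=rdms_mon_tail.log monitor_log_file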

If the monitor process failed, it may have created a bugcheck file. These will be named RDM*BUG*.DMP and will be located in one of the following locations:

$ DIR/DATE SYS$SYSTEM:RDM*BUG*.DMP
$ DIR/DATE RDM$BUGCHECK_DIR:RDM*BUG*.DMP
$ DIR/DATE SYS$MANAGER:RDM*BUG*.DMP

If the monitor was running under some username other than SYSTEM, you will want to also check the login directory of that username.

Finally, as others have suggested, the VMS accounting file may include information about the process completion status. Unfortunately, unless you know the PID of the monitor process that FAILED, it may be difficult to identify it. However, the following should narrow it down:

First, confirm that you have DETACHED accounting enabled:

$ show accounting
Accounting is currently enabled to log the following activities:

    PROCESS          any process termination
    INTERACTIVE      interactive job termination
    LOGIN_FAILURE    login failures
    SUBPROCESS       subprocess termination
    DETACHED         detached job termination   **** this needs to be enabled ****
    BATCH            batch job termination
    NETWORK          network job termination
    PRINT            all print jobs
    MESSAGE          user messages

If DETACHED is enabled, then you can issue the following command:

$ ACCOUNTING/FULL/USER=SYSTEM/PROCESS=DETACHED/SINCE=date-time/OUT=RDM_MON_ACC.LOG

$ SEARCH RDM_MON_ACC.LOG "Final status","start time:","finish time:"

(The "start time" and "finish time" strings are included as a way to sanity-check the information: do the start and finish times match when you think the Rdb monitor started and stopped?)
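
Once a final status code has been found, it can help to split it into its parts. The following is a minimal Python sketch, assuming the standard OpenVMS condition value layout (bits 0-2 severity, bits 3-15 message number, bits 16-27 facility number, bit 28 "inhibit message"); the function name `decode_status` is made up for the example, and the sketch does not translate the code into message text.

```python
# Split a 32-bit OpenVMS condition value into its standard fields.
# Assumed layout: bits 0-2 severity, bits 3-15 message number,
# bits 16-27 facility number, bit 28 inhibit-message flag.

SEVERITY = {0: "warning", 1: "success", 2: "error", 3: "informational", 4: "severe"}


def decode_status(hex_text):
    """Decode a hexadecimal status such as '10000001' or '%X00D38064'."""
    text = hex_text.strip().upper()
    if text.startswith("%X"):
        text = text[2:]  # drop the DCL hexadecimal radix prefix
    value = int(text, 16)
    severity = value & 0x7
    return {
        "severity": SEVERITY.get(severity, "reserved"),
        "success": bool(value & 1),          # bit 0 set: success or informational
        "message_number": (value >> 3) & 0x1FFF,
        "facility": (value >> 16) & 0xFFF,
        "inhibit_message": bool(value & 0x10000000),
    }


if __name__ == "__main__":
    for code in ("10000001", "00D38064"):
        print(code, decode_status(code))
```

For two values that appear in the accounting excerpt earlier in this thread, 10000001 decodes as a success status with the inhibit-message bit set, while 00D38064 decodes as a severe error from facility 211. A monitor that died abnormally would be expected to show a status whose severity is not "success".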

If DETACHED accounting is NOT enabled, then you will need to enable it and wait for the next failure:
$ set accounting/enable=detached

For what it is worth, we have also seen the RDB monitor crash while encountering deadlocks with DBR (recovery processes).

Best Regards,

Brad McCusker
Software Concepts International
www.sciinc.com