Operating System - OpenVMS

monitor cluster utility problem

 
yogeswaran
New Member

monitor cluster utility problem

hi,

When I run the MONITOR CLUSTER utility, the system gives an error message like "Floating point divide by zero" along with some integer values. When I run MONITOR SYSTEM, it works without any problem. I am looking for a solution.

The system details are:
O/S: 7.3-2
4 Alpha servers in a cluster (Memory Channel)
13 REPLIES
Volker Halle
Honored Contributor

Re: monitor cluster utility problem

yogeswaran,

welcome to the OpenVMS ITRC forum.

Are the 4 Alpha servers in this cluster all running OpenVMS Alpha V7.3-2 ? Do they share a common system disk ?

There seem to be no patches for MONITOR on OpenVMS Alpha V7.3-2, so this may be a more generic problem or specific to your configuration or system load.

Could you provide the exact error message from the screen (screen capture or terminal emulator session log) and attach the file to your next reply ?

Are these errors triggered by a specific system inside the cluster ? You could try MONITOR CLUSTER/NODE=(node1) and so on: node2, node3, node4. Does the error happen for all nodes ? And if you run MONITOR CLUSTER/NODE=(nodex) on another node ? Same error ?

Sorry, just questions, no answers yet ;-)

Volker.
yogeswaran
New Member

Re: monitor cluster utility problem

hi volker,

Thanks for your reply...
All four servers are connected in the cluster. This error message is displayed on all four nodes (> MONI CLUS).

A snapshot of the error message is given below:

%MONITOR-E-UNEXPERR, unexpected error
-SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=00000000005A0014, PC=000000000006798C, PS=0000001B

I am looking forward to your reply.

Wim Van den Wyngaert
Honored Contributor

Re: monitor cluster utility problem

Where is the "divide by zero" ?

Wim
Wim Van den Wyngaert
Honored Contributor

Re: monitor cluster utility problem

Maybe TCP would work better (if it does, the problem is mc related).

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1028490

Wim
Volker Halle
Honored Contributor

Re: monitor cluster utility problem

yogeswaran,

an unexpected error is happening in MONITOR. In this case an Access Violation: the instruction at PC=6798C tried to access virtual address 5A0014 and failed to read that address.
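
If you want to compare the failing PC and VA across several captures or nodes, pulling the values out of the message text by script can help. The following is only a rough Python sketch, not part of MONITOR or any VMS tool: the function name `parse_accvio` is made up for illustration, and it assumes the message keeps the "virtual address=..., PC=..., PS=..." layout shown above, even when the terminal has wrapped the line in the middle of a hex value.

```python
import re

# Hypothetical helper for illustration only; not part of any OpenVMS utility.
# Whitespace is allowed inside the hex fields because terminal captures
# may wrap the message in the middle of a value.
ACCVIO_RE = re.compile(
    r"virtual\s+address=([0-9A-Fa-f\s]+),\s*PC=([0-9A-Fa-f\s]+),\s*PS=([0-9A-Fa-f]+)"
)


def parse_accvio(text):
    """Return the VA, PC and PS of an ACCVIO message as integers, or None."""
    match = ACCVIO_RE.search(text)
    if match is None:
        return None
    # Drop any embedded line breaks or spaces before converting from hex.
    va, pc, ps = (int("".join(group.split()), 16) for group in match.groups())
    return {"va": va, "pc": pc, "ps": ps}


if __name__ == "__main__":
    captured = (
        "-SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual\n"
        "address=00000000005A\n"
        "0014, PC=000000000006798C, PS=0000001B\n"
    )
    info = parse_accvio(captured)
    print("VA=%X PC=%X" % (info["va"], info["pc"]))
```

Running the same function over the output from each node makes it easy to see whether the PC and VA are identical everywhere.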

Does exactly the SAME error (same PC, same VA) happen on all 4 nodes in your cluster ? Do the nodes share a common system disk ?

You could create a process dump file with:

$ SET PROC/DUMP
$ MONITOR CLUSTER
$ DIR MONITOR.DMP

If the process dump file has been created, try this:

$ ANAL/CRASH monitor.dmp
SDA> EXA/INS value-of-pc-shown-as-failing-PC
SDA> SHOW PROC/PAGE value-of-failing-VA
SDA> SHOW PROC/IMAGE
SDA> EXIT

To determine whether this problem is caused by a local problem on a node or by a problem on a remote node, please also try this:

node1 $ MONI CLUSTER/NODE=node1

and so on for all nodes (use the local node name in each case) - what happens ?
If it doesn't fail when collecting data from the local node, try:

node1 $ MONITOR CLUSTER/NODE=node2

This is an unusual problem for MONITOR, so please be patient and try to follow the troubleshooting steps exactly. Please also answer all the questions.

Volker.
Volker Halle
Honored Contributor

Re: monitor cluster utility problem

On OpenVMS Alpha V7.3-2, the image name for MONITOR is MONITOR_TV.EXE, so you'll get a MONITOR_TV.DMP dump file.

Volker.
yogeswaran
New Member

Re: monitor cluster utility problem

hi volker,

Thanks for your reply...
The problem is intermittent; we can't predict when it will occur. It goes away when we reboot the server.

All four servers have separate system disks (there is no common system disk). Since the reboot, it has been working without any error message, so I have to wait for the next occurrence.
Once it happens, I will try to analyze it with a process dump.

When we issue the MONITOR command on another node, it gives error messages like:

%MONITOR-I-ESTABCON, establishing connection to remote node(s)...
%VPM-W-NOCONNECT, Unable to connect to remote node xxx1(node 1)
-MONITOR-W-NODEINIERR, error during node initialization
%MONITOR-I-CONT, continuing....
%VPM-W-NOCONNECT, Unable to connect to remote node xxx2(node 2)
-MONITOR-W-NODEINIERR, error during node initialization


The following is the error message I received from another node:

%MONITOR-I-ESTABCON, establishing connection to remote node(s)...
%MONITOR-E-UNEXPERR, unexpected error
-SYSTEM-F-IVCHAN, invalid I/O channel

Kindly have a look and give feedback...

regards
yogesh





Volker Halle
Honored Contributor

Re: monitor cluster utility problem

yogesh,

before we further analyze that UNEXPECTED error, we first need to make MONITOR CLUSTER or MONITOR SYSTEM/NODE=xxx work at all in your cluster. MONITOR CLUSTER will try to establish DECnet connections to all nodes in the cluster using their SCS nodenames. All the node names must be defined in the DECnet databases across all nodes in the cluster.

One of your problems is described here:

[OpenVMS] MONITOR CLUSTER fails with SYSTEM-F-IVCHAN
http://h18000.www1.hp.com/support/asktima/operating_systems/009DFF51-4991D1A0-1C0062.html

If MONITOR fails with %VPM-W-NOCONNECT, there is a generic problem establishing a DECnet (or TCPIP) connection to the other node. MONITOR CLUSTER will connect to ALL other nodes in the cluster, so you may see multiple errors.

To test MONITOR connectivity to another node in the cluster, just use MONITOR SYSTEM/NODE=xxx in a step-by-step procedure, until it works to all other nodes in the cluster. Then MONITOR CLUSTER should also be able to connect to all nodes in the cluster.

Which version of DECnet are you running: DECnet Phase IV (NCP) or DECnet-Plus (NCL) ?

Try MONITOR SYSTEM/NODE=node2 on your first node. If that fails, try SET HOST node2. If that also fails, there is a problem in your DECnet config.

During those tests, issue a REPLY/ENABLE on one of your terminal sessions, so you get any OPCOM messages displayed.
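
With four nodes, the step-by-step test amounts to 16 checks, so it may be worth writing the list down before starting. Below is a minimal Python sketch that only prints such a checklist; it does not run anything on VMS. The node names NODE1 to NODE4 and the function name `connectivity_test_plan` are placeholders for illustration, and putting the local-node checks first is an assumption based on the earlier advice in this thread to test the local node before remote ones.

```python
def connectivity_test_plan(nodes):
    """Build an ordered checklist of (run_on, command, fallback) steps.

    The fallback is the command to try if the MONITOR command fails.
    """
    plan = []
    # Local checks first: each node monitors itself.
    for node in nodes:
        plan.append((node, "MONITOR SYSTEM/NODE=%s" % node, None))
    # Then every node against every other node, with SET HOST as the
    # fallback check of the DECnet configuration.
    for source in nodes:
        for target in nodes:
            if source != target:
                plan.append(
                    (source, "MONITOR SYSTEM/NODE=%s" % target, "SET HOST %s" % target)
                )
    return plan


if __name__ == "__main__":
    # Placeholder node names; substitute the real SCS node names.
    for run_on, command, fallback in connectivity_test_plan(
        ["NODE1", "NODE2", "NODE3", "NODE4"]
    ):
        line = "on %s: $ %s" % (run_on, command)
        if fallback:
            line += "   (if it fails: $ %s)" % fallback
        print(line)
```

Ticking off each line as it succeeds shows quickly which node pair, if any, cannot establish a connection.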

If you have different system disks, do you have a common SYSUAF ?

Volker.
Wim Van den Wyngaert
Honored Contributor

Re: monitor cluster utility problem

That's why I proposed to switch to tcp.

In addition to doing set host, do

mc ncl show nsp all
mc ncl show osi tra all
or
mc ncp show exec
mc ncp show exec char

to see the current and maximum number of connections.

There could be a resource problem.

Wim
yogeswaran
New Member

Re: monitor cluster utility problem

Hi volker,

Is there any limitation on running the MONITOR_SERVER process in a cluster?

If yes, what would be the solution for this problem? I could not follow what you are trying to say.

Hi Wim,

Thanks for your input.
I will try your suggestions as well; let's see the result.


regards
yogesh
Volker Halle
Honored Contributor

Re: monitor cluster utility problem

yogesh,

if you get the IVCHAN error, you should stop/id the MONITOR_SERVER process on the remote node, because it may have a problem.

The MONITOR_SERVER process can handle multiple incoming DECnet connections, if it works correctly. MONITOR xxx/NODE=nodex connects to nodex via a DECnet logical link to the VPM object (=session control application) on the remote node. If this fails, you'll get a %VPM-W-NOCONNECT error.
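
To keep the different failures apart, the errors seen so far in this thread can be matched to the next step suggested for each. The Python sketch below is only an illustration of that lookup: the table `NEXT_STEPS` and the function `suggest_next_steps` are made-up names, the table covers only the three errors discussed here, and the pattern assumes the usual %FACILITY-S-IDENT message layout.

```python
import re

# Message idents seen in this thread, mapped to the next step suggested
# for each. This table is illustrative and far from complete.
NEXT_STEPS = {
    "IVCHAN": "STOP/ID the MONITOR_SERVER process on the remote node",
    "NOCONNECT": "check the DECnet link to the VPM object (try SET HOST to that node)",
    "ACCVIO": "capture a process dump (SET PROC/DUMP) and examine it with ANAL/CRASH",
}

# Matches lines such as "%MONITOR-E-UNEXPERR, ..." or "-SYSTEM-F-IVCHAN, ...".
MESSAGE_RE = re.compile(r"^[%-]([A-Z0-9$_]+)-([SIWEF])-([A-Z0-9$_]+),", re.MULTILINE)


def suggest_next_steps(output):
    """Return the suggested steps for every known ident in the output, in order."""
    steps = []
    for _facility, _severity, ident in MESSAGE_RE.findall(output):
        step = NEXT_STEPS.get(ident)
        if step and step not in steps:
            steps.append(step)
    return steps


if __name__ == "__main__":
    sample = (
        "%MONITOR-I-ESTABCON, establishing connection to remote node(s)...\n"
        "%MONITOR-E-UNEXPERR, unexpected error\n"
        "-SYSTEM-F-IVCHAN, invalid I/O channel\n"
    )
    for step in suggest_next_steps(sample):
        print(step)
```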

What does MC NCP SHOW EXE return on your nodes ?

Volker.
yogeswaran
New Member

Re: monitor cluster utility problem

hi,

When I execute the SHOW EXE command in MC NCP,
the following output is shown:

NCP>show exe

Node Volatile Summary as of 20-JUL-2006 12:32:51

Executor node = xx.xx(node name)

State = on
Identification = DECnet-OSI for OpenVMS

The node ID and node name differ on each node.



regards
yogesh
Volker Halle
Honored Contributor

Re: monitor cluster utility problem

Yogesh,

so you're running DECnet-OSI (/Plus).

You can check the VPM session control application on all of your nodes with:

$ mc ncl sho sess con appl vpm all

It should exist and point to the image SYS$SYSTEM:VPM.EXE and a User Name of VPM$SERVER. This user must exist in the SYSUAF.

node1 $ MONITOR CLUSTER/NODE=node1

will activate the MONITOR_SERVER process on the local node (using the default DECnet communication in MONITOR), so this should not give any errors. The VPM$SERVER user should have its default directory in SYS$SYSROOT:[VPM$SERVER] and there should be NET$SERVER.LOG files - any errors in them ?

Once this works on all your nodes, the next step would be to run MONITOR CLUSTER/NODE=(nodex). Before you do this, try SET HOST nodex to see if the node name is correctly defined.

Volker.