
MC/SG OPS with NFS and CIFS

 
Van Poelvoorde
Occasional Advisor


At a customer site we installed a two-node OPS cluster. We use MC/SG OPS Edition version 11.09.
One of the servers runs a CIFS package and an NFS package. When the NFS package is switched to the other node, both nodes in the cluster perform a TOC (panic). Does anyone have an idea what the problem could be?

Thank you very much
Printaporn_1
Esteemed Contributor

Re: MC/SG OPS with NFS and CIFS

Hi,

Have you looked at the ServiceGuard NFS Toolkit?
enjoy any little thing in my life
Carsten Krege
Honored Contributor

Re: MC/SG OPS with NFS and CIFS

There are a number of possibilities. You have to check the syslog of both nodes and the package logs to understand the problem.

Since the TOC occurred during a package switch, it is likely that an error occurred while halting or starting the package (e.g. the start or halt took longer than RUN_SCRIPT_TIMEOUT/HALT_SCRIPT_TIMEOUT). Such an error leads to a TOC if NODE_FAIL_FAST is enabled for the package (check the package configuration file).
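To make the configuration check above concrete, here is a minimal Python sketch that reads the text of a package configuration file and lists the settings that could turn a slow start or halt into a TOC. It assumes the file uses plain `PARAMETER value` lines with `#` comments, and that the fail-fast parameter is spelled `NODE_FAIL_FAST_ENABLED` in the file; both are assumptions to verify against your own package file. The function names are made up for illustration.

```python
def parse_pkg_conf(text):
    """Return {PARAMETER: value} from 'NAME value' lines, skipping # comments.

    Repeated parameters (e.g. several NODE_NAME lines) keep only the last
    value, which is enough for the single-valued settings checked below.
    """
    conf = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            conf[parts[0].upper()] = parts[1].strip()
    return conf


def toc_risks(conf):
    """List the settings that could make a package script problem TOC the node."""
    notes = []
    # Assumed spelling of the fail-fast parameter; missing is treated as NO.
    if conf.get("NODE_FAIL_FAST_ENABLED", "NO").upper() == "YES":
        notes.append("NODE_FAIL_FAST_ENABLED is YES")
    for key in ("RUN_SCRIPT_TIMEOUT", "HALT_SCRIPT_TIMEOUT"):
        value = conf.get(key, "NO_TIMEOUT")
        if value.upper() != "NO_TIMEOUT":
            notes.append(f"{key} is {value}: a longer-running script counts as failed")
    return notes
```

Read the package file into a string (for example with `open(path).read()`), pass it through `parse_pkg_conf`, and print what `toc_risks` returns. An empty list only means these three settings look harmless, not that the package is healthy.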

You could also be dealing with a kernel hang that starved cmcld, so that it could not update the safety timer, which TOCs the box when it expires. In this case you should make sure that you have NODE_TIMEOUT set to 5-8 seconds (recommended value). Kernel patches will be needed to solve kernel hang problems (check with the HP CRC).
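A small sketch of the NODE_TIMEOUT check, in Python. It assumes the cluster configuration file stores NODE_TIMEOUT in microseconds on a line of the form `NODE_TIMEOUT 2000000`; please confirm the unit against the comments in your own cluster ASCII file before relying on it. The function names are hypothetical.

```python
MICROSECONDS_PER_SECOND = 1_000_000  # assumed unit of NODE_TIMEOUT in the file


def node_timeout_seconds(cluster_conf_text):
    """Return NODE_TIMEOUT in seconds, or None if the parameter is not found."""
    for raw in cluster_conf_text.splitlines():
        fields = raw.split("#", 1)[0].split()
        if len(fields) == 2 and fields[0] == "NODE_TIMEOUT":
            return int(fields[1]) / MICROSECONDS_PER_SECOND
    return None


def in_recommended_range(seconds, low=5.0, high=8.0):
    """True if the value lies in the 5-8 second range suggested in this thread."""
    return seconds is not None and low <= seconds <= high
```

If `in_recommended_range` comes back false, that alone does not explain a TOC, but a short timeout leaves cmcld less room to ride out a brief kernel stall.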

Network problems that prevent heartbeat messages from getting through are also possible.

When you have the TOC dump, the syslog and package logs, your local HP CRC will probably be able to figure out the cause. If you don't have them, you should try to reproduce the problem, to make sure that it can be fixed.

Carsten
-------------------------------------------------------------------------------------------------
In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move. -- HhGttG
Stephen Doud
Honored Contributor

Re: MC/SG OPS with NFS and CIFS

Carsten has some excellent ideas.

To take what he said one step further, make it a habit to inspect the available logs for yourself. For instance, check the OLDsyslog.log file on both servers to see what it reveals about the time before the TOC.
Was there a race to the cluster lock disk? Did they obtain the lock? If not, then perhaps the cluster lock disk was not available or the cluster lock structure was missing.
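As a starting point for that log review, the following Python sketch pulls the cluster-related lines out of a syslog file so the last messages before the TOC are easy to read. It assumes the file is at the usual HP-UX location `/var/adm/syslog/OLDsyslog.log`; the keyword list (`cmcld`, `cluster lock`) is only a guess at useful search terms, not a list of actual message texts, so adjust it to what your logs contain.

```python
def cluster_messages(lines, keywords=("cmcld", "cluster lock"), last=50):
    """Return up to `last` trailing lines that mention any keyword.

    Matching is case-insensitive; the default keywords are assumptions.
    A non-positive `last` returns every matching line.
    """
    wanted = [k.lower() for k in keywords]
    hits = [ln.rstrip("\n") for ln in lines
            if any(k in ln.lower() for k in wanted)]
    return hits[-last:] if last > 0 else hits


# Example use on one node (run it on both and compare the timestamps):
#
#   with open("/var/adm/syslog/OLDsyslog.log", errors="replace") as fh:
#       for msg in cluster_messages(fh):
#           print(msg)
```

Running it on both nodes and lining up the timestamps should show which node stopped hearing from the other first, and whether either one tried for the cluster lock.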

Also, review the package control log file on the server where the package was halted first. Did the package halt properly, or was there some resource conflict that prevented the package from halting properly?

The response center encourages users to be self-sufficient wherever possible - it makes them happier with our products :)