Operating System - HP-UX

Need help with Cluster Question

 
Sushil Singh_1
Advisor


Hi All,
I have node1 and node2 in a cluster environment. I have an application written in C and an Oracle database running on node2. Both nodes are supposed to be totally identical, system-wide. But for some reason, when we fail over to node1, our application fails completely and goes back to node2. Oracle is on a shared volume group, so we know that is not the problem, and our application works on node2, so we know the application is not the problem. The question is:

Is there any way to compare the OS or anything else to see what is different between these two machines? The hardware is totally identical.

Thanks
Sushil Singh
Sanjay_6
Honored Contributor

Re: Need help with Cluster Question

Hi Sushil,

I do not think there is any way to tell whether the OS is completely identical. The difference could be a simple kernel parameter. We can compare the software and the patches on the systems, but it is still possible to miss something. Aren't you getting any error messages in the cluster package log file that could help you identify the problem? Check the package log file at /etc/cmcluster/pkg_name/xyz.log.
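If the package log is long, a small filter can help find the first place things go wrong. This is only a sketch: the log format is not shown in this thread, so the keyword list below is an assumption, and the function name is made up for illustration. Copy the log off the node and feed its text in.

```python
import re

# Assumed keywords; adjust to whatever your package control script actually logs.
ERROR_PATTERN = re.compile(r"error|fail|cannot|denied", re.IGNORECASE)


def suspicious_lines(log_text):
    """Return (line_number, line) pairs for log lines matching the keyword pattern."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if ERROR_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Invented sample text, not real Serviceguard output.
    sample = (
        "Starting package\n"
        "ERROR: Function start_app\n"
        "mount: cannot open device\n"
        "Done\n"
    )
    for number, line in suspicious_lines(sample):
        print(number, line)
```

Running the same filter over the log from a good start on node2 and a bad start on node1 gives two short lists that are easier to compare than the full logs.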

Hope this helps.

Regds
A. Clay Stephenson
Acclaimed Contributor

Re: Need help with Cluster Question

About as close as you can get is to run swlist on both boxes and diff the output, BUT a well-written application should be able to tell you exactly what is wrong by logging the error.
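A plain diff of two listings can be noisy when the products come out in a different order. As a rough sketch, assuming you have saved the swlist output of each node to a text file and that each product line looks like "name revision description" (header lines starting with "#"), the comparison could be done by product name instead. The function names here are hypothetical.

```python
def parse_listing(text):
    """Map product name -> revision, skipping blank lines and '#' header lines."""
    products = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 2)
        if len(fields) >= 2:
            products[fields[0]] = fields[1]
    return products


def compare_listings(text_a, text_b):
    """Return (only_in_a, only_in_b, changed) for two saved listings."""
    a, b = parse_listing(text_a), parse_listing(text_b)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = sorted(
        (name, a[name], b[name]) for name in set(a) & set(b) if a[name] != b[name]
    )
    return only_a, only_b, changed


if __name__ == "__main__":
    # Invented sample data, not real swlist output.
    node1 = "# header\n  PKG_A  1.0  Alpha\n  PKG_B  2.0  Beta\n"
    node2 = "  PKG_A  1.1  Alpha\n  PKG_C  3.0  Gamma\n"
    print(compare_listings(node1, node2))
```

The same approach works for any "name value" listing, for example saved kernel parameter output, as long as the first two columns are the name and the value.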
If it ain't broke, I can fix that.
Stuart Abramson_2
Honored Contributor

Re: Need help with Cluster Question

As stated above, check the package control log for errors.

Check that your cluster is identical on both sides: same disk volumes, same LVs, same everything.

What is your normal configuration? Appl and Oracle both run on node 2? What runs on node 1? Then when you switch both Appl and Oracle DB to node 1, Appl fails back to node 2? Are the Application files on shared disks?
Carlos Fernandez Riera
Honored Contributor

Re: Need help with Cluster Question

How do you force the failure of node2? Is it up while you try to switch the package to node1? The FAILBACK_POLICY parameter can be set to MANUAL or AUTOMATIC. If failback is set to AUTOMATIC and node2 is up, the package will return to its primary node.

If you strongly suspect the software, I think the best solution is to take a make_recovery image of node2 and apply it to node1. As the hardware is the same, you will get an exact copy of the software, or more exactly a clone of the system, including the system name and IP addresses.
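To rule the failback setting in or out quickly, the package configuration file can be checked for it. The sketch below assumes the file uses "NAME VALUE" lines with "#" comments and that the parameter is spelled FAILBACK_POLICY; the helper names are made up, and the sample text is invented.

```python
def read_parameter(config_text, name):
    """Return the value of the first 'NAME VALUE' line matching name, or None."""
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0].upper() == name.upper():
            return fields[1]
    return None


def returns_to_primary(config_text):
    """True if the package is configured to fail back automatically."""
    value = read_parameter(config_text, "FAILBACK_POLICY")
    return (value or "").upper() == "AUTOMATIC"


if __name__ == "__main__":
    # Invented sample, not a real package configuration file.
    sample = (
        "# package settings\n"
        "PACKAGE_NAME      pkg1\n"
        "FAILBACK_POLICY   AUTOMATIC   # go home when the primary is back\n"
    )
    print(returns_to_primary(sample))
```

If this reports automatic failback and node2 was up during the test, the move back to node2 may be the cluster doing what it was told rather than the application failing on node1.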
unsupported
Rita C Workman
Honored Contributor

Re: Need help with Cluster Question

Follow the suggestions for comparing your systems....but...

To make sure my failover ability on a package is working the way it should, I like to get some system time and do manual stop & starts of packages on the different nodes.
If it won't come up right manually....it won't come up right when it automates the failover.

Usually when it doesn't come up right, there is something mis-configured and doing it this way I can 'clean up' anything that isn't exactly as it should be.
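For a manual test like this it helps to write the sequence down before the maintenance window. The sketch below only prints the commands, it does not run anything. It assumes the usual Serviceguard commands (cmviewcl, cmhaltpkg, cmrunpkg, cmmodpkg); the package and node names are placeholders, so check the options against the man pages on your release before using them.

```python
def manual_failover_commands(package, target_node):
    """Build the command sequence for a manual package move (dry run only)."""
    return [
        ["cmviewcl", "-v"],                        # see where the package runs now
        ["cmhaltpkg", package],                    # stop it on the current node
        ["cmrunpkg", "-n", target_node, package],  # start it on the other node
        ["cmmodpkg", "-e", package],               # re-enable package switching
        ["cmviewcl", "-v"],                        # confirm the result
    ]


if __name__ == "__main__":
    # "pkg1" and "node1" are placeholder names.
    for command in manual_failover_commands("pkg1", "node1"):
        print(" ".join(command))
```

Running the printed commands by hand, one at a time, and reading the package log after each step shows which step fails on node1.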

Just a thought,
Rita


At least that's what I like to do...

Re: Need help with Cluster Question

Actually there is a tool out there that does this kind of job. It's called the Cluster Consistency Monitor (CCmon):

http://h40045.www4.hp.com/data/ccmon-service-brief.pdf

It always seems to be sold as a service, but it is in fact a product that comes unlicensed with ServiceGuard Extension for SAP (although as far as I can tell you can run it in a non-SAP cluster).

I'm not sure if you can just buy the product on its own without the services (or if you even want to!). This is one of those products HP seem to do the best they can to keep under their hats.

HTH

Duncan

I am an HPE Employee