Query about HP Cluster switching

Abdullah Siddiquey · ‎09-23-2010

Dear,

I have a query about HP cluster switching.

I have a two-node cluster sharing a single disk array. Server model: rx7640.

Each server has two FC connections to the disk array.
If I unplug one FC connection from the active server, everything keeps working.
But if I unplug both FC connections from the active server, the cluster does not switch to the standby node, and all applications and the database hang.

Is this normal, or should the cluster switch?

melvyn burnard · ‎09-23-2010

Serviceguard does NOT monitor or "manage" storage links by default.
You could use EMS monitors or, new with SG A.11.20, the built-in LVM Monitor tool.

So what you are seeing is the correct behaviour when no form of monitoring has been configured.

My house is the bank's, my money the wife's, But my opinions belong to me, not HP!

Abdullah Siddiquey · ‎09-23-2010

But my cluster monitors the database, and the database is installed on the disk array. The database showed as offline when the disk array was disconnected, but the cluster still did not switch.
In the cluster log, I didn't find any information about the database or the disk array.

melvyn burnard · ‎09-23-2010

When you say "my cluster monitor database and database is installed in disk array. and database was showing off-line", are you using the Oracle Toolkit from the Enterprise Cluster Master Toolkit?
If so, that just checks that the database processes are running; it does NOT check for a hang unless you have the new Serviceguard 11.20 with ECMT v6.0.


Abdullah Siddiquey · ‎09-23-2010

I am using an Informix 11.50 database.

What actually happens when the cluster switches?
If the server cannot find the physical disks (since FC is disconnected), will it be able to deactivate the VG?

If SG fails to deactivate the VG on the active node, will it still be able to switch the cluster to the standby node?

Rita C Workman · ‎09-23-2010

I don't know Informix, we're Oracle, so I'll have to respond thinking that way.

Scenario:
rx7640 with O/S on internal disk.
2 fiber connections to the array, where the Informix database is housed within a package.
Both fiber connections die....

Since your O/S is on internal disk, your server will remain up and in the cluster.
Since your package just lost all its connectivity to disk, your package will fail. Since you went down hard, your database went down hard.

The volume group was set to exclusive at the time of failure, and MC/SG is controlling it [ remember vgchange -c y and vgchange -a e ]. So don't worry about the volume group not having been deactivated (vgchange -a n).
The package, if it was 'enabled', should start up on its failover node.
Check cmviewcl and see where the package is and whether it was enabled.
Or you could edit the package.cntl file and REM out the startup of the database first. Save the file. Then manually run the command to start the pkg on the failover node, allowing just the mountpoints to come up. Then have your DBAs do some kind of database recovery and start it manually.

Just a couple of thoughts. Hopefully some MC/SG gurus will pipe in.

Rita

Serviceguard for Linux · ‎09-23-2010

Melvyn is an SG guru.

Serviceguard only forces a failover when a monitor within the package detects a failure. In the simplest case, the monitor is checking to see if the process is still running. So, if disconnecting shared storage does not cause the process to fail (disappear from the ps list), then the package does not see that as a failure.
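
To illustrate why a hang goes unnoticed, here is a minimal Python sketch of a process-existence check of the kind described above. It is not Serviceguard's own code, and the function name is made up for illustration. A process that is blocked on I/O to the lost array still exists, so a check like this keeps reporting success.

```python
import os

def process_exists(pid):
    """Return True if a process with this PID exists, even if it is hung."""
    try:
        # Signal 0 only performs the existence/permission check; nothing is delivered.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process: the only case this kind of check catches
    except PermissionError:
        return True   # the process exists but belongs to another user
    return True
```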

The disk monitor that Melvyn referenced is set up as an additional monitor in the package. That way, if access to shared storage is lost, then that is detected as a failure by the package.
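
As a rough idea of what such an additional monitor could do (this is a hand-written sketch, not the EMS or SG 11.20 LVM monitor), the Python below tries to read a small amount of data from a path on the shared storage and gives up after a timeout. The function name, the path you pass in, and the timeout value are all assumptions. The read runs in a child process so that a read stuck on a dead FC path cannot hang the monitor itself.

```python
import subprocess
import sys

# Tiny helper program run in the child: read up to 1 KiB from the given path.
_READ_SNIPPET = "import sys; open(sys.argv[1], 'rb').read(1024)"

def path_is_readable(path, timeout_s=10.0):
    """Return True if a small read from `path` completes within timeout_s seconds."""
    child = subprocess.Popen(
        [sys.executable, "-c", _READ_SNIPPET, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        # Exit status 0 means the read succeeded; anything else is a failure.
        return child.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # Best-effort kill; do not wait again, the child may be stuck in I/O.
        child.kill()
        return False
```

A package service could call this in a loop and exit as soon as it returns False, so that the package sees the lost storage as a service failure.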

The Oracle toolkit has more "advanced" monitors than some others.

You can add a monitor that checks for the database "offline" and use that to cause a failover.

Thomas J. Harrold · ‎11-15-2010

To detect a database issue, I've found that it's best to write a very simple SQL query that returns an expected result. You can easily create an additional cluster service that will actually check the health of your database...as opposed to just process checking.
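
A minimal Python sketch of that idea follows. The function name, the client command, the query, and the expected text are all placeholders you would supply yourself; for Informix the client might be something like dbaccess reading SQL from standard input, but that is an assumption to verify on your system. The timeout matters here: a hung database should count as a failure, not block the check forever.

```python
import subprocess

def database_is_healthy(client_cmd, sql, expected, timeout_s=15.0):
    """Feed `sql` to a SQL client on stdin; healthy means `expected` appears in its output in time."""
    try:
        child = subprocess.Popen(
            client_cmd,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            text=True,
        )
    except OSError:
        return False  # the client could not even be started
    try:
        out, _ = child.communicate(sql, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Treat a hang as a failure; kill best-effort and do not wait again.
        child.kill()
        return False
    return child.returncode == 0 and expected in out
```

Run as an extra package service, a loop around this check would exit when it returns False, which is what lets the cluster react to an offline or hung database and not just a missing process.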

I learn something new everyday. (usually because I break something new everyday)

Stephen Doud · ‎11-16-2010

As Melvyn stated, Serviceguard does not have a standard detection mechanism for defunct Fibre Channel paths to disks. So in fact, Serviceguard is blind to such failures unless, as he stated, you create an external monitor or use the SG 11.20 LVM monitor to detect such problems and respond when a failure event is detected.
