cluster testing

 
kevin poole_1
Frequent Advisor

I am new to clusters and Unix. Is there a script to power-cycle the box and test hardware components for failure before distribution to the consumer?
Enrico P.
Honored Contributor

Re: cluster testing

Hi,
There isn't a script that helps you simulate it. You can simulate node/LAN/FC failover manually and test the cluster's redundancy behaviour.

Primary node failover: run reboot -q on the primary node. Check that the packages switch to the adoptive node (with the cmviewcl command and the syslog.log file), that the rebooted node rejoins the cluster, and that the applications are still working.

Repeat this test from the adoptive node back to the primary after re-enabling package switching.

Primary LAN failover: unplug the primary LAN cable. Check in syslog.log and with the netstat command that traffic switches to the standby LAN, and that it returns to the primary when you plug the cable back in.

Primary disk link failover (if you have an alternate link): unplug the primary Fibre Channel link. Check in syslog.log that I/O switches to the alternate FC link and returns to the primary link when you plug it back in.


Before the tests, check the cluster configuration:

- Package switching/RUN enabled
- LAN configuration (primary and standby)
- Disk configuration (alternate link configuration in the VG)
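
Since the tests have to be run by hand, it can help to print them as a checklist and tick items off while you work. Below is a minimal Python sketch that only restates the steps above as data; it does not touch the cluster or run any Serviceguard command. The names FailoverTest, PRECHECKS, TESTS and render_plan are made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailoverTest:
    """One manual failover test: what to do, then what to verify."""
    name: str
    action: str
    checks: tuple


# Items to confirm before starting any test (taken from the list above).
PRECHECKS = (
    "Package switching is enabled",
    "Primary and standby LAN interfaces are configured",
    "Alternate disk links are configured in the volume group",
)

# The three manual tests described above, in the order given.
TESTS = (
    FailoverTest(
        name="Primary node failover",
        action="Run reboot -q on the primary node",
        checks=(
            "cmviewcl shows the packages on the adoptive node",
            "syslog.log records the package switch",
            "The rebooted node rejoins the cluster",
            "The applications are still working",
        ),
    ),
    FailoverTest(
        name="Primary LAN failover",
        action="Unplug the primary LAN cable",
        checks=(
            "syslog.log and netstat show the switch to the standby LAN",
            "Traffic returns to the primary after the cable is plugged back in",
        ),
    ),
    FailoverTest(
        name="Primary disk link failover",
        action="Unplug the primary Fibre Channel link",
        checks=(
            "syslog.log shows the switch to the alternate FC link",
            "I/O returns to the primary link after it is plugged back in",
        ),
    ),
)


def render_plan(prechecks=PRECHECKS, tests=TESTS):
    """Return the whole plan as printable text with one tick box per check."""
    lines = ["Pre-checks:"]
    lines += [f"  [ ] {item}" for item in prechecks]
    for number, test in enumerate(tests, start=1):
        lines.append(f"Test {number}: {test.name}")
        lines.append(f"  Action: {test.action}")
        lines += [f"  [ ] {check}" for check in test.checks]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_plan())
```

Run it once per test window and keep the printout with your test records; add the reverse (adoptive to primary) run as a fourth entry if you want it tracked separately.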


Enrico
Stephen Doud
Honored Contributor

Re: cluster testing

There is no automation script that tests Serviceguard clusters.
Forcing a server to power down must be done via the Service Management Processor or by a manual power-down.
Other testing must also be performed manually.

Serviceguard does not normally monitor and react to disk failures. An EMS monitor for /vg/vg01/pv_summary (choose your VG) must be created and linked to a package RESOURCE in the configuration file if this extra dependency is desired.
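
To show roughly what that link looks like, here is a small Python sketch that prints a resource stanza for a legacy package configuration file. It is an assumption-laden illustration: the parameter names (RESOURCE_NAME, RESOURCE_POLLING_INTERVAL, RESOURCE_START, RESOURCE_UP_VALUE) are written as I recall them from the legacy package template, and the 60-second interval is just an example value, so check everything against the template your Serviceguard version generates. The function name ems_resource_stanza is made up.

```python
def ems_resource_stanza(volume_group, polling_interval=60):
    """Build a package RESOURCE stanza for the pv_summary EMS resource.

    Assumption: parameter names follow the legacy Serviceguard package
    template; verify them against your own generated template.
    """
    if not volume_group or "/" in volume_group:
        raise ValueError("volume_group must be a bare name such as 'vg01'")
    if polling_interval <= 0:
        raise ValueError("polling_interval must be a positive number of seconds")

    return "\n".join([
        f"RESOURCE_NAME               /vg/{volume_group}/pv_summary",
        f"RESOURCE_POLLING_INTERVAL   {polling_interval}",
        "RESOURCE_START              AUTOMATIC",
        "RESOURCE_UP_VALUE           = UP",
    ])


if __name__ == "__main__":
    print(ems_resource_stanza("vg01"))
```

The output is only text to review and paste; the package configuration still has to be checked and applied with the usual Serviceguard tools.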
A. Clay Stephenson
Acclaimed Contributor

Re: cluster testing

You aren't going to like this, but the best technique is to literally yank the power cord(s). If this makes you nervous, then you haven't done your job well enough (and haven't made your boxes robust enough). Graceful shutdowns are not what your cluster is going to see in the real world, so your tests should be just as brutal.
If it ain't broke, I can fix that.
Torsten.
Acclaimed Contributor

Re: cluster testing

Kevin,

You can first test LAN and FC failover by pulling the active cables, but in the end your active node has to die. A good option is to go to the GSP/MP prompt and perform a reset (RS command).

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

Thomas J. Harrold
Trusted Contributor

Re: cluster testing

I agree that testing should resemble real-world failures, but now that most servers have GSP/iLO/LAN Console, you can do a power-cycle without having to touch the server.

One thing not to forget: pull ALL LAN connections, so that NO heartbeat packets can pass between the nodes. Make sure that your quorum device (either a disk or a server) works, and that the cluster re-forms on one node while the other node TOCs.
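
To make the expected result of that test concrete, here is a minimal Python sketch of the quorum rule as I understand it: a partition holding more than half of the nodes re-forms the cluster, a partition holding exactly half re-forms only if it wins the quorum device, and anything smaller halts. This is a simplified model for reasoning about the test, not Serviceguard code, and the name partition_outcome is made up.

```python
def partition_outcome(nodes_in_partition, cluster_size, holds_tie_breaker=False):
    """Return 'reform' or 'halt' for one side of a split cluster.

    Simplified model (an assumption, not vendor code):
      more than half of the nodes  -> re-forms the cluster
      exactly half of the nodes    -> re-forms only with the quorum device
      fewer than half of the nodes -> halts (the node TOCs)
    """
    if not 0 < nodes_in_partition <= cluster_size:
        raise ValueError("partition must contain between 1 and cluster_size nodes")

    doubled = 2 * nodes_in_partition  # integer comparison, no rounding issues
    if doubled > cluster_size:
        return "reform"
    if doubled == cluster_size and holds_tie_breaker:
        return "reform"
    return "halt"


if __name__ == "__main__":
    # Two-node cluster with every LAN cable pulled: each node is its own partition.
    print("node that wins the quorum device:", partition_outcome(1, 2, True))
    print("node that loses it:", partition_outcome(1, 2, False))
```

In a two-node cluster both sides hold exactly half, which is why the test only passes when the quorum device is reachable and working: one node should end up running the cluster and the other should TOC.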

-tjh
I learn something new everyday. (usually because I break something new everyday)