12-01-2005 02:59 AM
testing Serviceguard
For CPU, I can issue a shutdown command while the package is running on this server. But what about a "power down" test of the server? I don't think we should perform that test, because it is dangerous for the server, right?
For memory, what can we test?
Thanks,
Roger
12-01-2005 03:20 AM
Re: testing Serviceguard
You could also try to force a TOC by killing the cmcld process.
There is no real test you should do for the memory.
The OS should take care of that and either HPMC or panic the box (the same as a TOC, really).
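To make the "kill cmcld" test a little more concrete, here is a minimal Python sketch that only locates the cmcld PID in `ps -ef` output and prints the kill command instead of running it. The sample `ps` listing, the `/usr/lbin` paths in it, and the helper names `find_pids` and `live_ps_output` are illustrative assumptions, not output from a real cluster.

```python
import re
import subprocess

# The cumulative TIME column of `ps -ef` looks like "4:12"; STIME looks like
# "02:41:07" or "Nov 28", so neither form of STIME matches this pattern.
TIME_COLUMN = re.compile(r"\d+:\d{2}")

# Hypothetical listing, for illustration only.
SAMPLE_PS = """\
     UID   PID  PPID  C    STIME TTY       TIME COMMAND
    root  1842     1  0  Nov 28  ?         4:12 /usr/lbin/cmcld
    root  2011     1  0 02:41:07 ?         0:00 /usr/lbin/cmlvmd
   roger  2950  2899  0 02:58:30 pts/0     0:00 grep cmcld
"""


def find_pids(ps_output, process_name="cmcld"):
    """Return PIDs whose command basename equals process_name."""
    pids = []
    for line in ps_output.splitlines():
        tokens = line.split()
        if len(tokens) < 2 or not tokens[1].isdigit():
            continue  # header row or malformed line
        # The command is the token right after the TIME column.
        for i in range(2, len(tokens) - 1):
            if TIME_COLUMN.fullmatch(tokens[i]):
                command = tokens[i + 1]
                if command.rsplit("/", 1)[-1] == process_name:
                    pids.append(int(tokens[1]))
                break
    return pids


def live_ps_output():
    """Run `ps -ef` on the local host and return its text output."""
    return subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    ).stdout


if __name__ == "__main__":
    # Dry run against the sample listing: print the command, do not execute it.
    for pid in find_pids(SAMPLE_PS):
        print(f"would run: kill -9 {pid}")
```

Printing rather than killing is deliberate: the command takes the node down hard, so it is something to run by hand, on a test node, after checking the PID.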
12-01-2005 03:44 AM
Re: testing Serviceguard
Yanking the power cord is not intended to be a replacement for the shutdown command, but in almost all cases the system will reboot almost normally. Bear in mind this is exactly the kind of event that can happen in real life. It is also a good test of how well applications like databases survive a crash.
If memory problems aren't severe enough to panic the machine then they shouldn't be a problem; if they are severe enough to induce a panic then the box will TOC and you are essentially back to yanking the power cord. There is really no way to test for bad memory.
12-01-2005 05:09 AM
Re: testing Serviceguard
12-01-2005 05:23 AM
Re: testing Serviceguard
In HP education classes, failure is simulated by cutting off the heartbeat LAN cable, i.e., unplugging it.
You induce split-brain syndrome and make sure one of the two systems does a TOC, which is a pretty hard crash.
You need to test loss of access to shared disk while the cluster is running normally to make sure that packages configured to fail over actually do so and run correctly.
You can do a user test: fail a node using the methods above and see what kind of delays the users experience.
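One rough way to put a number on those delays is to probe the package's service address once a second while the node is failed, then look for the longest gap. The sketch below assumes the package exposes a TCP service; the host name `pkg1-vip`, port `1521`, and the function names are hypothetical placeholders.

```python
import socket
import time


def service_is_up(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe(host, port, duration, interval=1.0):
    """Sample the service for `duration` seconds; return (timestamp, is_up) pairs."""
    samples = []
    deadline = time.time() + duration
    while time.time() < deadline:
        samples.append((time.time(), service_is_up(host, port)))
        time.sleep(interval)
    return samples


def longest_outage(samples):
    """Longest span, in seconds, from a first failed probe to the next success.

    An outage still in progress at the last sample is not counted.
    """
    longest = 0.0
    down_since = None
    for timestamp, is_up in samples:
        if not is_up and down_since is None:
            down_since = timestamp
        elif is_up and down_since is not None:
            longest = max(longest, timestamp - down_since)
            down_since = None
    return longest


if __name__ == "__main__":
    # Hypothetical relocatable package address and port; start this, then fail the node.
    results = probe("pkg1-vip", 1521, duration=600)
    print(f"longest outage seen by a client: {longest_outage(results):.0f} s")
```

The result is only as precise as the probe interval plus the connection timeout, and it measures TCP reachability, not whether the application has finished its own recovery.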
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
12-02-2005 02:44 AM
Re: testing Serviceguard
I have two HBA cards connected to the SAN shared storage. As the first step of the test, I am going to disconnect both cables. At that point I would expect the package to fail over to the second node, but I would like to know in a little more depth what exactly causes the failover. Thanks,
12-02-2005 04:24 AM
Re: testing Serviceguard
A lot of people decide not to set up an EMS monitor for disk access because:
i) You have two connections to your data, so it's not a single point of failure anyway.
ii) If you do use EMS disk monitoring, you have to set NODE_FAIL_FAST_ENABLED to YES, which means that on a package failure the node just TOCs rather than trying to do a graceful shutdown/switchover of the package. Why is this? Well, think about the situation where all disk I/O is effectively hung because of outstanding I/Os: there's no way that the package stop process is going to be able to stop your application and unmount the filesystems, so the only thing to do is a hard reset (TOC).
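The pairing in point ii) can be checked mechanically before a test run. Here is a minimal Python sketch that scans a package ASCII file for an EMS resource entry and warns if NODE_FAIL_FAST_ENABLED is not YES. The sample package text, the resource path `/vg/vg01/pv_summary`, the function names, and the choice to treat a missing NODE_FAIL_FAST_ENABLED as NO are all assumptions made for illustration.

```python
# Hypothetical package file contents, for illustration only.
SAMPLE_PACKAGE = """\
PACKAGE_NAME            pkg1
NODE_FAIL_FAST_ENABLED  NO      # node is not halted on package failure
RESOURCE_NAME           /vg/vg01/pv_summary
"""


def parse_package_config(text):
    """Collect KEYWORD value pairs; repeated keywords are kept in order."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts
        settings.setdefault(key.upper(), []).append(value.strip())
    return settings


def check_disk_monitoring(text):
    """Return warnings if a resource is monitored without node fail-fast."""
    settings = parse_package_config(text)
    resources = settings.get("RESOURCE_NAME", [])
    # Assumption: an absent setting is treated as NO.
    fail_fast = settings.get("NODE_FAIL_FAST_ENABLED", ["NO"])[-1].upper()
    warnings = []
    if resources and fail_fast != "YES":
        warnings.append(
            "monitored resources %s but NODE_FAIL_FAST_ENABLED is %s; "
            "a hung-I/O failure may leave the package halt script stuck"
            % (", ".join(resources), fail_fast)
        )
    return warnings


if __name__ == "__main__":
    for warning in check_disk_monitoring(SAMPLE_PACKAGE):
        print("WARNING:", warning)
```

This only reads keywords from text; it does not validate the file the way the Serviceguard tools do.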
HTH
Duncan
I am an HPE Employee

12-02-2005 04:33 AM
Re: testing Serviceguard
And even if someone was taught in a class to remove cables to simulate a node failure, that is really not a valid test. Consider the case where the redundant LAN connections have been yanked but both nodes in a 2-node cluster can still access all the disks. The behavior is unpredictable because the box with the cables yanked could very easily be the one that acquires the lock, leaving you with an unusable cluster.
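The race described above can be shown with a toy model. This is a deliberately simplified sketch, not Serviceguard's actual arbitration algorithm: it assumes only that, after heartbeat loss, whichever node reaches the lock disk first survives and the other TOCs. The `Node` class, the `lock_delay` field, and the timing values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    lan_connected: bool  # can clients still reach this node over the LAN?
    lock_delay: float    # hypothetical seconds until it reaches the lock disk


def resolve_split(nodes):
    """First node to reach the lock disk survives; every other node TOCs."""
    ordered = sorted(nodes, key=lambda node: node.lock_delay)
    return ordered[0], ordered[1:]


def cluster_usable(nodes):
    """The cluster is only useful if the survivor still has its LAN."""
    survivor, _ = resolve_split(nodes)
    return survivor.lan_connected


if __name__ == "__main__":
    # node1 has had its LAN cables pulled but can still see the disks.
    isolated = Node("node1", lan_connected=False, lock_delay=0.4)
    healthy = Node("node2", lan_connected=True, lock_delay=0.5)
    survivor, halted = resolve_split([isolated, healthy])
    print("survivor:", survivor.name)
    print("halted:", [node.name for node in halted])
    print("cluster usable:", cluster_usable([isolated, healthy]))
```

Swapping the two `lock_delay` values flips the outcome, which is the point: nothing in the race favors the node that still has its network.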