04-30-2004 09:52 AM
fail over test
We pulled one controller out of the SAN box and the database crashed.
We then pulled out the other SAN controller and the database crashed again.
Then we turned off one of the Brocade switches and the database crashed; after that we did the same thing to the other Brocade switch, and once again the database came down.
Doing a pvdisplay -v /dev/vg05 and vg06 shows that alternate paths exist, so does anyone know why Informix, running on HP-UX 11.0, would not fail over properly?
04-30-2004 10:01 AM
Re: fail over test
Could you please advise whether you are seeing any error messages?
04-30-2004 10:02 AM
Re: fail over test
You're under the assumption that a disk I/O failure should cause a failover. Not gonna happen! MC/SG needs to shut the SW down when it wants to fail over & how is it gonna do that if its I/O is gone? The same kind of rule applies to the CPUs - BUT in that case the box is gonna panic & TOC & THEN the other node will pick up the ball & start up the package. Ditto with power supplies.
Bottom line is that MC/SG is *not* a continuously available solution - it's *highly* available. For continuous you need to spend *far* more $ with HP. They can do it, but not for the price of MC/SG.
My $0.02,
Jeff
04-30-2004 10:12 AM
Re: fail over test
When disk I/O goes away a TOC is not the reaction - a hang will occur. IF a TOC finally happens - unlikely - then the failover will occur.
SO the moral is: only failures that cause TOCs, or failures of monitored HW resources (hint: LAN), will cause a failover.
NOW... there's nothing stopping you from setting up a monitor script that will watch the disk I/O & IF it sees a failure of *ONE* channel then cause a manual failover. But still - even a monitor script can't save you if BOTH channels fail because you'd still need to cause a TOC & you can't do that from the command line.
Rgds,
Jeff
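The monitor-script idea above can be sketched roughly as follows. This is only an illustration of the decision logic, not a ServiceGuard facility: `decide_action`, `monitor_once`, the `probe` callback and the `on_failover` hook are all hypothetical names, and how a path is actually probed (and how a manual failover is triggered) is left to the caller. The device names are borrowed from elsewhere in this thread purely as examples.

```python
from typing import Callable, Dict, Iterable, List


def decide_action(path_ok: Dict[str, bool]) -> str:
    """Map per-path health to an action, following the reasoning above."""
    failed = [p for p, ok in path_ok.items() if not ok]
    if not failed:
        return "ok"
    if len(failed) < len(path_ok):
        # At least one channel still works, so a script can still run and act.
        return "manual_failover"
    # Every channel is gone: the host is expected to hang rather than TOC,
    # and a script has no I/O left to act with.
    return "cannot_act"


def monitor_once(
    probe: Callable[[str], bool],
    paths: Iterable[str],
    on_failover: Callable[[List[str]], None],
) -> str:
    """Probe every path once and call on_failover if only some have failed."""
    state = {p: probe(p) for p in paths}
    action = decide_action(state)
    if action == "manual_failover":
        on_failover([p for p, ok in state.items() if not ok])
    return action


if __name__ == "__main__":
    # Hypothetical probe: pretend the first channel has failed.
    dead = {"/dev/dsk/c5t2d3"}
    result = monitor_once(
        probe=lambda p: p not in dead,
        paths=["/dev/dsk/c5t2d3", "/dev/dsk/c9t2d3"],
        on_failover=lambda failed: print("would fail over, lost:", failed),
    )
    print(result)
```

The "cannot_act" branch is the point of the post: once both channels are gone, nothing run from the command line can help.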
04-30-2004 10:25 AM
Re: fail over test
and then I am told what chunk is going down
So, we cannot get the O/S to go down different paths if a controller goes out, without spending more money, huh?
By the way we do not run Service Guard here.
04-30-2004 10:30 AM
Re: fail over test
I have an 11.0 server in exactly the same configuration as yours, and it is running pretty smoothly without any problems. It is one of our legacy applications which has not been migrated to 11i.
We have, however, had numerous problems with the VA and have stopped using it in favour of an XP1024.
Let me poke around my server and see if I can come up with something you can use.
04-30-2004 10:31 AM
Re: fail over test
You need to check the LVM setup of the VGs that contain that data:
vgdisplay -v /dev/vg_name
there *must* be an alternate link there for LVM to handle the link loss.
If not you need to vgextend that VG to use that extra link.
You should see something at the end of the output like:
/dev/dsk/c5t2d3
/dev/dsk/c9t2d3 Alternate Link
If you don't - you don't have that LVM protection.
Rgds,
Jeff
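For a volume group with many disks, the check described above can be automated. The sketch below parses text captured from `vgdisplay -v` and lists physical volumes that have no alternate link. It assumes the physical-volume section prints one `PV Name` line per path with ` Alternate Link` appended to alternates, which matches the two sample lines above; the surrounding layout in `SAMPLE` and the device `/dev/dsk/c5t2d4` are made up for illustration, and `parse_pv_links` and `unprotected` are hypothetical helper names.

```python
from typing import Dict, List

# Illustrative excerpt only; real vgdisplay -v output contains more fields.
SAMPLE = """\
   --- Physical volumes ---
   PV Name                     /dev/dsk/c5t2d3
   PV Name                     /dev/dsk/c9t2d3 Alternate Link
   PV Status                   available
   PV Name                     /dev/dsk/c5t2d4
   PV Status                   available
"""


def parse_pv_links(text: str) -> Dict[str, List[str]]:
    """Return {primary device: [alternate devices]} from vgdisplay -v text."""
    links: Dict[str, List[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("PV Name"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        device = fields[2]
        if line.endswith("Alternate Link"):
            # An alternate belongs to the most recent primary path.
            if current is not None:
                links[current].append(device)
        else:
            current = device
            links[current] = []
    return links


def unprotected(links: Dict[str, List[str]]) -> List[str]:
    """Primary devices that have no alternate link configured."""
    return [pv for pv, alts in links.items() if not alts]


if __name__ == "__main__":
    found = parse_pv_links(SAMPLE)
    print(found)
    print("no alternate link:", unprotected(found))
```

Any device reported by `unprotected` is a candidate for the vgextend step mentioned above.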
04-30-2004 10:35 AM
Re: fail over test
I saw this in one of the forum threads:
http://forums1.itrc.hp.com/service/forums/questionanswer.do?admit=716493758+1083364406421+28353475&threadId=215197
We have discovered the problem here.
Linkdown-tmo was set to 60 and no_device_delay was set to 30.
The combinations of these two delays caused the PVLinks to get in such a state that they never failed over.
I waited for one hour before resetting the kernel parameters to defaults.
All is now working properly.
Thanks for everyone's input.
04-30-2004 10:36 AM
Re: fail over test
What you might want to look at is HP's Auto Path VA product. I think that will give you more of the functionality that you want.
http://www.hp.com/products1/storage/products/disk_arrays/modular/autopath/index.html
04-30-2004 10:41 AM
Re: fail over test
We do have alternate links already. That is what brought up the whole thing with the fail over test. As mentioned, we have everything doubled (switches, disks, controllers, etc.) and supposedly configured to use alternate links to go to the other side if one should fail. But our database has gone down twice now, because when a SAN controller fails it does not go to the alternate link; it just takes down the database. I thought that with the redundancy we have in place it would "fail over".
04-30-2004 10:50 AM
Re: fail over test
In LV & PV it's the -t value that sets it.
IF it's higher than the SW can tolerate then either the HW or SW value needs to change.
I believe about 90 seconds is the HW default & it may be the SW pukes & dies at that amount. BUT I would caution you to not decrease it unless you're sure loads will *never* cause that long of a delay - i.e. you may want to elongate the SW timeout before you shorten the HW timeout.
Rgds,
Jeff
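The comparison described above boils down to one inequality: the I/O timeout (the `-t` value on the LV or PV) should expire before the application gives up. A minimal sketch, in which `timeout_advice` is a hypothetical helper, 90 seconds is only the figure quoted above as the believed default, and the 60 and 120 second application tolerances are invented for the example:

```python
def timeout_advice(io_timeout_s: float, app_tolerance_s: float) -> str:
    """Compare the LVM I/O timeout with how long the application will wait."""
    if io_timeout_s < app_tolerance_s:
        return "ok: I/O timeout expires before the application gives up"
    # Per the caution above, prefer lengthening the software timeout
    # over shortening the hardware one.
    return (
        "mismatch: consider lengthening the application timeout "
        "before shortening the I/O timeout"
    )


if __name__ == "__main__":
    # 90 s is the believed default quoted above; 60 s and 120 s are assumptions.
    print(timeout_advice(90, 60))
    print(timeout_advice(90, 120))
```

Real values would have to come from the PV/LV settings and from whatever timeout the database itself applies.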