disaster test - what to check?
10-16-2009 11:22 PM
In about two weeks we will have a disaster test in our datacenter, and I need to check some things and prepare for it. The power will be cut suddenly and restored after a few minutes. We use Serviceguard and external XP storage; the outage affects only one half of each two-node cluster and only half of the storage subsystem. I need to predict what will happen to the systems and packages, and how the LVM resynchronization will be done: how will the state between the two storage boxes be synchronized?
The SG part is clear to me, but I haven't worked with storage yet, so I am most curious about the storage part. If you have a disaster recovery plan, it is welcome too! Points will be awarded. ;)
Unix operates with beer.
10-16-2009 11:59 PM
Re: disaster test - what to check?
I am not sure what you are looking for on the storage side if the power goes down. For XP arrays there is usually disaster recovery software called Continuous Access. Is it implemented in your setup?
10-17-2009 12:17 AM
Re: disaster test - what to check?
a) Have your uninterruptible power supply (UPS) vendor come out to check for bad batteries.
b) Verify that all boxes are on battery backup.
c) Then there is nothing more to do. You will not fail over unless the network is disrupted, so if all of your network nodes are on batteries...
10-17-2009 05:25 AM
Re: disaster test - what to check?
MP
==
Check the MP login for all the servers, for example:
CM>PS
UPS
===
Check with the UPS vendor.
Backup
======
I would also make sure that the latest OS backup and the latest file system backup exist for all servers included in the disaster test.
Run the nickel script to collect all the system information details.
Rgds,
Johnson
10-18-2009 03:40 AM
Re: disaster test - what to check?
My question is: how will the filesystems be synchronized? If the package fails over, I think the LVM resync will be initiated by the surviving cluster partner, where the package then runs. But what if failover isn't permitted, e.g. the package starts automatically after the powered-off node reboots? In that case the sync would be initiated by the rebooted node. I'm afraid the correct data would be overwritten by the stale data. Can LVM auto-resync be turned off? With which command?
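On the auto-resync question, here is a minimal sketch of the command order I have in mind. It assumes the HP-UX `vgchange -s` option, which as far as I know disables the synchronization of stale extents at activation, followed later by a manual `vgsync`. The helper name `manual_sync_plan` and the volume group `/dev/vg01` are hypothetical, and the commands are only printed, never executed. In a Serviceguard package the activation is normally done by the package itself, so treat this as an illustration, not a procedure.

```python
def manual_sync_plan(vg_name, exclusive=True):
    """Return the commands for activating a VG without automatic resync.

    Assumption: 'vgchange -s' suppresses the sync of stale extents at
    activation time, and 'vgsync' starts the sync later by hand.
    """
    mode = "e" if exclusive else "y"  # 'e' = exclusive activation in a cluster
    return [
        ["vgchange", "-a", mode, "-s", vg_name],  # activate, no automatic resync
        ["vgsync", vg_name],                      # resync later, manually
    ]


if __name__ == "__main__":
    # '/dev/vg01' is a hypothetical volume group name.
    for cmd in manual_sync_plan("/dev/vg01"):
        print(" ".join(cmd))
```

Printing the plan instead of running it keeps the sketch safe to try on any machine; the real commands should be checked against the vgchange and vgsync man pages on the target system first.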
10-18-2009 03:43 AM
Re: disaster test - what to check?
No, HP XP Continuous Access isn't implemented.
10-18-2009 04:36 AM
Re: disaster test - what to check?
We are still not clear on what you are looking for. If you are going to perform a power-off test on the XP and test its disaster recovery, I would say make sure you have a good backup. Nothing else...
10-18-2009 04:57 AM
Solution
I didn't see your post above.
I do not know your current setup. But normally, if you power off one node of the cluster, the package should automatically switch over to the other node. If a filesystem needs repair, fsck will be called automatically by the other node.
Just for your info, I will share one of my experiences. At one site the power failed. The storage (not XP) and one of the nodes were powered off suddenly. They came back after a while, but the cluster failed to start and fsck took hours to repair a file system. Finally, a complete shutdown and proper restart of the full setup solved the issue without any further delay; no fsck was done this time. So again, do not rely only on docs or information from other sites. A power-down test is unpredictable and, in my opinion, there is only a small chance that you won't have a problem afterwards, whatever precautions you take.
Again, please make sure of the backup before the activity, since the machines may not be aware that you are just testing ;)
10-19-2009 01:31 AM
Re: disaster test - what to check?
Thanks for your help. I know the Serviceguard part: if the package's AUTO_RUN is enabled, the package will fail over to the surviving node. (For the test packages it isn't always enabled.) We have several LVM-mirrored filesystems, mirrored with the help of PVGs. Each physical volume group consists of LUNs from a different XP box, so in case of a storage box failure only one half of the mirror will be affected.
But this part isn't clear to me: after a package switch, the package runs with half of the mirror. After the other half of the system is powered on, how will the data be synchronized? The surviving node will initiate a sync, but we must make sure that the failed XP is synchronized from the surviving one, and not the reverse.
And what will happen with the test packages? They will be started on the restarted node, after it has come back to life. Will a sync be needed here? Or only an fsck? We are using VxFS filesystems; do you think a full fsck (nolog) would be recommended?
In any case, we will create an extra backup of the OS and the data...
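To see which half of the mirror is behind after the test, one option is to count stale extents per logical volume. As I understand LVM mirroring, extents on the unreachable box are marked stale and a resync copies from the current copy to the stale one, but that should be verified on a test volume. The sketch below only parses text; the sample is illustrative, its device names are made up, and its layout is an assumption modelled on `lvdisplay -v` output. The function name `count_extent_states` is hypothetical.

```python
def count_extent_states(lvdisplay_output):
    """Count 'current' and 'stale' extent copies in 'lvdisplay -v' style text."""
    counts = {"current": 0, "stale": 0}
    for line in lvdisplay_output.splitlines():
        fields = line.split()
        # Extent rows start with a numeric logical extent number;
        # header and summary lines are skipped.
        if not fields or not fields[0].isdigit():
            continue
        for field in fields[1:]:
            if field in counts:
                counts[field] += 1
    return counts


# Illustrative sample only: made-up devices, assumed column layout.
SAMPLE = """\
   LE    PV1                PE1   Status 1 PV2                PE2   Status 2
   00000 /dev/dsk/c4t0d1    00000 current  /dev/dsk/c6t0d1    00000 stale
   00001 /dev/dsk/c4t0d1    00001 current  /dev/dsk/c6t0d1    00001 stale
   00002 /dev/dsk/c4t0d1    00002 current  /dev/dsk/c6t0d1    00002 current
"""

if __name__ == "__main__":
    print(count_extent_states(SAMPLE))  # {'current': 4, 'stale': 2}
```

A non-zero stale count on the PVs of the box that lost power, and none on the surviving box, would be the expected picture before the resync starts.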
10-19-2009 11:26 AM
Re: disaster test - what to check?
When the failed node is up again, it is just a usual system boot. The failed-over processes can then be brought back manually to the original node.
10-30-2009 09:09 AM
Re: disaster test - what to check?
And could you point me towards what will happen on the storage side? One side of the mirror (one storage box) will be without electricity. My question is: what happens after the failed storage is restarted? How will the resynchronization occur? Will it happen manually or automatically? Could we set this synchronization to manual?
10-30-2009 09:23 AM
Re: disaster test - what to check?
Will the server be powered on again after we give the electricity back?
10-30-2009 07:12 PM
Re: disaster test - what to check?
Are you referring to raw power or UPS power, or are you doing a power maintenance test?
Once you have shut down the server gracefully (shutdown -hy 0), you have to unplug the power cables from the server. Does your power source come from a UPS?
Once you have completed your power maintenance activity, connect the power cables back. Once power resumes, you need to power on manually.
> Will the server be powered on again after we give the electricity back?
This depends on the power source. If the server was powered off, you can check the console logs (E - Error logs) on the MP/GSP.
11-03-2009 12:58 PM
Re: disaster test - what to check?
If you have a UPS, it can supply power for a few minutes and you don't need to worry about it either.
But if one server and half of the storage are down, then everything related to that half of the storage will be down for sure and the database might be corrupted. Meanwhile, the processes that resided on the failed system will fail over to the surviving system. The failover usually takes several minutes, and if the original failed system comes back, the result is unpredictable.
11-04-2009 07:22 AM
Re: disaster test - what to check?
>and if the original failed system is back, then the result is unpredictable.
Yes, that's my task: to predict the unpredictable. This whole exercise was organized only to check what would happen if... the electricity went off all of a sudden. :(
I will leave this thread open and share the details. The test will be carried out on 27-29 November...
11-04-2009 07:48 AM
Re: disaster test - what to check?
Also make sure that your backups are running and that it is possible to restore from them ;)
...
Tip:
Ensure that they really cut the power for all components at once. We found a plausible case for error if failures came in a specific sequence...
Tip:
Check your MC/SG setup: when your primary node comes up again, will the package switch back or not?
It might be that you want a controlled failback to the primary node, and do not want the packages switching automatically when the power is back again.
/2r
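The failback check above can be done by reading the package configuration before the test. A minimal sketch follows, assuming the package ASCII files use the Serviceguard parameters AUTO_RUN and FAILBACK_POLICY (the latter MANUAL or AUTOMATIC). The function names, the package name `pkg_test`, and the sample text are hypothetical; a real file would come from your own cluster configuration.

```python
def package_policies(config_text):
    """Extract AUTO_RUN and FAILBACK_POLICY from package configuration text."""
    wanted = {"auto_run", "failback_policy"}
    found = {}
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank space
        parts = line.split(None, 1)
        # Parameter names are matched case-insensitively.
        if len(parts) == 2 and parts[0].lower() in wanted:
            found[parts[0].lower()] = parts[1].strip().lower()
    return found


def failback_is_automatic(config_text):
    """True if the configuration asks for automatic failback."""
    return package_policies(config_text).get("failback_policy") == "automatic"


# Hypothetical sample; not taken from a real cluster.
SAMPLE_PKG = """\
PACKAGE_NAME        pkg_test      # hypothetical package
AUTO_RUN            YES
FAILOVER_POLICY     CONFIGURED_NODE
FAILBACK_POLICY     MANUAL
"""

if __name__ == "__main__":
    print(package_policies(SAMPLE_PKG))
    print(failback_is_automatic(SAMPLE_PKG))
```

With the sample above the second line printed is False, meaning the package would stay on the adoptive node until someone moves it back. Note that the file only shows the configured policy; switching can also be enabled or disabled at run time, so the running cluster state should be checked as well.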