05-13-2004 08:29 PM
Disaster procedures
I have an interbuilding cluster of 2 servers and 1 quorum node. They are connected with FDDI (no FC).
Can anyone share procedures (for dummies) to analyze the cluster and take the correct actions? I'm thinking of network failures, network card failures, routing problems, disk failures, unstable fiber, fiber failures, unstable VMS, creeping doom, building failure (9/11), double disk failures, triple disk failures (we use shadowing of RAID sets with auto spares), etc.
05-13-2004 10:26 PM
Re: Disaster procedures
Are you just looking for DR templates in general, or for the actual DR plans that someone has for an OpenVMS cluster?
regards
Mobeen
05-13-2004 11:08 PM
Re: Disaster procedures
Both, but preferably the documents that are actually in use, with the exact commands to type.
Wim
05-14-2004 12:02 AM
Re: Disaster procedures
Unfortunately I am not able to share the DR plans that I have, but I would like to point you to some resources that should be helpful:
http://www.calamityprevention.com/
http://www.disasterrecovery.com/
I did take a look at some of the example documents contained therein.
regards
Mobeen
05-14-2004 01:43 AM
Re: Disaster procedures
Run SCACP to monitor the SCS traffic:
$ mc scacp
Cockpit Manager will suit your needs: http://www.hp.be/CockpitMgr
DECamds or Availability Manager can help too.
The tools from Keith Parris will help as well:
http://h71000.www7.hp.com/freeware/freeware60/kp_clustertools/
regards
Gerard
05-15-2004 02:10 PM
Re: Disaster procedures
Don't laugh, but the most important commands for checking that everything is working correctly are very simple: SHOW DEVICE, DECnet loop, IP ping, etc.
As for planning and preparation, first get a clear picture of your current layout at the box, cluster, and network level. Do not forget about infrastructure (power, air conditioning). Identify single points of failure and, if possible, remove them. Most important: once you have your plan in place, test it. This is an iterative process, as you are likely to find things during testing that you did not think about before.
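One way to make the "identify single points of failure" step concrete is to write down the redundant paths a service can use and look for components that appear in every one of them. The following is only a minimal Python sketch, assuming a hand-maintained inventory; the function name and all component names are hypothetical and not taken from the poster's site.

```python
def single_points_of_failure(paths):
    """Return the components present in every redundant path.

    `paths` is a list of sets; each set names the components that one
    path depends on. A component common to all paths takes the whole
    service down when it fails.
    """
    if not paths:
        return set()
    return set.intersection(*paths)


# Hypothetical inventory: two inter-building paths that both rely on
# the same FDDI ring and the same UPS in building A.
inter_building_paths = [
    {"fddi_ring", "concentrator_1", "ups_building_a"},
    {"fddi_ring", "concentrator_2", "ups_building_a"},
]

spofs = single_points_of_failure(inter_building_paths)
print(sorted(spofs))
```

The sketch only catches components shared by all listed paths, so it is no better than the inventory behind it; power and air conditioning need to be in the lists too.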
Greetings, Martin
06-07-2004 09:42 AM
Re: Disaster procedures
The easiest and safest way to set up a new disaster-tolerant cluster is to buy the DTCS (Disaster Tolerant Cluster Services) package from HP. That contains all the assistance one needs in terms of design, planning, implementation, testing, training, and documentation. It doesn't add a significant percentage to the cost of the configuration. It includes software to provide monitoring and operational functions and to provide a framework and integration for an entire set of monitoring tools. It provides information on how to operate your DT cluster both in normal conditions and in various failure scenarios. It helps you avoid common mistakes. Without it, you have to learn from real-life experience, which can be risky.
For full disaster tolerance, you'll need a number of things in place for monitoring and control:
- monitoring of all the network links (for this, LAVC$FAILURE_ANALYSIS is very helpful; see the article "Local Area Network Cluster Interconnect Monitoring" in the OpenVMS Technical Journal, V2, at http://h71000.www7.hp.com/openvms/journal/v2/index.html)
- console management, so you can boot and reboot systems remotely and monitor and record system console output
- monitoring to detect node failures (since you have no inter-site FC link, you're dependent on at least one node being up and running at each site to provide MSCP-serving access to the disks at that site to keep the shadowsets alive)
- monitoring of shadowset membership (this is very crucial)
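The shadowset membership check can be illustrated with a small comparison of expected against observed members. This is only a sketch in Python: it assumes the current membership has already been collected by some other means (how that is done is outside the sketch), and the shadowset and member device names are hypothetical.

```python
def missing_members(expected, observed):
    """Map each shadowset to the expected members it currently lacks."""
    report = {}
    for shadowset, members in expected.items():
        absent = set(members) - set(observed.get(shadowset, ()))
        if absent:
            report[shadowset] = sorted(absent)
    return report


# Hypothetical configuration: each shadowset should have one member
# per building.
expected = {
    "DSA1": {"$1$DUA10", "$2$DUA10"},
    "DSA2": {"$1$DUA20", "$2$DUA20"},
}
# Hypothetical observation: DSA2 has lost its second-building member.
observed = {
    "DSA1": {"$1$DUA10", "$2$DUA10"},
    "DSA2": {"$1$DUA20"},
}

for shadowset, absent in missing_members(expected, observed).items():
    print(f"{shadowset}: missing {', '.join(absent)}")
```

A shadowset running on a single member is exactly the state in which a second failure, or a site loss, costs data, which is why this check deserves an alarm rather than a log entry.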
MOUNT commands in startup procedures need to be carefully designed to avoid untimely or wrong-way shadow copies. Some new /POLICY qualifiers can be very helpful in this area.
Is your quorum node at one of the two main sites, or at a 3rd location? If it is at a 3rd site, then a quorum recovery tool (Availability Manager or DECamds can do the job) is perhaps less crucial than it would be if you have only 2 sites, but should still be in place.
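The difference between the two layouts can be worked through with the OpenVMS quorum rule, quorum = (EXPECTED_VOTES + 2) / 2 with the fraction dropped. The Python sketch below assumes each of the three nodes has VOTES=1, which the original post does not state; the site names are hypothetical.

```python
def quorum(expected_votes):
    """OpenVMS quorum rule: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2


def survivors_keep_quorum(votes_by_site, failed_site):
    """True if the remaining sites still hold enough votes to continue."""
    expected = sum(votes_by_site.values())
    remaining = expected - votes_by_site[failed_site]
    return remaining >= quorum(expected)


# Assumption: every node has VOTES=1.
# Layout 1: quorum node at a third location.
three_sites = {"site_a": 1, "site_b": 1, "quorum_site": 1}
# Layout 2: quorum node in the same building as the site A server.
two_sites = {"site_a": 2, "site_b": 1}

for name, layout in (("three sites", three_sites), ("two sites", two_sites)):
    for site in layout:
        ok = survivors_keep_quorum(layout, site)
        print(f"{name}: lose {site} -> survivors keep quorum: {ok}")
```

With three sites, losing any one site leaves 2 of 3 votes and the cluster continues. With two sites, losing the building that holds the quorum node leaves 1 vote against a quorum of 2, so the surviving node hangs until quorum is adjusted by hand, which is where a quorum recovery tool comes in.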
There's a lot of information on disaster-tolerant clusters, including class notes for a 1-day seminar, at http://www2.openvms.org/kparris/ and http://www.geocities.com/keithparris/
In addition to the ones cited earlier, here are some other general DR/BC resources:
Contingency Planning & Management magazine: http://www.contingencyplanning.com/
Continuity Insights Magazine: http://www.continuityinsights.com/
The Uptime Institute: http://upsite.com/