disaster tolerance on cluster DS20 / RA7000
09-04-2006 03:40 AM
My cluster configuration is:
2 AlphaServer DS20E systems connected to a shared RA7000 RAID enclosure.
I just have a JBOD in my RA7000 and use SHADOWING on VMS.
I want to install a disaster-tolerant configuration at a remote site (800 m away)
with the same configuration.
Question: If I install the same configuration at the remote site, connected only by Ethernet (100 Mb), can I mount a 3rd member of an existing shadow set on the remote RA7000?
Regards,
dominique
09-04-2006 03:54 AM
Solution
09-04-2006 04:30 AM
Re: disaster tolerance on cluster DS20 / RA7000
Read the manuals carefully and check other resources; there were, for example, some articles in the Technical Journals (http://h71000.www7.hp.com/openvms/journal/toc.html).
regards Kalle
09-04-2006 04:33 AM
Re: disaster tolerance on cluster DS20 / RA7000
http://www2.openvms.org/kparris/
Purely Personal Opinion
09-04-2006 04:40 AM
Re: disaster tolerance on cluster DS20 / RA7000
I would also use (at least) TWO connections between the sites. And make sure they run along _DIFFERENT_ geographical paths (ALL the way, even inside both buildings).
hth
Proost.
Have one on me.
jpe
09-04-2006 05:58 PM
Re: disaster tolerance on cluster DS20 / RA7000
Also verify the boot times of the routers and switches between the 2 buildings and make sure that the cluster survives them (larger timeout). Unless the application can stand the interruption, of course (in which case you have to break the cluster and do a shadow copy when the link comes back).
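Written down as a comparison, the check looks roughly like the sketch below (Python, purely for illustration). The device names and boot times are invented placeholders. RECNXINTERVAL and SHADOW_MBR_TMO are the OpenVMS system parameters I would expect to be involved (cluster reconnection interval and shadow member timeout), but the values shown are example settings I am assuming, not recommendations from this thread.

```python
def survives_outage(outage_s: float, timeout_s: float, margin_s: float = 10.0) -> bool:
    """Return True if the timeout outlasts the outage plus a safety margin."""
    return timeout_s >= outage_s + margin_s


def check_timeouts(boot_times_s: dict, timeouts_s: dict) -> dict:
    """Compare every timeout against the slowest-booting network device."""
    worst_outage = max(boot_times_s.values())
    return {name: survives_outage(worst_outage, t) for name, t in timeouts_s.items()}


# Hypothetical measured reboot times of the inter-site network gear, in seconds.
boot_times = {"switch_a": 45.0, "router_b": 90.0}

# Assumed example settings, in seconds (check your own system parameters).
timeouts = {"RECNXINTERVAL": 20.0, "SHADOW_MBR_TMO": 120.0}

result = check_timeouts(boot_times, timeouts)
for name, ok in result.items():
    print(f"{name}: {'ok' if ok else 'too short for the slowest device'}")
```

With these made-up numbers the cluster reconnection interval is shorter than the slowest reboot, which is exactly the case the post warns about.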
Wim
09-04-2006 08:16 PM
Re: disaster tolerance on cluster DS20 / RA7000
>> but beware, you cannot shadow the system disk, each site should have its own system disk
Well, not entirely true.
The important part is to always maintain a consistent state.
We concluded that separating the system disks across the two sites was (for us) not the best guarantee of continuous consistency.
We have configured _ALL_ of our nodes to network boot as satellites of any of the others.
That means that initially the system disk will be MSCP served. The moment a direct path to the SAN is established, it fails over to that path.
It _DOES_ imply that, should we ever need a _CLUSTER_ boot, only manual operation is possible.
Re Wim:
>> to be redundant, you better take a raid set (mirror or 5) in each building and shadow these
Yes, at least! And to be able to make BACKUPs by removing a member, we want to LEAVE that config, so, except during the copy of that member to tape, we have 3 members.
And all this extra config complexity is why I was so pleased with the plans to allow more members. But those did not make it into 8.3, so surely any hope of back-porting is meaningless.
hth
Proost.
Have one on me.
jpe
09-04-2006 08:53 PM
Re: disaster tolerance on cluster DS20 / RA7000
Dominique: if you consider an inter-building SAN, consider also that 1 SAN can have problems that bring the whole SAN down (we had that for 15 hours on the PC SAN).
2 SANs with shadowing between them seems the most reliable and stable solution. But not the fastest.
Wim
09-04-2006 11:06 PM
Re: disaster tolerance on cluster DS20 / RA7000
I think there is some confusion. Attached you will find the configuration that my customer would like.
I don't have a SAN and my systems are connected only by Ethernet.
Cordially,
dominique
09-04-2006 11:43 PM
Re: disaster tolerance on cluster DS20 / RA7000
As a starter, I noticed VMS 7.3.
I would _MOST STRONGLY_ advise to go to 7.3-2, with latest or near-latest UPDATE patch.
In your proposed config it would be a real pity if you do NOT use HBMM (Host Based MiniMerge), which is only available starting 7.3-2 with HBMM V2 (included in the later Update Patches).
Your picture does not specify HOW the links are located (and you should make SURE they use different geographical pathways: it would be a pity to have it dual configured through 2 adjacent cables that can be dug out by a single digging action!)
I also noted that both sites seem to use just ONE network switch? That would introduce 2 SPOFs (Single Points Of Failure), which is not a good idea for DT.
But: Keep on thinking, keep on designing, keep on asking. _THIS_ is the time when things can be changed rather easily and rather cheaply, later on will be MUCH more difficult, and MUCH more costly!
hth
Proost.
Have one on me.
jpe
09-05-2006 12:55 AM
Re: disaster tolerance on cluster DS20 / RA7000
I am thinking of installing the following elements in the final configuration: OpenVMS 7.3-2 with the latest patches.
2 network switches at each site.
The questions which remain unanswered are:
The two sites are approximately 1 km apart. I am afraid of blocking problems (shadow copy over MSCP-served disks) or cluster hangs, knowing that the only path connecting them is a 100 Mb Ethernet link (RJ45/fiber).
The configurations I know consist of RA8000s (SAN) connected to each other by Fibre Channel (HSG80 -> SAN switch -> multimode fiber -> SAN switch -> HSG80).
As for the quorum disk, which is installed at the 1st site:
Shadowing it is not allowed? How can this disk be replicated?
Cordially,
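To put a rough number on the shadow copy concern, here is a back-of-the-envelope estimate in Python. It is only a sketch: the 18 GB member size is an assumption for illustration, and the efficiency factor (the fraction of the 100 Mb link that a copy really gets, given protocol overhead and competing cluster traffic) is a guess, not a measured value.

```python
def full_copy_time_minutes(size_gb: float, link_mbit_s: float, efficiency: float = 0.5) -> float:
    """Estimate the minutes needed to copy a whole shadow set member over a link.

    size_gb      member size in decimal gigabytes (assumed value)
    link_mbit_s  nominal link speed in megabits per second
    efficiency   assumed usable fraction of the link, between 0 and 1
    """
    total_bits = size_gb * 1e9 * 8
    usable_bits_per_s = link_mbit_s * 1e6 * efficiency
    return total_bits / usable_bits_per_s / 60


best_case = full_copy_time_minutes(18, 100, efficiency=1.0)   # link fully dedicated
cautious = full_copy_time_minutes(18, 100, efficiency=0.5)    # half the link usable

print(f"best case: {best_case:.0f} min, cautious: {cautious:.0f} min")
```

Even the best case is tens of minutes per member, during which the same link also carries the cluster traffic, so a full copy is something to schedule rather than something to trigger casually.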
dominique
09-05-2006 01:13 AM
Re: disaster tolerance on cluster DS20 / RA7000
Purely Personal Opinion
09-05-2006 02:14 AM
Re: disaster tolerance on cluster DS20 / RA7000
Wim
09-05-2006 04:07 AM
Re: disaster tolerance on cluster DS20 / RA7000
>>For the Quorum disc which is installed in on the 1st site.
As Wim states, you should consider doing without a quorum disk and either configuring votes so that your primary site can continue, or adding a third "quorum site" so that either site can continue operating.
I'd also recommend configuring Availability Manager or AMDS (on a non-clustered node). Define procedures for recovering quorum "on the fly" now, before there are problems. Carefully planned, this can allow you to restore service quickly if a site becomes unavailable. Without planning, this could create a partitioned cluster and data integrity issues.
Andy
09-06-2006 01:34 AM
Re: disaster tolerance on cluster DS20 / RA7000
More questions:
Is it possible to add members of different sizes to a shadow set?
e.g. DSA0 = 1 LUN on RA7000 (18 GB) + 1 LUN on MSA1000 (36 GB).
What is the maximum advised distance (link length) for shadowing through an "MSCP served" disk?
Dominique
09-06-2006 02:22 AM
Re: disaster tolerance on cluster DS20 / RA7000
http://h71000.www7.hp.com/doc/732FINAL/aa-pvxmj-te/aa-pvxmj-te.HTML
Purely Personal Opinion
09-06-2006 06:10 AM
Re: disaster tolerance on cluster DS20 / RA7000
Re quorum disk:
"Quorum disks are a bad idea. One thing worse is a 2-node cluster without QDsk, and the worst are non-VMS clustering solutions".
Seriously, you NEED a quorum referee, and ANY active node (even the smallest VAXstation) is much superior to a quorum disk.
A quorum disk can not, and should not, be shadowed, and certainly not inter-site.
Think of this simple scenario:
Both sites equal votes, with shadowed Qdsk.
Intersite link breaks. Now EACH site has half the active nodes, PLUS its (now single-member) quorum disk = QUORUM. So, each site continues independently, each (now independently) modifying ITS members of the data shadow sets. How quickly would those be significantly different at your site? And now, for the final touch, the intersite link comes back.....
That is also why it is VERY UNWISE ( = downright stupid) to use the hardware capability created by multisite SAN controller-based replication.
VMS has no way to know of this, so also no way of preventing it, and that is the dangerous part.
The available solutions are
1. _THREE_ active sites (some 9/11 companies with one in each tower, plus a third somewhere else, are still in business BECAUSE of that, but try to sell THAT to the average, or even above-average, management!)
2. A quorum site. Technically also 3 sites, but one of those can be just a locked closet housing, say, a DS10, somewhere along one of your cabling routes
3. Declare one site "Primary". During normal operation that site has one or a few more votes than the secondary, and it will be the site that continues when the links disappear.
(this is what we are using)
4. Choose equal-weight sites. Now ANY intersite breakup needs human evaluation, and human intervention. It may well be a wise choice, but you have to make _DAMN SURE_ that _ANY_ human that _MIGHT_ be on duty when it happens is _VERY SURE_ about what to do, or who to warn, before doing anything.
The already given suggestion of implementing AM or AMDS I fully endorse!
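The vote arithmetic behind these options can be checked with a few lines of Python. This is only an illustrative sketch: it uses the OpenVMS rule as I understand it, quorum = (EXPECTED_VOTES + 2) / 2 with the fraction dropped, and the vote assignments (two nodes per site, one vote each unless stated otherwise) are my own assumptions, not a configuration anyone in this thread posted.

```python
def quorum(expected_votes: int) -> int:
    """Quorum value derived from EXPECTED_VOTES (integer division drops the fraction)."""
    return (expected_votes + 2) // 2


def can_continue(visible_votes: int, expected_votes: int) -> bool:
    """A partition keeps running only if the votes it can see reach quorum."""
    return visible_votes >= quorum(expected_votes)


# Each entry: (EXPECTED_VOTES, votes seen by site A, votes seen by site B)
# after the inter-site link breaks. All vote counts are hypothetical.
scenarios = {
    # 2 + 2 node votes plus 1 quorum disk vote; a shadowed quorum disk means
    # each site still sees "its" member of the disk.
    "shadowed quorum disk": (5, 2 + 1, 2 + 1),
    # Option 3: primary site carries one extra vote.
    "primary site": (5, 3, 2),
    # Option 4: equal-weight sites, no referee.
    "equal weight": (4, 2, 2),
}

outcome = {
    name: (can_continue(a, expected), can_continue(b, expected))
    for name, (expected, a, b) in scenarios.items()
}

for name, (site_a, site_b) in outcome.items():
    print(f"{name}: site A continues={site_a}, site B continues={site_b}")
```

The first case is the dangerous one: both halves believe they have quorum and keep writing. With a primary site only one half continues, and with equal weights both halves hang until a human decides.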
Proost.
Have one on me.
jpe
09-06-2006 07:24 PM
Re: disaster tolerance on cluster DS20 / RA7000
I now have all the elements to configure the new cluster and use shadow as well as possible.
Cordially,
dominique