Being new to Tru64, what do I need to know and be aware of for a two-node 5.1B cluster where the common cluster root, usr, and var disks are mirrored (with LSM) between two different Fibre Channel connected storage subsystems? Node 1 will have its boot disk on subsystem 1, Node 2 on subsystem 2.

Application data disks are also LSM mirrored between the two subsystems, but the quorum disk will be on a third shared storage box.

If one subsystem falls flat, I expect one cluster member to panic, and the other member to carry on and cluster applications to fail over to it. Just looking for a general "this is what the docs didn't say, but what experience has shown" kind of reply. Thanks.
Don't you have some kind of mirror/replicate service between the storage systems? If you have, I would choose that instead of the (non-supported) LSM solution you're building now.

You may test this setup 1000 times, but that's still no proof that it will always work.

We have the very setup that you are talking about trialling and have been running it for over a year. We have 3 machine rooms, each with an RA8000 inside it. Inside 2 of the machine rooms we have an Alpha ES45 (4 CPUs). We use Gigabit Ethernet for the cluster interconnect and also operate two separate fabrics for the SAN (each ES45 has two HBA cards and each RA8000 has two controllers).

The cluster system disk is protected through LSM (fully supported by HP). There are just two commands to do this (after you've disklabel'd the disks):

volsetup disk1 disk2

on the primary machine, and then:

volsetup -s

on the secondary machine.

It is all documented and works great, although I then used lsmsa to change the names to something more meaningful. The details for LSM mirroring etc. are in the HP documentation...

The quorum is on the 3rd RA8000...

I don't follow what you mean by a subsystem failure causing a panic: if an RA8000 goes down, your system should continue, as LSM is mirroring the cluster disks. The only way I can get a panic to occur is to pull out the cluster interconnect (after taking bets on which node will get the quorum disk first). One will get there; the other gets the "Yielding to foreign owner" message...
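
The race for the quorum disk comes down to vote counting. As a rough sketch in Python (not something from this thread), assuming the quorum formula from the TruCluster administration documentation, quorum votes = round_down((expected votes + 2) / 2), and assuming one vote per member plus one vote for the quorum disk. The function names are made up for illustration:

```python
def quorum_votes(expected_votes: int) -> int:
    """Votes needed for quorum: round_down((expected_votes + 2) / 2)."""
    return (expected_votes + 2) // 2


def has_quorum(current_votes: int, expected_votes: int) -> bool:
    """True if the votes a partition can see are enough to keep quorum."""
    return current_votes >= quorum_votes(expected_votes)


# Assumed configuration: two members with one vote each, plus a one-vote quorum disk.
EXPECTED = 3

# Interconnect pulled: the member that claims the quorum disk sees 2 votes,
# the other member sees only its own vote.
winner_ok = has_quorum(current_votes=2, expected_votes=EXPECTED)
loser_ok = has_quorum(current_votes=1, expected_votes=EXPECTED)
print(quorum_votes(EXPECTED), winner_ok, loser_ok)
```

Under those assumptions the member that wins the quorum disk carries on with 2 of 3 votes, and the other one cannot form a cluster by itself.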

The only things I'd recommend:

1. Ensure the cluster interconnect is Gigabit and on its own separate fabric.

2. Zone your SAN switches.

3. Access control your HSG80s etc. (ACLs).

Hope this gives you some confidence...
Your input is a "little bit" different from Gary's. I would appreciate it if you would say why an LSM solution is not a good one.

My reference to one member panicking assumes that the boot disk for each member is on one or the other storage box. So, if one box goes bye-bye, I assumed the member whose boot disk is on it still needs access to that disk, can't get it, and checks out.

Please educate me if this view is not correct.

Reading the replies, another question popped up. Has anyone done this, or is anyone doing this, with non-HPQ storage devices?
To John Tolnay,
I would do it like this:

Take two storage boxes: member 1 should boot from the first box, member 2 from the second box. The shared cluster root, usr, var and all other data disks are mirrored with LSM, with a small quorum disk (size 1 MB) on each storage box.
One quorum disk is always online, and if the quorum disk is damaged, take the other storage box and make a new one with the clu_quorum command.

If a whole storage box is damaged, you can only boot one cluster member.
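
To compare the two layouts discussed in this thread (quorum disk on a third box versus on one of the two mirrored boxes), here is a small Python sketch. It is only a model under stated assumptions: a member goes down when the box holding its boot disk fails, only one quorum disk is configured at a time, each member and the quorum disk has one vote, and quorum is round_down((expected votes + 2) / 2). All names are hypothetical.

```python
def quorum_votes(expected_votes):
    """Votes needed for quorum: round_down((expected_votes + 2) / 2)."""
    return (expected_votes + 2) // 2


def after_box_failure(failed_box, boot_box, quorum_box):
    """Return (members still up, votes they hold) after one storage box fails."""
    # Assumption: a member is lost when the box holding its boot disk fails.
    members_up = sorted(m for m, box in boot_box.items() if box != failed_box)
    votes = len(members_up)  # one vote per surviving member
    if members_up and quorum_box != failed_box:
        votes += 1  # the quorum disk is still reachable
    return members_up, votes


BOOT_BOX = {"member1": 1, "member2": 2}  # boot disk placement from the thread
EXPECTED = 3                             # two members plus the quorum disk
NEEDED = quorum_votes(EXPECTED)

layouts = (("quorum disk on third box", 3), ("quorum disk on box 1", 1))
for label, quorum_box in layouts:
    for failed in (1, 2):
        up, votes = after_box_failure(failed, BOOT_BOX, quorum_box)
        state = "keeps quorum" if votes >= NEEDED else "loses quorum"
        print(f"{label}, box {failed} fails: {up} has {votes} vote(s), {state}")
```

In this model the third-box layout keeps quorum whichever of the two mirrored boxes fails, while the two-box layout loses quorum if the box holding the active quorum disk is the one that fails, which is where re-creating the quorum disk with clu_quorum would come in.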

