
Scott Dunkley
Regular Advisor

cluster lock disks

Whoops, just posted this on the wrong forum, so re-posting:

Hi,

After performing some tests on a 2-node cluster recently, I found a problem with what appears to be the cluster lock disk.

The lock disk is in a DS2405 at site B. When I lose that DS2405 there is no issue and the package stays running at site A as expected. However, when I lose power to the whole of site B, i.e. the server goes down, the server at site A panics, says it can't find the cluster lock disk, and crashes the server and the package.

I tried configuring a 2nd lock disk, but it complained about the disk device names not being consistent across the servers. So I changed the controllers around so they were consistent, redid the vgimport and then reran cmquery, only now it has no lock disk parameters at all.

My Questions are:

1. Why did the server at site A not panic when the DS2405 at site B was lost, but did panic when power was lost to site B?

2. Why can't I configure a 2nd lock disk in the DS2405 at site A to get round this?

3. Why, when I change the controllers around, does cmquery not pick up a lock disk at all?

Thanks in advance.

Scott.
Better to regret something you have done, than something you haven't
melvyn burnard
Honored Contributor
Solution

Re: cluster lock disks

1. Why did the server at site A not panic when the DS2405 at site B was lost, but did panic when power was lost to site B?

The system would not panic, as it still had contact with the node at that site, and hence it was not trying to re-form the cluster. It DID panic when it lost contact with the second node, because it then tried to re-form the cluster with only a 50% quorum, which REQUIRES a lock to allow it to stay up. As it could not get to the disk, it was unable to get the lock, and hence did as it is supposed to and TOC'ed.

2. Why can't I configure a 2nd lock disk in the DS2405 at site A to get round this?

You should be able to configure a second lock on the site A disk. You need to check your LVM configuration, ensure you have chosen a disk that is seen the same way from both nodes, and specify it in the cluster ASCII file.
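For illustration only, a dual cluster lock setup ends up with entries along these lines in the cluster ASCII file; the node names, VG names and device files below are hypothetical, and the PV lines are repeated under each node's NODE_NAME section:

FIRST_CLUSTER_LOCK_VG    /dev/vglock1
SECOND_CLUSTER_LOCK_VG   /dev/vglock2

NODE_NAME nodea
  FIRST_CLUSTER_LOCK_PV    /dev/dsk/c1t2d0
  SECOND_CLUSTER_LOCK_PV   /dev/dsk/c2t2d0

NODE_NAME nodeb
  FIRST_CLUSTER_LOCK_PV    /dev/dsk/c1t2d0
  SECOND_CLUSTER_LOCK_PV   /dev/dsk/c2t2d0

(Other per-node entries such as NETWORK_INTERFACE and HEARTBEAT_IP are omitted here.)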

3. Why, when I change the controllers around, does cmquery not pick up a lock disk at all?
Because your LVM/device file association may be incorrect?


What I would suggest is that you look at using the free Quorum Server product for ServiceGuard. You would need to be on at least version 11.13, and the product is available at http://www.software.hp.com
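For reference, with the Quorum Server the cluster lock disk entries in the cluster ASCII file are replaced by quorum server parameters. A rough sketch, with a hypothetical hostname (the interval values are in microseconds; check the defaults for your ServiceGuard version):

QS_HOST                 qs-server.mydomain.com
QS_POLLING_INTERVAL     300000000
QS_TIMEOUT_EXTENSION    2000000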

My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Scott Dunkley
Regular Advisor

Re: cluster lock disks

Cheers again Melvyn,

Could I activate the volume group using the -q n argument of vgchange then? Thus it wouldn't need quorum to form the cluster?

All the LVM parts are correct as far as I know. I set it up the standard way, which was to vgexport the VG from the first node, mkdir /dev/vgxx, then mknod the group file, followed by a vgimport. The only issue I could see was that the controllers had been plugged in the opposite way round at site B; would this cause a problem?
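For reference, that standard procedure looks roughly like this; the VG name, map file and device files here are placeholders, not the actual configuration:

# On the node that already has the volume group:
vgexport -p -v -m /tmp/vgxx.map /dev/vgxx       # preview only: writes the map file, leaves the VG in place
# Copy /tmp/vgxx.map to the second node, then on that node:
mkdir /dev/vgxx
mknod /dev/vgxx/group c 64 0x010000             # minor number must be unique on this node
vgimport -v -m /tmp/vgxx.map /dev/vgxx /dev/dsk/c1t2d0 /dev/dsk/c2t2d0   # PV paths listed explicitly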

Why, when I swapped the controllers over, didn't cmquery return the need for a cluster lock disk?

I'll have a look at that quorum software now.
Better to regret something you have done, than something you haven't
melvyn burnard
Honored Contributor

Re: cluster lock disks

Not sure I understand the question:
could I activate the volume group using the -q n argument of vgchange then? Thus it wouldn't need quorum to form the cluster?

You do not have to have the vg activated when the node goes for the lock disc.

When you go to create the cluster ASCII file, it NEVER puts in the second cluster lock disc; this is a manual addition.
You may find it better to have all the shared VGs activated on the node where you are doing the configuration, just to make sure you can see these discs.
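To illustrate that last point, something along these lines could be run on the configuration node before generating the ASCII file; the VG, file and node names here are hypothetical:

vgchange -a y /dev/vgxx                                          # activate the shared VG so its discs are visible
cmquerycl -v -C /etc/cmcluster/cluster.ascii -n nodea -n nodeb   # regenerate the cluster ASCII file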
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Scott Dunkley
Regular Advisor

Re: cluster lock disks

Sorry, my misunderstanding. Writing before thinking as usual.

I have looked at the quorum server software and I can't use it in this instance, as they only have the 2 HP servers. I could set up a Linux server I suppose, but that seems more hassle than it's worth. Seems an expensive way of cluster locking.

I understand that a second lock disk would be manually configured, but what is happening is that if I change the controllers over, effectively making both servers use the same PV device names, and then run cmquery, it doesn't even have a first lock PV defined.
Better to regret something you have done, than something you haven't
melvyn burnard
Honored Contributor

Re: cluster lock disks

As I do not know these disc units at all, I am not sure why or how this might be happening.
I would normally not play around with controllers on this type of disc as I have heard things can get confused.

I would say start with the basics and get each step set up from there.

My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Stephen Doud
Honored Contributor

Re: cluster lock disks

An internal HP website that keeps us informed on "supported" disks in a ServiceGuard environment has this to say about the DS2405:

The DS2405 is supported in HA configurations at 1 Gb and 2 Gb Fibre Channel speeds. Currently, only Fibre Channel arbitrated loop configurations using direct connect are supported for the DS2405 (no FC switches, hubs or Fabric support). ... This limits the use of the DS2405 to 2-node clusters, and since there can be no alternate paths to the DS2405, you must mirror package data between two separate DS2405 enclosures.

As Melvyn stated, SG doesn't need an activated cluster lock VG to get to the lock structure in order to reform a cluster where only 50% of the nodes are responding with heartbeat. Nor does SG need to see the lock structure to remain running! Not seeing the lock structure on the cluster lock disk leads to this sort of hourly message in syslog.log:

WARNING: Cluster lock on disk /dev/dsk/c2t5d0 is missing!

Also, the disk special files on both nodes need not match each other to work with SG. They merely have to link to the intended common (shared) disks. Because matched special files across nodes are not likely to occur, the -s option was added to both the 'vgexport' and 'vgimport' commands to help the ServiceGuard sysadmin load the /etc/lvmtab file with the intended shared disks per volume group.

If you have moved disk controllers to different backplane positions to try to match the special files, then after the reboot export and re-import the volume groups (using that -s option with map files) to update lvmtab.
Also, remember to clean up (rmsf -H ...) the UNCLAIMED special files that used to link to the disks.
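A sketch of that export/re-import sequence, using a hypothetical VG name and device paths; run on each node after the reboot, with the VG deactivated:

vgchange -a n /dev/vgxx                        # the VG must be deactivated before export
vgexport -v -s -m /tmp/vgxx.map /dev/vgxx      # removes the old lvmtab entry and writes a map file with the VGID
mkdir /dev/vgxx
mknod /dev/vgxx/group c 64 0x010000            # re-create the group file; minor number must be unique on the node
vgimport -v -s -m /tmp/vgxx.map /dev/vgxx      # -s scans the disks for the VGID, so special files need not match
# Then clean up special files that no longer point at a disk:
ioscan -fnC disk                               # spot the stale entries left behind by the controller move
rmsf -H <old_hardware_path>                    # remove the special files at that hardware path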

Hope this helps
StephenD