Serviceguard configuration problem with cluster lock

kaushikbr
Frequent Advisor

Serviceguard configuration problem with cluster lock

Hi Experts

I'm trying to configure a 2-node Serviceguard cluster on Red Hat Linux ('Linux esuk1ds1 2.6.18-53.el5xen #1 SMP Wed Oct 10 16:48:44 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux'). We are using MSA500 shared storage and a Smart Array 642 controller. We have created a logical disk in the shared storage, and this disk is used for the cluster lock. We are also using the NFS toolkit and have decided to use the modular package configuration. I have configured the cluster and the packages, and they all seemed to work, but when I try to start the Serviceguard package, I get the following messages in the package logfile:

Incorrect metadata area header checksum
Found duplicate PV z2TqVBsY9m5ha5PVroCjdRN0R4xKVzzZ: using /dev/cciss/c1d1p1 not /dev/cciss/c0d1p1
Incorrect metadata area header checksum
ERROR: sync rmtab: exported file system is not a mounted volume group.

I'm stuck at this point and not sure how to proceed.

Thanks in advance for all your suggestions

Regards
Kaushik
8 REPLIES
Serviceguard for Linux
Honored Contributor

Re: Serviceguard configuration problem with cluster lock

The "found duplicate PV" message suggests there is a problem with the volume group setup.

One question that comes to mind immediately is, did you try to put a cluster lock on a logical volume? If so, that is not supported.

Also, do you have dual paths? If so, what are you using for multipath? With the MSA500, only MD is supported.

Are the LUN names the same on both systems? For LVM I don't think this should matter, but I wouldn't guarantee it.

Did you do the VG import on the second system correctly?

These are just a few questions that come to mind when I look at the error. Hopefully one of them will give you an idea.

Otherwise you may want to post more detailed information, including LVM listings from both nodes.
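The LVM listings mentioned above could be gathered with something like the following sketch, run on both nodes and compared. This is an assumption about what is wanted; the /dev/cciss device paths are the ones from this thread.

```shell
# Run on BOTH nodes and compare the output side by side.
pvs -o pv_name,vg_name,pv_uuid    # physical volumes and their UUIDs (duplicate UUIDs = duplicate PV)
vgs -o vg_name,vg_attr,pv_count   # volume groups and their attributes
lvs -o lv_name,vg_name,lv_attr    # logical volumes
```

A mismatch in PV UUIDs between the nodes, or the same UUID on two device paths on one node, would explain the "Found duplicate PV" message.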
kaushikbr
Frequent Advisor

Re: Serviceguard configuration problem with cluster lock

Hi

Thanks for your response.

It turned out my colleague who was working on this cluster had run pvcreate on this LUN, and that was the reason for the 'Found duplicate PV' error messages. I did a pvremove -f on this LUN and the 'Found duplicate PV' messages are gone. However, I have a new issue now. When I re-apply the configuration file using cmcheckconf, I get
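The cleanup described above might look like this sketch. It is destructive (it wipes the LVM label from the device), and the device path is the one named in the error messages earlier in this thread; substitute your own.

```shell
# Remove the stray LVM label left behind by the accidental pvcreate.
# WARNING: destructive -- only run against the device you intend to wipe.
pvremove -f /dev/cciss/c0d1p1

# Verify the duplicate PV no longer appears in the listing.
pvs
```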

Checking for inconsistencies
/dev/cciss/c1d1p1 is not a valid device. Validation failed.
Invalid data for cluster lock LUN configuration

error messages.

I've checked on the forum in the following threads..

http://forums11.itrc.hp.com/service/forums/questionanswer.do?admit=109447626+1223476350730+28353475&threadId=1036874

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1033536&admit=109447626+1223476394942+28353475

http://forums11.itrc.hp.com/service/forums/questionanswer.do?admit=109447626+1223476463617+28353475&threadId=1212699

I don't seem to have any of these issues.


Thanks in advance for all your valuable advice

Regards
Kaushik

skt_skt
Honored Contributor

Re: Serviceguard configuration problem with cluster lock


"Are the LUN names the same on both systems? For LVM I don't think this should matter, but I wouldn't guarantee it."

Different device names on the nodes should not matter.

"Checking for inconsistencies
/dev/cciss/c1d1p1 is not a valid device. Validation failed.
Invalid data for cluster lock LUN configuration"

Looks like the device (or the VG it belongs to) is not cluster-aware.
Serviceguard for Linux
Honored Contributor

Re: Serviceguard configuration problem with cluster lock

If the LUN you are trying to use for the lock LUN was ever used with LVM, that may cause a problem. Try re-initializing the LUN and make sure the partition type is 83.
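The re-initialization suggested above could be sketched as follows. This is a hedged example, not the official procedure; the device paths are the ones from this thread, and zeroing the start of the partition destroys whatever is on it.

```shell
# Wipe any old LVM metadata from the start of the lock LUN partition.
# WARNING: destructive -- double-check the device name first.
dd if=/dev/zero of=/dev/cciss/c1d1p1 bs=1k count=100

# Re-check the partition table and set the partition type.
# In fdisk, use 't' to change the partition type to 83 (Linux), then 'w' to write.
fdisk /dev/cciss/c1d1
```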
kaushikbr
Frequent Advisor

Re: Serviceguard configuration problem with cluster lock

Hi

I have re-initialized the LUN and the problem has disappeared. I have now managed to use this as the cluster lock. I'm now stuck at the next hurdle. Not being very experienced in Linux, I am dependent on suggestions and documents, and all of your suggestions have been very helpful.

I have managed to create the cluster and the packages. However, when I try to start the package, it fails to start and I get the following error message in the package log file:

ERROR: sync rmtab: exported file system is not a mounted volume group.


Here is a sample package configuration file

Serviceguard for Linux
Honored Contributor

Re: Serviceguard configuration problem with cluster lock

There is nothing obvious in the file.

I don't have any real information, but I'm guessing that there may be a problem with how hosttags are set up or with the volume groups being activated.

So first make sure the volume groups can be mounted on each system. Then make sure they are not activated and that the hosttags are not set.

If that doesn't help another debug technique is to manually activate the VGs and see if you can start NFS without Serviceguard.
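The manual debug technique described above could look like this sketch. The volume group name (vg_nfs) and logical volume name (lv_home) are placeholders; the mount point is taken from the export list later in this thread.

```shell
# Activate the shared VG by hand, outside Serviceguard's control.
vgchange -a y vg_nfs                    # vg_nfs is a hypothetical VG name

# Mount the exported filesystem manually.
mount /dev/vg_nfs/lv_home /local/home   # lv_home is a hypothetical LV name

# Try to bring up NFS without the package and export the filesystems.
service nfs start
exportfs -a

# When finished, reverse the steps before letting Serviceguard take over:
#   exportfs -ua; service nfs stop; umount /local/home; vgchange -a n vg_nfs
```

If NFS works when started this way but not under the package, the problem is in the package configuration rather than in the storage or NFS setup.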
kaushikbr
Frequent Advisor

Re: Serviceguard configuration problem with cluster lock

Hi

First of all, thank you all for your valuable suggestions. I found that the XFS configuration in the configuration file was in HP-UX format, which is not supported on Linux.
I modified the XFS configuration as shown below, and that fixed the problem. I created a netgroup by editing the /etc/netgroup file and exported the filesystems to this netgroup. Now the filesystems are mounting OK and are exported to all the hosts.

The 'exports' command syntax is totally different on Linux.
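For anyone following the same path, the netgroup mentioned above might look like the fragment below. This is a hypothetical /etc/netgroup entry reconstructed from the hostnames in this thread (the netgroup format is groupname followed by (host,user,domain) triples); the real file would list all the esuk10xx hosts.

```shell
# Hypothetical /etc/netgroup entry; extend the triples to cover every client host.
# esuk1ds-farm (esuk1ds1,,) (esuk1ds2,,) (esuk1man,,) (esuk1010,,) (esuk1011,,)
cat /etc/netgroup   # verify the entry, then reload NFS exports with: exportfs -ra
```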

Thanks and Regards
Kaushik




Problematic XFS configuration
=============================
XFS "-o root=esuk1ds1-p:esuk1ds2-p:nfslocal:esuk1man:esuk1010:esuk1011:esuk1012:esuk1013:esuk1014:esuk1015:esuk1016:esuk1017:esuk1018:esuk1019:esuk1020:esuk1021:esuk1022:esuk1023:esuk1024:esuk1025:esuk1026:esuk1027:esuk1028:esuk1029:esuk1030:esuk1031:esuk1032:esuk1033:esuk1034:esuk1035:esuk1036:esuk1037:esuk1038:esuk1039:esuk1040:esuk1041:esuk1042:esuk1043:esuk1044:esuk1045:esuk1046:esuk1047:esuk1048:esuk1049:esuk1050:esuk1051:esuk1052:esuk1053:esuk1054:esuk1055:esuk1056:esuk1057,access=@10.128.1 /local/home"

XFS "-o root=esuk1ds1-p:esuk1ds2-p:nfslocal:esuk1man:esuk1010:esuk1011:esuk1012:esuk1013:esuk1014:esuk1015:esuk1016:esuk1017:esuk1018:esuk1019:esuk1020:esuk1021:esuk1022:esuk1023:esuk1024:esuk1025:esuk1026:esuk1027:esuk1028:esuk1029:esuk1030:esuk1031:esuk1032:esuk1033:esuk1034:esuk1035:esuk1036:esuk1037:esuk1038:esuk1039:esuk1040:esuk1041:esuk1042:esuk1043:esuk1044:esuk1045:esuk1046:esuk1047:esuk1048:esuk1049:esuk1050:esuk1051:esuk1052:esuk1053:esuk1054:esuk1055:esuk1056:esuk1057,access=@10.128.1 /local/opt"

XFS "-o root=esuk1ds1-p:esuk1ds2-p:nfslocal:esuk1man:esuk1010:esuk1011:esuk1012:esuk1013:esuk1014:esuk1015:esuk1016:esuk1017:esuk1018:esuk1019:esuk1020:esuk1021:esuk1022:esuk1023:esuk1024:esuk1025:esuk1026:esuk1027:esuk1028:esuk1029:esuk1030:esuk1031:esuk1032:esuk1033:esuk1034:esuk1035:esuk1036:esuk1037:esuk1038:esuk1039:esuk1040:esuk1041:esuk1042:esuk1043:esuk1044:esuk1045:esuk1046:esuk1047:esuk1048:esuk1049:esuk1050:esuk1051:esuk1052:esuk1053:esuk1054:esuk1055:esuk1056:esuk1057,access=@10.128.1 /local/a"


Working XFS configuration
=========================
XFS "-o rw @esuk1ds-farm:/local/home"
XFS "-o rw @esuk1ds-farm:/local/opt"
XFS "-o rw @esuk1ds-farm:/local/a"
Serviceguard for Linux
Honored Contributor

Re: Serviceguard configuration problem with cluster lock

Thanks for posting your discovery. I know folks do check older threads and this will likely help someone in the future.