
Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

 
SOLVED
julio quadros
Advisor

Cluster lock disk /dev/dsk/c6t0d0 has failed

Hello

I have a cluster of 2 machines. The packages use disks in the FC60 arrays. A month ago there was a problem with the FC60 while a bad disk was being replaced, and since I was not here, another person reconfigured the disk array. Since then, every time the cluster runs its hourly lock disk check, it logs the warning "Cluster lock disk /dev/dsk/c6t0d0 has failed". Every now and then (erratically) I also get this in syslog:
"Unexpected error from cluster lock query (lock_id=0): No such device or address"

In cmclconf.ascii I have:
FIRST_CLUSTER_LOCK_VG /dev/vgdbs
for node 1:
FIRST_CLUSTER_LOCK_PV /dev/dsk/c4t0d0
for node 2:
FIRST_CLUSTER_LOCK_PV /dev/dsk/c6t0d0

c4t0d0 is an alternate link for c6t0d0
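To see at a glance which lock PV each node is configured with, the relevant lines can be pulled out of the cluster ASCII file. This is only a minimal sketch: the function name list_lock_pvs is made up for illustration, and it assumes each node section starts with a NODE_NAME line followed by its FIRST_CLUSTER_LOCK_PV line.

```shell
#!/bin/sh
# Print "<node> <lock PV>" pairs from a cluster ASCII file read on stdin.
# Assumption: each node section begins with a NODE_NAME line and contains
# a FIRST_CLUSTER_LOCK_PV line. Commented-out lines are ignored because
# their first field is "#".
list_lock_pvs() {
    awk '
        $1 == "NODE_NAME"             { node = $2 }
        $1 == "FIRST_CLUSTER_LOCK_PV" { print node, $2 }
    '
}
```

It could be run as `list_lock_pvs < /etc/cmcluster/cmclconf.ascii` and the result compared with what ioscan and cmquerycl report on each node.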

I've done a cmquerycl and under "Possible Cluster Lock Devices" I have both c6t0d0 and c4t0d0.

I checked other articles on this subject but I got confused. Does anyone have a magical idea to solve this problem?

Thanks a lot for your support

JQ
22 REPLIES
Jeff Schussele
Honored Contributor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

Hi Julio,

When a disk is replaced in an FC array, you have to run
fcmsutil /dev/tdX replace_dsk
where X equals the td instance.
You may need a vgchange -a y vg_name as well.

HTH,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
julio quadros
Advisor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

just one more thing:

Today I started getting this message in syslog on both nodes:

cmclconfd[6824]: Unable to attach to network interface 2

What is network interface 2 ?

On both nodes I have 3 interfaces configured:

lan0 - primary
lan3 - standby
lan1 - heartbeat

On both nodes I see all 3 interfaces UP.

Any ideas ?

Thanks

JQ
Sridhar Bhaskarla
Honored Contributor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

Hi,

Since the disk was replaced, the lock structure on it may have been lost, and you will need to restore it. First, however, make sure that the device files did not change during the maintenance.

#diskinfo /dev/rdsk/c4t0d0 (on node 1)

If it succeeds, the disk is there but without lock structures. You can use the vgcfgrestore command to restore them.

#vgcfgrestore -n vgxx /dev/rdsk/c4t0d0

However, if you still get errors, it means your previous vgcfgbackup did not save the lock structures. In that case, you will have to halt the cluster and reapply the cluster configuration.

#cmhaltpkg -v your_package
#cmhaltcl
#cmapplyconf -C /etc/cmcluster/your_cluster_ascii

This should restore the cluster lock structures on the lock disk.

If diskinfo fails, then the device files have changed. You will need to halt the cluster, then export and reimport the volume group that contains this disk on both nodes. Update your cluster ASCII file with the new device files and then run cmapplyconf as shown above.

If you do not have a good vgcfgbackup and the device files have not changed, you can use the "cminitlock" command while the cluster is still running. It is a contrib tool provided by HP. However, I would suggest a fresh cmapplyconf to make things cleaner.

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Sridhar Bhaskarla
Honored Contributor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

Hi,

Do you see any network interfaces with "down" status when you run the "cmviewcl -v" command?

You should be able to figure it out from the above command.

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
julio quadros
Advisor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

Jeff, thanks for your reply. As I said, I was not here when the array was reconfigured, therefore I am not sure what has been done. On top of that, I know absolutely nothing about FC arrays. I've checked and there are no td devices under /dev on both nodes. Is this ok ? My FC60 is model A5277A (ioscan output).

Sridhar, thanks for your replies.
a) As I said, I see the 3 interfaces up on both nodes. How? Using the "cmviewcl -v" command, of course :)

b) lock disk problem

1. Following your advice, I ran diskinfo /dev/rdsk/c4t0d0 on node 1 and it succeeded.

2. On node 2, the timestamp of vgdbs.conf is from December 2002, and the problem with the FC array was in January this year. Is there a chance that this file has the correct lock structures ?

3. What are the risks of breaking something if I try a "vgcfgrestore -n /dev/vgdbs /dev/rdsk/c6t0d0" on node 2 right now? Will anything happen to my data? Should I first do a vgcfgbackup to an alternate file so that I can restore it later if anything goes wrong? Can I do this with the cluster and packages up and running?

4. A week ago I had a problem on node 1 and had to recreate vgdbs with vgimport, using the mapfile produced by "vgexport -p -v" on node 2. After the recreation, I had to run vgextend because the alternate link was not there (vgdisplay -v didn't show it). Can this be the reason vgdbs.conf now differs between the two nodes (checked with diff)? I thought vgimport would recreate the VG exactly as it exists on the other node, but it didn't work that way.

5. Last but not least, 2 hours ago I tried cmapplyconf with both the cluster and the packages up. Nothing changed; I still get the lock disk error. Does this mean the cluster must be down for cmapplyconf to work?

Thank you guys

JQ
Sridhar Bhaskarla
Honored Contributor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

Hi JQ,

There is no guarantee that your previous vgcfgbackup is good. If this is a production server, I suggest you take a conservative approach: schedule a maintenance window, halt the cluster, delete the cluster configuration and recreate it. This will refresh the lock structures. Simply reapplying the configuration will not rebuild it, because there are no changes to your ASCII file.

You will have no doubts if you follow the above procedure.


-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Chris Wong
Trusted Contributor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

The cluster lock is located in the BBRT of the VG. A vgcfgrestore does not restore this area. You must:
vgchange -ay (lock disk not initialized)
Halt the cluster
Bring up the cluster (reinitializes lock disk)

- Chris
julio quadros
Advisor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

Hi Sridhar

Ok, I got your point. So, as far as I can understand, you suggest that I do:

# cmhaltcl -f

# backup cluster and packages ascii files

# cmdeleteconf -c cluster_name

# restore the ascii files

# cmcheckconf -C /etc/cmcluster/cmclconf.ascii

# cmapplyconf -C /etc/cmcluster/cmclconf.ascii

# cmapplyconf -P for every package

Is this correct ?

Please don't forget the network interface problem.

Thanks

JQ

julio quadros
Advisor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

One last thing. After the lock disk problem is sorted out, is it advisable to do a vgcfgbackup of all VGs?

Thanks

JQ
Sridhar Bhaskarla
Honored Contributor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

Hi JQ,

No cmhaltcl -f.

1. cmhaltpkg -v package_name
(for each package)
2. vgchange -c n vgxx
(for all the vgs defined in the cluster ascii file)
3. cmhaltcl
4. vgchange -a y vgxx
(for all the vgs defined in the cluster ascii)
5. cmdeleteconf
6. cmcheckconf -C /etc/cmcluster/cmclconf.ascii
Make sure there are no errors. It may report network errors and give the reasons. Run ifconfig on the standby interfaces and check whether they show an IP of "0.0.0.0". If so, run ifconfig lanx unplumb on those standby interfaces.
7. cmapplyconf -C /etc/cmcluster/cmclconf.ascii
8. start the cluster
cmruncl
9. For each package do
cmapplyconf -P /etc/cmcluster/package_name/package.conf
10. start the packages

This will build you a fresh configuration file, and everything should be clean from then on.
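Before the maintenance window, it can help to print the whole sequence for your own package and VG names and read it through. The following is a dry-run sketch only: it echoes the commands from the steps above in the same order and executes none of them. The function name print_reconfig_plan and its arguments are made up for illustration, and cmrunpkg for step 10 is an assumption, since the step only says "start the packages".

```shell
#!/bin/sh
# Dry run: print the commands from the numbered steps in order.
# Nothing is executed. Arguments (all supplied by the caller):
#   $1 = cluster ASCII file
#   $2 = space-separated volume groups defined in that file
#   $3 = space-separated package names
print_reconfig_plan() {
    ascii=$1
    vgs=$2
    pkgs=$3

    for p in $pkgs; do echo "cmhaltpkg -v $p"; done      # step 1
    for v in $vgs; do echo "vgchange -c n $v"; done      # step 2
    echo "cmhaltcl"                                      # step 3
    for v in $vgs; do echo "vgchange -a y $v"; done      # step 4
    echo "cmdeleteconf"                                  # step 5
    echo "cmcheckconf -C $ascii"                         # step 6
    echo "cmapplyconf -C $ascii"                         # step 7
    echo "cmruncl"                                       # step 8
    for p in $pkgs; do                                   # step 9
        echo "cmapplyconf -P /etc/cmcluster/$p/package.conf"
    done
    # Step 10: "start the packages". cmrunpkg is assumed here.
    for p in $pkgs; do echo "cmrunpkg $p"; done
}
```

For example, `print_reconfig_plan /etc/cmcluster/cmclconf.ascii "vgdbs vgora" "pkgA"` prints the plan for two volume groups and one package (pkgA is a placeholder name).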

Yes. Run vgcfgbackup for all the volume groups. Also, please post when you are going to do this so that I can try to be online that time.

-Sri

PS: Simply shutting down the cluster and bringing it back up may fix your problem. However, reapplying the configuration will not hurt if you are bringing down the cluster anyway.
You may be disappointed if you fail, but you are doomed if you don't try
julio quadros
Advisor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

Hi Srid

Thanks for your support, that's great.
Why both 5. cmdeleteconf and 6. cmdeleteconf -C? Wouldn't 6. without 5. do the same job?

I will be doing this at 12:00 am South African time, 2 hours ahead of GMT.

If I run into problems I will post here.

Thanks

JQ
julio quadros
Advisor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

Srid, sorry, ignore the previous message, of course cmdeleteconf is not the same as cmcheckconf :)

JQ

PS: I will do that operation in 2 hours.
julio quadros
Advisor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

HELP HELP HELP

cmcheckconf gives no errors, but when I run cmapplyconf I get:

Begin cluster verification...
Adding node mcbepms1 to cluster epmsMCB.
Adding node mcbepms2 to cluster epmsMCB.
Protocol failure talking with cmclconfd on mcbepms2 (1) : No such device or address
Error: Volume group /dev/vgdbs does not exist on this node
Error: unable to initialize cluster lock /dev/dsk/c6t0d0 on node mcbepms2.
Check the syslog file on that node for more information.
cmapplyconf : Unable to apply the configuration


In syslog I have:

cmclconfd[24361]: Unable to attach to network interface 2
cmclconfd[24361]: Initializing cluster lock device /dev/dsk/c6t0d0 for node mcbepms2
cmclconfd[24361]: Unable to initialize cluster lock /dev/dsk/c6t0d0, No such device or address



HELP HELP HELP



JQ
Sridhar Bhaskarla
Honored Contributor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

Hi JQ,

I believe this is the actual problem: you have basically lost the PV structures on the lock disk in vgdbs.

Can you do a vgdisplay -v on vgdbs? You would need to do a vgchange -a y on vgdbs and the other volume groups. Please also post the output of strings /etc/lvmtab.

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
julio quadros
Advisor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

Hi Sri

I can vgdisplay -v vgdbs ! All cluster vg's are activated.

strings /etc/lvmtab gives:

/dev/vg00
/dev/dsk/c1t2d0
/dev/dsk/c2t2d0
/dev/vgora
/dev/dsk/c6t0d1
/dev/dsk/c4t0d1
/dev/vglogs
/dev/dsk/c6t0d4
/dev/dsk/c4t0d4
/dev/vgarch1
/dev/dsk/c6t0d2
/dev/dsk/c4t0d2
/dev/vgdbs
/dev/dsk/c6t0d0
/dev/dsk/c4t0d0
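A listing like the one above is easier to check when the device files are grouped under their volume group. Here is a minimal sketch, with a made-up function name group_lvmtab, which assumes that PV paths start with /dev/dsk/ and that every other /dev/ line names a VG, as in this listing.

```shell
#!/bin/sh
# Group "strings /etc/lvmtab" output (read on stdin) into one line per VG:
#   <vg>: <pv> <pv> ...
# Assumption: PV paths start with /dev/dsk/ and any other /dev/ line is a
# VG name. Lines that do not start with /dev/ are ignored.
group_lvmtab() {
    awk '
        /^\/dev\/dsk\// { line = line " " $0; next }
        /^\/dev\//      { if (line != "") print line; line = $0 ":" }
        END             { if (line != "") print line }
    '
}
```

Run as `strings /etc/lvmtab | group_lvmtab` on each node. For the listing above, the last line would read `/dev/vgdbs: /dev/dsk/c6t0d0 /dev/dsk/c4t0d0`, which makes it easy to compare the lock VG's links between the two nodes.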

What does "Protocol failure talking ....." mean?

Please help

JQ
Sridhar Bhaskarla
Honored Contributor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

JQ,

Can you do the following?

On the node where you can do vgdisplay -v vgdbs, do the following

#vgchange -a n vgdbs

On the other node do

#vgchange -a y vgdbs

#vgdisplay -v vgdbs

And can you post the other errors in /var/adm/syslog/syslog.log?

-Sri

Do you have Yahoo Messenger so that we can chat online?
You may be disappointed if you fail, but you are doomed if you don't try
Sridhar Bhaskarla
Honored Contributor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

OK.

I believe you either did not import vgdbs on mcbepms2 or you lost it. The step I indicated

(vgchange -a n vgdbs on pms1 and vgchange -a y vgdbs on pms2) will confirm that. If you are not able to activate vgdbs on pms2, then you will have to generate a map file on pms1 and import the VG on pms2.

About the network interface, I believe it is the standby interface. It might have been disconnected. For a quick test, comment it out in your cluster ASCII file and apply the configuration. If cmapplyconf is successful, then you will need to fix the connectivity issue and then add the interface back.
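The quick test of commenting out the standby interface could be done on a copy of the cluster ASCII file. This is a sketch under assumptions: the function name comment_out_interface is made up, and it assumes the standby appears as a single "NETWORK_INTERFACE <name>" line with no IP lines below it. If the interface had HEARTBEAT_IP or STATIONARY_IP lines, those would have to be commented out by hand as well.

```shell
#!/bin/sh
# Comment out one NETWORK_INTERFACE line in a cluster ASCII file read on
# stdin; the result goes to stdout. $1 is the interface name (for example
# the standby, lan3 in this thread). Only a line consisting of exactly
# "NETWORK_INTERFACE <name>" is changed, so lan3 does not match lan31.
comment_out_interface() {
    sed "s/^\([[:space:]]*NETWORK_INTERFACE[[:space:]][[:space:]]*$1\)[[:space:]]*\$/#\1/"
}
```

For example, `comment_out_interface lan3 < /etc/cmcluster/cmclconf.ascii > /tmp/cmclconf.test`, then review /tmp/cmclconf.test (a placeholder path) with diff before running cmcheckconf and cmapplyconf against it.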

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
julio quadros
Advisor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

Sri, I ran vgchange -a n vgdbs on node 2 and vgchange -a y vgdbs on node 1, and everything is fine. No errors in syslog. vgdisplay -v /dev/vgdbs shows the LV and the PV correctly.

I've posted the errors in syslog in the previous message.

I'm downloading yahoo messenger right now.

JQ
Sridhar Bhaskarla
Honored Contributor
Solution

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

Hi JQ,

My id is sbhaskarla@yahoo.com. I hope you do not encounter any issues with the messenger :-).

See you online.

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
julio quadros
Advisor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

I am behind a firewall, can't get yahoo messenger to work :(
e-mail me at jquadros@solucoes.co.mz, it would be faster.

JQ
julio quadros
Advisor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

A big THANK YOU to Sri for helping me with this. It's nice to know that somewhere in this crazy world there are people like Sri.

The solution for the lock disk problem was to change the lock VG from vgdbs to vgora. I still wonder what was wrong with vgdbs that made it impossible for it to be the lock vg.

Once again, I have to thank Sri for his kindness.

JQ
Sridhar Bhaskarla
Honored Contributor

Re: Cluster lock disk /dev/dsk/c6t0d0 has failed

JQ,

This is the purpose of these forums and I am glad I could be of help to you.

Since you were able to apply the configuration and can view it with all the network interfaces up and the corresponding linkloops successful, the network error may not be a real issue.

Have a look at the description of the patch
PHSS_27158. You may want to install this patch later.

Also, you will need to get another heartbeat setup. Running with only one heartbeat is not good for the health of your cluster.

-Sri
You may be disappointed if you fail, but you are doomed if you don't try