Operating System - HP-UX

Re: EMC Symmetrix LUN lost quorum

 
Ken Englander
Regular Advisor

EMC Symmetrix LUN lost quorum

The system is running HP-UX 11i v3, based on the Mar08 release with the Sep09 patch bundles applied. It is running Serviceguard 11.19. The system is currently being updated to the Sep10 fusion release.

We have encountered a problem with a lost disk quorum on a LUN presented by an EMC Symmetrix. The problem seems associated with unpresenting this LUN from another system where it was not being used. In other words the LUN was presented to two different hosts but only used on one of them. When it was unpresented from the host where it was not being used, we started getting very regular errors logged similar to the following.

I was able to stop this from occurring by stopping the VG from being enabled at boot time. Otherwise, the errors caused nearly continuous problems while TRYing to boot.

Has anyone seen this problem or have any experience with it?

Oct 28 13:58:38 hpux6 vmunix: LVM: WARNING: VG 64 0x0a0000: LV 1: Some I/O requests to this LV are waiting
Oct 28 13:58:38 hpux6 vmunix: indefinitely for an unavailable PV. These requests will be queued until
Oct 28 13:58:38 hpux6 vmunix: the PV becomes available (or a timeout is specified for the LV).
Oct 28 13:58:43 hpux6 vmunix: LVM: VG 64 0x0a0000: Lost quorum.
Oct 28 13:58:43 hpux6 vmunix: This may block configuration changes and I/Os. In order to reestablish quorum at least 1 of the following PVs (represented by current link) must become available:
Oct 28 13:58:43 hpux6 vmunix: <6 0x00005c>
Oct 28 13:58:43 hpux6 vmunix: LVM: VG 64 0x0a0000: Reestablished quorum.
Oct 28 13:58:43 hpux6 vmunix: LVM: VG 64 0x0a0000: Lost quorum.
Oct 28 13:58:43 hpux6 above message repeats 50 times
Oct 28 13:58:43 hpux6 vmunix: DIAGNOSTIC SYSTEM WARNING:
Oct 28 13:58:43 hpux6 vmunix: This may block configuration changes and I/Os. In order to reestablish quorum at least 1 of the following PVs (represented by current link) must become available:
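
Because the Lost/Reestablished pair repeats so quickly, it can help to count the flaps per volume group before digging further. The following is only a sketch: `quorum_summary` is a hypothetical helper name, not an HP-UX tool, and it assumes the message layout shown in the excerpt above. Lines collapsed by syslog into "above message repeats N times" are not counted.

```shell
# quorum_summary: hypothetical helper, not part of HP-UX.
# Reads syslog text on stdin and prints, for each volume group, how many
# "Lost quorum" and "Reestablished quorum" messages were logged.
quorum_summary() {
    awk '
        /LVM: VG .*(Lost|Reestablished) quorum/ {
            for (i = 1; i < NF - 1; i++) {
                if ($i == "VG") {
                    # The two fields after "VG" are the major and minor numbers.
                    vg = $(i + 1) " " $(i + 2)
                    sub(/:$/, "", vg)
                    if ($0 ~ /Lost quorum/) lost[vg]++; else back[vg]++
                    seen[vg] = 1
                    break
                }
            }
        }
        END {
            for (vg in seen)
                printf "VG %s lost=%d reestablished=%d\n", vg, lost[vg], back[vg]
        }
    '
}
```

On the affected host this would typically be fed from the system log, for example `quorum_summary < /var/adm/syslog/syslog.log`.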
TwoProc
Honored Contributor

Re: EMC Symmetrix LUN lost quorum

You said the error showed up when the lun was unpresented from the host that wasn't using it.

Which host is now showing this error? The host using the LUN, or the host that isn't?

Seems that it's likely that we're talking about the host that *is* using the lun, but you don't say.

Another question: is the lun appearing/disappearing from the other host?
We are the people our parents warned us about --Jimmy Buffett
Ken Englander
Regular Advisor

Re: EMC Symmetrix LUN lost quorum

Good question. The errors are occurring on the system that IS trying to use the lun.

We are doing some research here, so I cannot tell you right now if it continues to show up on the host where it should NOT be visible.
Turgay Cavdar
Honored Contributor

Re: EMC Symmetrix LUN lost quorum

Hi Ken,
What is the "pv timeout" value for the disk? As far as I know, EMC recommends a 90-second disk timeout (the default is 30 seconds?). Otherwise you can get this kind of error during Symmetrix service operations.
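
For reference, the PV timeout is inspected with `pvdisplay` and changed with `pvchange -t`. The sketch below only prints the commands so they can be reviewed before being run as root; `show_pv_timeout_cmds` is a hypothetical helper name, and the device path is a placeholder because the thread does not name the disk.

```shell
# show_pv_timeout_cmds: hypothetical helper, not part of HP-UX.
# Prints (does not run) the LVM commands for checking and setting a PV's
# I/O timeout.
show_pv_timeout_cmds() {
    pv=$1           # PV device file; placeholder, substitute the real path
    secs=${2:-90}   # 90 seconds is the figure quoted in this thread
    # pvdisplay reports the current value on its "IO Timeout (Seconds)" line.
    echo "pvdisplay $pv | grep -i timeout"
    # pvchange -t sets the I/O timeout, in seconds, for that PV.
    echo "pvchange -t $secs $pv"
}
```

Calling `show_pv_timeout_cmds /dev/disk/diskN` prints the two commands with the default of 90 seconds; a second argument overrides the value.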
rariasn
Honored Contributor

Re: EMC Symmetrix LUN lost quorum

Hi:

#ll /dev/vg*/group | grep 0x0a|more
crw------- 1 root sys 64 0x0a0000 Jun 19 09:18 /dev/vgname/group

# vgdisplay -v vgname

Verify that Cur PV and Act PV match.
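
The `VG 64 0x0a0000` token in the kernel messages is the major and minor number of the volume group's group file, which is why the listing above is filtered on `0x0a`. A rough sketch of automating that lookup follows. Both function names are hypothetical, and the second one assumes the `ll` column layout shown above (major number in column 5, minor in column 6).

```shell
# vg_ids_from_log: hypothetical helper, not part of HP-UX.
# Reads kernel log lines on stdin and prints each distinct "major minor"
# pair found in "LVM: ... VG <major> <minor>:" messages.
vg_ids_from_log() {
    sed -n 's/.*LVM:.* VG \([0-9][0-9]*\) \(0x[0-9a-fA-F]*\):.*/\1 \2/p' | sort -u
}

# group_file_for: hypothetical helper.
# Reads an `ll /dev/*/group` style listing on stdin and prints the path of
# the group file whose major ($1) and minor ($2) numbers match.
group_file_for() {
    awk -v maj="$1" -v min="$2" \
        '$5 == maj && ($6 "") == (min "") { print $NF }'
}
```

On the host, something like `ll /dev/*/group | group_file_for 64 0x0a0000` would then print the matching group file, and the VG name is the directory that contains it.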

rgs
Ken Englander
Regular Advisor

Re: EMC Symmetrix LUN lost quorum

turgay - I doubt this is related to activity on the Symmetrix. It seems related to changing the presentation of the LUN, but why, and where do we need to make a change?

rariasn - this has essentially been done; that is how I identified the VG. The system user indicates the PV info is correct.
chris huys_4
Honored Contributor
Solution

Re: EMC Symmetrix LUN lost quorum

Hi Ken,

It's an EMC problem. ;)

Check with cstm the entries in /var/stm/logs/os/logX.raw.cur following the procedure found in the last reply in the following thread.

https://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1450580

Greetz,
Chris
Turgay Cavdar
Honored Contributor

Re: EMC Symmetrix LUN lost quorum

Hi Ken,
>>In other words the LUN was presented to two different hosts but only used on one of them.

I think you were using this volume group in Serviceguard with exclusive access, so it is not configured to auto-activate at boot time (on both nodes)?

>>When it was unpresented from the host where it was not being used, we started getting very regular errors logged similar to the following.

So can we say that the problem occurred just after you unpresented the device on the EMC Symmetrix and changed the zone config on the SAN?

>>I was able to stop this from occurring by stopping the VG from being enabled at boot time. Otherwise, the errors caused nearly continuous problems while TRYing to boot.

So are you still getting the errors? If yes, when do you get them? (Regularly? Only at boot time? Only at VG activation/deactivation?)

Can you check if the device is really unpresented from the host that is not using the device?
#symmaskdb list assignment -dev XXXX

Have you checked the device status when you get the error?
#symdev show XXXX


Regards.
Ken Englander
Regular Advisor

Re: EMC Symmetrix LUN lost quorum

Right now I am trying to see if I can get the LUNs online and save off the data.

Chris - Thanks those cstm commands are very useful - the problem lun is logging errors.

turgay

>>>>In other words the LUN was presented to two different hosts but only used on one of them.

>>I think you were using this volume group in Serviceguard with exclusive access, so it is not configured to auto-activate at boot time (on both nodes)?

No, that is not why they are presented to more than one node - not sure why it was done but I do not think they were ever actually shared, exclusively or concurrently.

>>>>When it was unpresented from the host where it was not being used, we started getting very regular errors logged similar to the following.

>>So can we say that the problem occurred just after you unpresented the device on the EMC Symmetrix and changed the zone config on the SAN?

Yes - that started the problems.

>>>>I was able to stop this from occurring by stopping the VG from being enabled at boot time. Otherwise, the errors caused nearly continuous problems while TRYing to boot.

>>So are you still getting the errors? If yes when do you get the errors? (Regularly, only at boot time? only at vg activation/deactivation?)

Errors occur if VG is activated and accessed. On one system right now the errors start when I try to use fsck.

>>Can you check if the device is really unpresented from the host that is not using the device?
#symmaskdb list assignment -dev XXXX

>>Have you checked the device status when you get the error?
#symdev show XXXX

What are these programs? They do not seem to be on our system.
Turgay Cavdar
Honored Contributor

Re: EMC Symmetrix LUN lost quorum

Hi Ken,
These programs are part of EMC Solutions Enabler. You can also use SMC or ControlCenter. I think your problem may be caused by your LUN being either write-disabled or not ready, so please check the LUN status on the array.

In our test environment, we set one of our EMC LUNs to write-disabled and got the following errors when we tried to mount the file system:
Nov 9 13:44:39 test vmunix: LVM: WARNING: VG 64 0x010000: LV 1: Some I/O requests to this LV are waiting
Nov 9 13:44:39 test vmunix: indefinitely for an unavailable PV. These requests will be queued until
Nov 9 13:44:39 test vmunix: the PV becomes available (or a timeout is specified for the LV).
Nov 9 13:44:44 test vmunix: LVM: VG 64 0x010000: Lost quorum.
Nov 9 13:44:44 test vmunix: This may block configuration changes and I/Os. In order to reestablish quorum at least 1 of the following PVs (represented by current link) must become available:
Nov 9 13:44:44 test vmunix: <1 0x00000b>
Nov 9 13:44:44 test vmunix: LVM: VG 64 0x010000: Reestablished quorum.
the following PVs (represented by current link) must become available:
<1 0x00000b>
LVM: VG 64 0x010000: Reestablished quorum.
LVM: VG 64 0x010000: Lost quorum.
This may block configuration changes and I/Os. In order to reestablish quorum at least 1 of the following PVs (represented by current link) must become available:
<1 0x00000b>
LVM: VG 64 0x010000: Reestablished quorum.
LVM: VG 64 0x010000: Lost quorum.
This may block configuration changes and I/Os. In order to reestablish quorum at least 1 of the following PVs (represented by current link) must become available: