
Re: volume stuck in mount verify

 
Tim Nelson
Honored Contributor

volume stuck in mount verify

Ran into a situation where HP was of little help beyond offering the Windows solution (reboot).

One volume out of many seemed to be switching paths and consequently going into mount verification. That volume is now stuck in mount verify.

Environment:
Two-node cluster, OpenVMS V7.3-1
Multipathed, SAN-connected Symmetrix storage.

All other volumes are working from both nodes on both paths with no problems.

- The specific volume on the Symmetrix side shows no errors, ready, and write enabled.
- Node 1: show dev/mount shows the volume as mounted, but access to it hangs.
- Node 2: show dev/mount shows mount verify, and any access hangs.

Is there any way to recover from this other than rebooting? DISMOUNT/ABORT does nothing. No errors are reported on the volume from either node.

Peter Quodling
Trusted Contributor

Re: volume stuck in mount verify

Suggestion 1. Replace the Symmetrix with real storage... :-)

Likelihood is that something still thinks it has an open connection to this storage.


Are any processes in resource waits on either cluster member? If so, have you tried walking through that process's I/O with SDA?

What does SHOW DEVICE/FILES return?

What setting do you have for the SYSGEN parameter MVTIMEOUT?
Do you have VIOC turned on?

q
Leave the Money on the Fridge.
Tim Nelson
Honored Contributor

Re: volume stuck in mount verify

1) If HP had real enterprise storage solutions I would :) (you started it)

2) Will have to verify the rest. Will post the remainder tomorrow.

Thanks for the initial tips.

John Gillings
Honored Contributor

Re: volume stuck in mount verify

Tim,

If the volume has reached mount verification TIMEOUT, then the only way to recover that device is a reboot (that's the *purpose* of MVTIMEOUT). However, with most controller based storage, it's not necessary that the device is accessed as the same physical device name. You may be able to go to the controller and re-present the device on a different LUN. OpenVMS then sees it as a new physical device, which you may be able to mount. That all depends on what state the controller thinks it's in.

If you haven't reached timeout, you may be able to force a path switch with SET DEVICE/SWITCH. This *might* recover from mount verify.

V7.3-1 is now unsupported. You should install V7.3-2 (and please don't tell me your application is stuck - it isn't!). Installing V7.3-2 might not fix the problem, BUT at least if it doesn't you will be able to elevate it. If you attempted to elevate this case at the moment, the first step would be to install V7.3-2 plus all rating 1 patches.
A crucible of informative mistakes
Robert Brooks_1
Honored Contributor

Re: volume stuck in mount verify

I agree with everything John Gillings said . . .


If the device was switching paths automatically (that is, no one was issuing DCL commands to manually switch paths), then VMS (specifically, the Multipath layer that manages multiple paths) was doing its job. In response to an I/O error that is subject to mount verification, multipath will probe all the paths to the device to find a working path. Multipath will not switch to a path unless it has verified that the path is working. However, when that probing is done, we just issue reads; no writes are attempted.

One can have the rather odd situation (but we've seen it in our lab) where one could read from, but not write to, a device. So, this device would exhibit the rather annoying behaviour of entering mount verification, choosing a path, exiting mount verification, and then starting the whole routine over again when a write I/O came along. We call this the "bent pin problem", because the root cause was a bent pin on a SCSI connector such that no writes would ever complete.
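To make the loop concrete, here is a rough Python sketch of the behaviour described above. It is a toy model, not VMS multipath code: the `Device` class, `probe_paths`, `replay`, and the `max_cycles` cutoff are all hypothetical names invented for illustration, and the fault is assumed to sit on the device itself so that it affects every path equally.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical device: a fault here is visible on every path."""
    paths: list = field(default_factory=lambda: ["path A", "path B"])
    reads_ok: bool = True
    writes_ok: bool = True


def probe_paths(dev):
    """Read-only probe: return the first path that answers a read, else None.

    Writes are never attempted here, which is the point of the example.
    """
    for path in dev.paths:
        if dev.reads_ok:
            return path
    return None


def replay(dev, ops, max_cycles=5):
    """Replay 'read'/'write' operations.

    Returns (mount_verify_cycles, outcome). max_cycles is an arbitrary
    cutoff so the simulation ends; it does not model MVTIMEOUT.
    """
    cycles = 0
    for op in ops:
        # An I/O failure sends the device into (simulated) mount verification.
        while not (dev.reads_ok if op == "read" else dev.writes_ok):
            cycles += 1
            if probe_paths(dev) is None:
                return cycles, "no working path"
            if cycles >= max_cycles:
                return cycles, "looping"
            # The probe passed, so the same write is retried and fails again.
    return cycles, "ok"


if __name__ == "__main__":
    print(replay(Device(), ["read", "write"]))
    print(replay(Device(writes_ok=False), ["read", "write"]))
```

The healthy device completes with zero cycles, while the "bent pin" device passes every read probe and still never completes the write, so it keeps re-entering mount verification until the cutoff.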

So, my guess is that there is something wrong with the LUN that is causing both nodes in the cluster to have connectivity problems.

-- Rob (Multipath wonk)
Peter Quodling
Trusted Contributor

Re: volume stuck in mount verify

Points back to our favourite enterprise storage supplier...

re .last

Wonk? Wepository Of Nearly-all Knowledge?

Asking my Best friend (Google) http://www.wordsmith.org/words/wonk.html says

An expert who studies a subject or issue thoroughly and excessively.
Personally, I prefer the description "Thaumaturge" - google it yourself. (There used to be a guy at DEC with it as his title on his business cards.)
Leave the Money on the Fridge.
Bart Zorn_1
Trusted Contributor

Re: volume stuck in mount verify

Even on OpenVMS V7.3-2, many ECOs have already been issued that correct all kinds of Fibre and SCSI problems. HP will no doubt ask you to upgrade to V7.3-2 plus ECOs before they dive deeper into the problem, that is, if you open a case with them.

We have had many problems with both EMC and HDS storage, but the last few months things have stabilized.

By the way: from an OpenVMS point of view I most certainly consider neither EMC nor HDS "Enterprise" storage. There is more to "Enterprise" than sheer volume and performance. Some basic support from either vendor for OpenVMS would be a requirement too!

Regards,

Bart Zorn
Thomas Simpson
Advisor

Re: volume stuck in mount verify

BTW - The root problem for this issue turned out to be an EMC issue. The disk in question had been used previously in a Windows environment and apparently the normal EMC initialization process left some traces behind which screwed things up when the volume was re-used in the VMS environment. There is some additional low-level initialization that needs to be done on any volume that has been used by Windows. Apparently Unix had similar issues with the same disk. Tim can provide additional details.
Eberhard Wacker
Valued Contributor

Re: volume stuck in mount verify

Very interesting. BTW: was an INIT/ERASE done at the DCL level before trying to use it on VMS?
Cheers,
EW
Tim Nelson
Honored Contributor

Re: volume stuck in mount verify

Here is the rest of the story, which is still being resolved.

After the reboot I moved the volume in question to a separate VMS host to continue troubleshooting. The same problem occurs (hung in mount verify) even after applying all relevant Fibre/SCSI patches.

I then moved the volume to an HP-UX server. HP-UX reported an I/O error when writing to this disk.

Aha, there is something desperately wrong with this volume.

Had EMC VTOC the volume in question. This should take care of it.

After the VTOC, the meta was re-created and assigned back to the HP-UX server. The volume then hung during the newfs command.

Currently waiting for more options from EMC.

Will keep you updated.