Operating System - OpenVMS

Re: Disk IO retry - OpenVMS 7.3-2

 
SOLVED
Kevin Raven (UK)
Frequent Advisor

Disk IO retry - OpenVMS 7.3-2

We have a four-node ES45 cluster: a shared system disk and nine other shared disks. We do not shadow any disks.
We have an EMC storage array that does RAID for us and presents OpenVMS with 10 disks.
We are running OpenVMS 7.3-2 with Update V8 (yes, I know we are a little behind with the updates). We access the disks from all four nodes in the cluster, with two HBA cards per server. Both cards are single-port, with fibre cables connected to them. We do not MSCP serve disks between nodes.
A few weeks ago someone did a reconfiguration of some type on the EMC storage. As a result, I/O to the disks on all four nodes stalled for 4.9 seconds. Then things resumed.
The EMC support team claim that no other connected systems were affected (these other systems being Windows and Solaris). They also claim that the stall in I/O would have been only approximately 1 second.
My point is the following:
- The other systems might not measure anything more than a second of stall as an outage.

- If the outage was only approximately 1 second, then I/O would have stalled for only 1 second, not 4.9.
Would this be the case?
Could a 1-second stall in I/O cause VMS to stall I/O for approximately 4.9 seconds?
I checked the operator logs and other logs. No multipath switching took place during the I/O stall.

We consider a 0.5-second outage as application unavailability. Our cluster is about as close to real-time as you can get: cluster RECNXINTERVAL set as low as 4 seconds, along with the associated parameters.

Comments ?
30 REPLIES
Volker Halle
Honored Contributor
Solution

Re: Disk IO retry - OpenVMS 7.3-2

Kevin,

Could an I/O error have triggered mount verification on the disks?

Mount verifications might not be logged to OPCOM; see the MVSUPMSG_INTVL and MVSUPMSG_NUM SYSGEN parameters.

Volker.
Steve-Thompson
Regular Advisor

Re: Disk IO retry - OpenVMS 7.3-2

Hi Kevin

My response to this situation...
The word "glib" comes to mind.

ANY I/O delay is unacceptable!

I would tell the people managing the EMC box to fix it (i.e., if the EMC box was working before, then it can work again).

So what did they change?
Have these delays occurred on ALL systems since the EMC revision?
Does the change to the EMC imply revising the fabric configuration?

You say there's no path switching going on; path switching could account for a delay if there were a problem with the new configuration.

To confirm this, do a:
$ SHOW DEVICE/FULL

and check whether all the "operations completed" counts are where you expect them to be.
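A quick way to watch those counters move is the F$GETDVI lexical. A rough sketch (the device name $1$DGA100 is just an example; substitute one of your EMC devices):

$! Sample the "operations completed" count twice and show the delta.
$ before = F$GETDVI("$1$DGA100","OPCNT")
$ WAIT 00:00:10
$ after = F$GETDVI("$1$DGA100","OPCNT")
$ delta = after - before
$ WRITE SYS$OUTPUT "I/Os completed in 10 seconds: ''delta'"

If a path or device is stalled, the delta for that interval drops to zero.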

Regards
Steven


Kevin Raven (UK)
Frequent Advisor

Re: Disk IO retry - OpenVMS 7.3-2

The storage guys will never do a config change during production hours again, so we will not see any further 1-second or 4.9-second I/O delays. I just wanted to get to the bottom of how a 1-second delay in I/O on the EMC storage (if that was the case!) can translate into a 4.9-second stall on the VMS servers.

Kevin Raven (UK)
Frequent Advisor

Re: Disk IO retry - OpenVMS 7.3-2

"Kevin,

Could an I/O error have triggered mount verification on the disks?

Mount verifications might not be logged to OPCOM; see the MVSUPMSG_INTVL and MVSUPMSG_NUM SYSGEN parameters.

Volker."

$ mc sysgen show MVSUPMSG
Parameter Name   Current  Default  Min.  Max.  Unit        Dynamic
--------------   -------  -------  ----  ----  ----        -------
MVSUPMSG_INTVL      3600     3600     0    -1  Seconds        D
MVSUPMSG_NUM           5        5     0    -1  Pure-numbe     D
$
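Since both parameters are dynamic (the D in the last column), suppression could be relaxed temporarily so every mount verification shows up in OPCOM. Something along these lines; this is a sketch, and the assumption that an interval of 0 disables suppression should be checked against SYSGEN HELP before touching production:

$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SET MVSUPMSG_INTVL 0    ! assumed: 0 disables suppression
SYSGEN> WRITE ACTIVE
SYSGEN> EXIT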
Ian Miller.
Honored Contributor

Re: Disk IO retry - OpenVMS 7.3-2

Could the error have led to a queue-full being reported back to VMS for that storage controller port, with VMS then backing off sending I/O for a while?
____________________
Purely Personal Opinion
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO retry - OpenVMS 7.3-2

Read this too:
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1066685

Maybe the minimum recovery time is about 5 seconds?

Wim
James Cristofero
Occasional Advisor

Re: Disk IO retry - OpenVMS 7.3-2

I also have EMC storage on Alpha, running UPDATE V7 and Fibre-SCSI V9. We have a dedicated DMX3000 attached to 19 Alphas, with either 2 HBAs or 4.

While deploying additional storage (HDS) we had a cluster hang. Any attempt at I/O to the EMC would "hang" that server.

No mount verification ever came back from the frame. So apparently, not all the communication you would expect to see is available on the EMC paths.

We backed out the HDS changes, crashed/rebooted, and all the I/O was restored.
Jur van der Burg
Respected Contributor

Re: Disk IO retry - OpenVMS 7.3-2

The big question is whether the EMC controllers returned an error or not. If they just stalled for one second, then there's nothing in VMS that would stall the request for more than that time. If the controller returned an error, then mount verification would have kicked in, which may have been suppressed. Mount verification will stall all I/Os and issue a PACKACK to DKDRIVER every second until it gets a response or hits the mount verification timeout (3600 seconds by default). The PACKACK issues a SCSI TEST UNIT READY command, so if that command was delayed by the controller, it may explain the delay. From a VMS perspective, multipath may add some additional seconds as it participates in the error recovery.

Bottom line: I think the controller returned an error, and recovery from such a serious event may take a couple of seconds.
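For reference, the timeout I mean is the MVTIMEOUT system parameter; you can check your current value with something like:

$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SHOW MVTIMEOUT
SYSGEN> EXIT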

Jur.
Robert Brooks_1
Honored Contributor

Re: Disk IO retry - OpenVMS 7.3-2

From a VMS perspective, multipath may add some additional seconds as it participates in the error recovery.

--

Multipath can add some additional time, but since multipath does its work in the context of mount verification, you'd expect to see the OPCOM messages. However, if mount verification suppression is enabled (as it is by default), then it's difficult to figure out what's going on.

Attempting to troubleshoot this after the fact is nearly impossible. A tool to use *while this problem is happening* would be the DKLOG SDA extension, which will log all the SCSI commands and the SCSI statuses coming back from the controller.

-- Rob
Kevin Raven (UK)
Frequent Advisor

Re: Disk IO retry - OpenVMS 7.3-2

Thanks for the input so far. I will be trying to emulate the I/O hang on our development cluster on Friday. I will let you know how it goes and what I find out.

Thanks
Kevin
Volker Halle
Honored Contributor

Re: Disk IO retry - OpenVMS 7.3-2

Kevin,

when trying to reproduce the I/O hang, consider starting some of the OpenVMS built-in SDA extensions to capture more detailed data.

You can get some help and examples of using them at:

http://eisner.encompasserve.org/~halle/

The following extensions may be useful:

$ ANAL/SYS
SDA> DKLOG
SDA> IO
SDA> FC

Volker.
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO retry - OpenVMS 7.3-2

Concerning I/O blockage:

I did a test on a 4100 with an HSZ70 running 7.3 and found that:

1) splitting a shadow set froze I/O for 0.2 seconds
2) reforming a shadow set froze I/O for 2 seconds
3) upon shadow copy completion (with bitmaps; I didn't check in the test without them), I/Os were blocked for 0.3 sec, 0.6 sec and 0.51 sec (a lock taken 3 times?)

With/without bitmap had no big influence. During the shadow copy, some I/Os took 0.07 sec instead of 0.01.

Fwiw

Wim
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO retry - OpenVMS 7.3-2

Same test on an inter-building cluster of GS160s with dual HSG80s in each building, but only with bitmap.

1) dismount: 2.5 sec
2) mount: 1.9 sec
3) on copy completion: 8.9 + 0.6 + 1.9 sec

Wim
Kevin Raven (UK)
Frequent Advisor

Re: Disk IO retry - OpenVMS 7.3-2

Did a test Friday with a test/development EMC storage array. A config was written: some disks that were not live were changed on the EMC storage and the config was written out.

A looping DCL script wrote timestamps to a flat file, at a rate of 55 per 1/100th of a second, or 5,500 per second.
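For anyone who wants to repeat this, the script was essentially this sort of loop (a sketch from memory; the file name is made up):

$! Write timestamps in a tight loop; gaps between consecutive
$! recorded times show where I/O to the disk stalled.
$ OPEN/WRITE tsfile DKA100:[TEST]STAMPS.DAT
$LOOP:
$ WRITE tsfile F$TIME()
$ GOTO LOOP
$! (interrupt with Ctrl-Y, then CLOSE tsfile)

Note that F$TIME() only resolves to 1/100th of a second, so that is the smallest stall the method can see.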

During the EMC config, several I/O stalls took place, ranging from 0.03 seconds to a massive 1.8 seconds.

Now we need to run further tests.

This was the first pass.

Regards
Kevin

Rob Young_4
Frequent Advisor

Re: Disk IO retry - OpenVMS 7.3-2


> During the EMC config several IO stalls took place that ranged from .03 seconds to a massive 1.8 seconds.

I'll bet you are adding storage and pushing out zoning changes (RSCNs are the gotchas).

You will want to avoid storage changes, and particularly zoning changes, during normal working hours.

In a previous job I heard about how they used to merrily make storage and zoning changes during working hours. Guess what? Real-time instrument acquisitions don't like long pauses, do they?

So, painfully, all that work was moved to off hours (2 a.m. on weekends).

Welcome to the real world, Neo...

Rob Young_4
Frequent Advisor

Re: Disk IO retry - OpenVMS 7.3-2


Kevin,

Another thought... I realize you probably aren't zoning on this EMC config. But what may be happening is that when a new hyper/meta is created and presented, it may require the Symm to momentarily place a global lock on the cache (or a section of cache) to set aside cache lines for the newly created hyper/meta. With multiple gigabytes of cache, it may take a while to take the lock out, do the work and release it (>1 second being a "while").

I have a lot of "may" above; I don't know. The problem is there is a good deal unknown (to me) about how the Symm cache works, and I have been digging for a long time, so it might just be a closely held piece of engineering knowledge (or I haven't stumbled upon the right person yet).

You're going to have to open a call with EMC support and describe your problem; perhaps they can shed some light.

Commenting on the hang you are experiencing: it really isn't that great, but in a real-time data acquisition scenario it could well be unacceptable. My comment about moving storage and zoning changes to off hours was based on my personal history with EMC Symms.

Rob
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO retry - OpenVMS 7.3-2

Typo in my last post:
3) on completion copy 8.9 + 0.6 + 1.9 sec
should be:
3) on completion copy 0.9 + 0.6 + 1.9 sec
Wim
Kevin Raven (UK)
Frequent Advisor

Re: Disk IO retry - OpenVMS 7.3-2

The config changes were as below; an extract from an e-mail from our EMC chaps:


"However, the change we did make was to the SCSI3 bit on 3 devices. The array would still go through the same process to prepare and commit the change, therefore forcing an IML of the directors. It is at this point where we are seeing a delay. It's normal for the array to behave in this fashion, and at this stage it looks like any config change we make is going to affect your servers..."
Rob Young_4
Frequent Advisor

Re: Disk IO retry - OpenVMS 7.3-2


> SCSI3 bit set, IML the directors

Well, you can close the loop on this one. Curiously, why would 4 drives out of dozens not have that bit set?

When they went to change control and had approval for doing this work, did they inform change control that the directors on the Symm would be rebooting, etc.?
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO retry - OpenVMS 7.3-2

Not on topic, but...

Today a disk was replaced in a RAID set on the GS160/HSG80 (same config as above).

During this operation I monitored the drive, and I/O was stalled 3 times: for 0.1 sec, then 0.4 sec, and finally for 1.6 sec.

Fwiw

Wim
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO retry - OpenVMS 7.3-2

I started a backup of a large file from a disk (HSG80) to the same disk and looked at how another process was delayed doing writes to the same disk.

1 out of 5 writes was delayed, by an average of 0.6 sec!

During a backup of the file to another disk, 1 out of 2 writes to the FROM disk was delayed, by an average of 0.3 sec.

During a backup of a file from another disk to this disk, the writes were not delayed at all.

During normal operation of the disk (or even a copy of a large file) I saw only delays of a few 1/100 sec.

Fwiw

Wim
Jur van der Burg
Respected Contributor

Re: Disk IO retry - OpenVMS 7.3-2

If you run BACKUP it can throw a large number of I/Os at the disk, and seeing a delay of 0.6 seconds is not unusual at all. It depends on the quotas that you gave the backup process. I've seen people giving it a DIOLM of 4096; guess what happens to the response time of a single I/O if you queue that many I/Os to the disk.
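You can check what a process is actually running with via the F$GETJPI lexical, e.g. for the current process:

$! Show the direct I/O limit of the current process.
$! Run this from the process that will do the backup.
$ WRITE SYS$OUTPUT "DIOLM = ", F$GETJPI("","DIOLM")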

Jur.
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO retry - OpenVMS 7.3-2

Jur,

I know. I just had no idea of the size of the delay.

I also just saw, during normal Sybase operations, a delay of 0.8 seconds! So now I've redone the test at increased priority, to be sure I'm first in getting the CPU, but with the same results.

Wim
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO retry - OpenVMS 7.3-2

Zero points for me: there was a bug in my script.

It's not 1 out of 5 I/Os that was delayed, but 10 I/Os per minute when doing continuous I/O.

Likewise for the 1-out-of-2 figure: it's 30 per minute.

Sorry

Wim