Disk IO retry - OpenVMS 7.3-2
05-08-2007 08:53 PM
We have an EMC storage array that does RAID for us and presents OpenVMS with 10 disks.
We are running OpenVMS 7.3-2 with Update V8. Yes, I know we are a little behind with the updates. We access the disks from all four nodes in the cluster, with 2 HBA cards per server. Both cards are single-port and have fibre cables connected to them. We do not MSCP-serve disks between nodes.
A few weeks ago someone did a reconfiguration of some type on the EMC storage. As a result, IO to the disks on all 4 nodes stalled for 4.9 seconds. Then things resumed.
The EMC support team claims that no other connected systems were affected, those other systems being Windows and Solaris. They also claim that the stall in IO would have been only approx. 1 second.
My points are the following:
- The other systems might not register a stall of a second or more as an outage.
- If the outage was only approx. 1 second, then IO would have stalled for only 1 second and not 4.9.
Would this be the case?
Could a 1-second stall in IO cause VMS to stall IO for approx. 4.9 seconds?
I checked the operator logs and other logs. No multipath switching took place during the IO stall.
We consider a 0.5-second outage as application unavailable. Our cluster is more or less as real-time as you can get: the cluster RECNXINTERVAL is set as low as 4 seconds, along with the associated parameters.
Comments?
05-08-2007 10:16 PM
My response to this situation...
The word glib comes to mind.
ANY IO delay is unacceptable!
I would tell the people managing the EMC box to fix it
(i.e., if the EMC box was working before, then it can work again).
So what did they change?
Have these delays started on ALL systems since the EMC revision?
Does the change to the EMC imply that the fabric configuration needs revising?
You say there's no path switching going on. Path switching could account for a delay, and there's obviously a problem with the new configuration.
To confirm this, do a:
$ sho device
Look and see whether all the "operations completed" counts are where you expect them to be.
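A minimal sketch of that check, assuming a Fibre Channel disk with the placeholder name $1$DGA100: (substitute one of your own devices). The per-path counts appear in the /FULL form of SHOW DEVICE, so comparing the output before and after an event should show which path actually carried the I/O.

```
$ ! $1$DGA100: is a placeholder device name, not one taken from this thread
$ SHOW DEVICE/FULL $1$DGA100:
$ ! On a multipath device, look at the "I/O paths to device" section:
$ ! each path has its own "Error count" and "Operations completed",
$ ! and the path in use is normally labelled "current path".
```

If the operations-completed count is growing on a path other than the one you expect, a path switch has happened at some point even if nothing was logged.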
Regards
Steven
05-08-2007 10:47 PM
Could an IO error have triggered mount verification on the disks?
Mount verifications might not be logged to OPCOM; see the MVSUPMSG_INTVL and MVSUPMSG_NUM SYSGEN parameters.
Volker.
$ mc sysgen show MVSUPMSG
Parameter Name    Current  Default  Min.  Max.  Unit        Dynamic
--------------    -------  -------  ----  ----  ----        -------
MVSUPMSG_INTVL       3600     3600     0    -1  Seconds     D
MVSUPMSG_NUM            5        5     0    -1  Pure-numbe  D
$
05-09-2007 02:13 AM
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1066685
Maybe the minimum recovery time is about 5 seconds?
Wim
05-09-2007 02:19 AM
While deploying additional storage (HDS) we had a cluster hang. Any attempt at I/O to the EMC would "hang" that server.
No mount verification message ever came back from the frame. So apparently, not all the communication you would expect to see is available on the EMC paths.
We backed out the HDS changes, crashed and rebooted, and all I/O was restored.
05-09-2007 07:06 AM
The bottom line is that I think the controller returned an error, and recovery from such a serious event may take a couple of seconds.
Jur.
05-09-2007 08:02 AM
Multipath can add some time, but since multipath does its work in the context of mount verification, you'd expect to see the OPCOM messages. However, if mount verification message suppression is enabled (as it is by default), it's difficult to figure out what's going on.
Attempting to troubleshoot this after the fact is nearly impossible. A tool to use *while this problem is happening* would be the DKLOG SDA extension, which will log all the SCSI commands and the SCSI statuses coming back from the controller.
-- Rob