Using TDC to analyse Disk Queue Length

 
Francisco Hermida
Occasional Advisor

Hello, 

We are experiencing intermittent performance problems and suspect they are related to disk I/O.

I would like to get info from MONITOR DISK /ITEM=QUEUE /INT=1 in a CSV format.

I tried TDC tool and TDC API, but I cannot find any field showing disk queue length.

I thought it should be TDC_DSK_QLEN as shown in SYS$COMMON:[TDC.SDK.HEADERS]TDC_DSK.H but it is always 0 in all captures, while MONITOR shows values different from 0.

Any idea how to get disk queue length?

More info:

    /*
    ** TDC_DSK_QLEN [C]: the value returned by SYS$GETSPI for item
    ** SPI$_DISKS.
    ** This is the accumulated count of queued I/O operations to the volume
    ** (those which could not be immediately performed).
    ** Average response time can be derived from this value and TDC_DSK_OPCNT.
    */
    TDC_DSK_QLEN,

TDC> show version

TDC$CP: TDC$CP, V2.3-30020 for HP OpenVMS (I64) V8.3, 24-Jun-2009:06:46:39
TDC$APISHR: TDC$APISHR, V2.3-30020 for HP OpenVMS (I64) V8.3, 24-Jun-2009:06:46:39
TDC$LIBSHR: TDC$LIBSHR, V2.3-30020 for HP OpenVMS (I64) V8.3, 24-Jun-2009:06:46:39
TDC_EXEC_C: TDC_EXEC_C$I_V830-0203-0020.EXE
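
The header comment above describes TDC_DSK_QLEN as an accumulated count, so a per-interval figure would come from differencing successive captures rather than from reading a single value. Below is a minimal sketch of that differencing, assuming TDC_DSK_QLEN and TDC_DSK_OPCNT are both monotonically increasing counters. The function name and the snapshot values are hypothetical, and this is plain arithmetic, not TDC API code. The units of the ratio depend on how the counter is accumulated, which the comment does not spell out.

```python
def interval_deltas(samples):
    """Yield (delta_qlen, delta_opcnt, queued_per_op) for consecutive snapshots.

    samples: list of (qlen_accumulated, opcnt_accumulated) tuples, oldest first.
    queued_per_op is None when no operations completed in the interval.
    """
    for (q0, o0), (q1, o1) in zip(samples, samples[1:]):
        dq, do = q1 - q0, o1 - o0
        yield dq, do, (dq / do if do else None)


# Hypothetical snapshots of (TDC_DSK_QLEN, TDC_DSK_OPCNT), for illustration only.
snapshots = [(100, 1000), (100, 1050), (130, 1110)]

for dq, do, ratio in interval_deltas(snapshots):
    print(dq, do, ratio)
```

If the raw field never changes between captures, every delta is 0, which is the symptom described here. That would suggest the field is not being populated by the collector, rather than a problem in the arithmetic.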
abrsvc
Respected Contributor

Re: Using TDC to analyse Disk Queue Length

There is little information here for us to comment on, but...

Other than Monitor showing a queue length of >0 on occasion, what is the performance problem you are seeing?

It is normal to see non-zero queue lengths for short periods when applications get "synch'd" and there is an overwhelming amount of I/O at a time. If the queues persist, then there is likely a problem. Are the queues always non-zero or just once in a while? It is possible that the TDC collection period is such that it is not seeing the non-zero queues.

Gather monitor data (or TDC) for a longer period at perhaps a shorter collection interval and analyze the data.  Look for periodic instances of non-zero queues.
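
To illustrate why the collection interval matters, here is a small sketch with made-up numbers: a 60-second series of instantaneous queue lengths containing one 4-second burst, sampled once per second and once every 10 seconds. The series and the `sample` helper are hypothetical, and the sketch assumes the collector reads an instantaneous value at each tick.

```python
def sample(series, interval, offset=0):
    """Return every `interval`-th value of a 1-second series, starting at `offset`."""
    return series[offset::interval]


# Hypothetical 60 seconds of instantaneous queue length: idle except for
# a 4-second burst at seconds 23..26.
queue = [0] * 60
queue[23:27] = [3, 7, 5, 2]

every_second = sample(queue, 1)
every_ten = sample(queue, 10)   # reads seconds 0, 10, 20, 30, 40, 50

print(max(every_second))  # the burst is visible at 1-second resolution
print(max(every_ten))     # every 10-second sample lands outside the burst
```

With these numbers the 10-second collection reports an idle disk for the whole minute. A collector that differences an accumulated counter would not lose the burst in the same way, so it is worth knowing which kind of value is being read.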

Any additional information you can provide about the environment will help too.  For example: machine type, memory size, OpenVMS version, network software and version etc.

Dan

Hoff
Honored Contributor

Re: Using TDC to analyse Disk Queue Length

Ignoring that the CSV format is more arcane and problematic and incompatible than some folks might realize, the free HPE T4 tool can extract this and other data from data recorded by MONITOR.    Or MONITOR can record data and the format of the MONITOR data is documented in the manuals, if you want to put together some code to open the data file and use libcsv or such.

exe$getspi is undocumented, unsupported and is somewhat sketchy as interfaces go, and is used underneath MONITOR.    You'll usually need the source listings or a fair amount of rummaging to get that to work.   With some coding errors in the calling code, an occasional system crash in exe$getspi wouldn't surprise me, either.   Not an approach I'd particularly recommend as easy or maintainable or compatible, and I'd suggest avoiding debugging related application code on a production server.

The documented and supported version of that exe$getspi API is sys$getrmi. Unfortunately, AFAIK that does not report device I/O statistics. That's an omission that VSI might consider resolving.

Another alternative that's sometimes useful for these is SNMP (unfortunately SNMPv2, insecure), but I don't see the disk queue length information available from that, either.

TL;DR: use MONITOR to record the data, and open and process the MONITOR data, and generate CSV yourself.   Reading the data file is a slog, but everything necessary for that code is available and documented, and there may be (are?) examples posted around the 'net.   Or use T4, which does most of this slog for you.
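
For the "generate CSV yourself" half of that, here is a minimal sketch of the output side only, written in Python for an offline box. It assumes the samples have already been decoded from the MONITOR recording into (timestamp, device, queue length) tuples; that decoding step is not shown, and the row values and the `samples_to_csv` name are made up. Letting a CSV library do the quoting avoids most of the format's traps, such as commas and quotes inside a field.

```python
import csv
import io


def samples_to_csv(samples):
    """Render (timestamp, device, queue_length) rows as CSV text with a header."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["timestamp", "device", "queue_length"])
    writer.writerows(samples)
    return buf.getvalue()


# Hypothetical rows, as if already decoded from a MONITOR recording file.
# The last device string is contrived to show quoting of commas and quotes.
rows = [
    ("2010-03-01 10:00:00", "$1$DGA100:", 0.0),
    ("2010-03-01 10:00:01", "$1$DGA100:", 2.5),
    ("2010-03-01 10:00:01", 'DISK "A", shadow', 1.0),
]

print(samples_to_csv(rows), end="")
```

Reading the result back with `csv.reader` returns the original strings, including the embedded comma and quotes.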

Francisco Hermida
Occasional Advisor

Re: Using TDC to analyse Disk Queue Length

Thank you, I will probably give T4 and friends a try. However, TDC is documented and is installed by default as a layered product in newer versions of OpenVMS, so I thought it would be the official way of getting system performance information.

Moreover, getting performance information into a text file (CSV or whatever) is just the first step. By using TDC API, I was able to send live process performance data via the network to a sort of central performance monitoring system where it can be stored.

I have only one mirrored disk in my system, so writing MONITOR data to the very disk I suspect of slowing down the system is not an attractive option. The problem is intermittent, so I need to continuously log each process's direct I/O together with the disk queue length and operation rate, and then perform an offline analysis afterwards.
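
As a rough illustration of the "send it over the network instead of writing to the local disk" idea, here is a sketch of the sending side. Everything in it is an assumption made for the example: the line-based record layout, the function names, and the use of one UDP datagram per sample. It does not use the TDC API and is not the code I actually run.

```python
import socket


def encode_sample(timestamp, device, queue_length, op_rate):
    """Format one sample as a newline-terminated UTF-8 line."""
    return f"{timestamp},{device},{queue_length},{op_rate}\n".encode("utf-8")


def send_sample(sock, address, sample):
    """Send one encoded sample as a single UDP datagram via `sock`."""
    sock.sendto(encode_sample(*sample), address)


# Hypothetical sample values, for illustration only.
example = ("2010-03-01 10:00:01", "$1$DGA100:", 2.5, 140.0)
print(encode_sample(*example))
```

In use, `sock` would be a `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` object and `address` the (host, port) of the central collector. UDP keeps the sender from blocking when the collector is slow, at the cost of possibly dropping samples.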