Operating System - OpenVMS

Disk IO without doing IO's

 
Wim Van den Wyngaert
Honored Contributor

I have found a cluster on which a program is doing strange things.

Normal situation : the program has no significant thruput to any disk.

Current situation : the program has the top thruput to some disks (not always the same disk).

In VPA for 21:00 - 23:59
1) bufio almost 0
2) direct IO rate : 0.004 (continuously)
3) disk IO about 100 per sec, thruput about 4 MB per sec (item DISKTP, to disk dsa2)
4) IO size about 80 pages
5) no top hot files
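
As a rough cross-check of items 3 and 4, the I/O rate multiplied by the I/O size should reproduce the reported thruput. The sketch below is only illustrative arithmetic in Python; the function name is made up, and it assumes that a VPA "page" here is a 512-byte block, which the post does not state.

```python
# Cross-check: does I/O rate x I/O size match the reported throughput?
BYTES_PER_PAGE = 512  # assumption: a VPA "page" is a 512-byte block


def throughput_bytes_per_sec(io_per_sec: float, pages_per_io: float) -> float:
    """Return the disk throughput implied by an I/O rate and an average I/O size."""
    return io_per_sec * pages_per_io * BYTES_PER_PAGE


implied = throughput_bytes_per_sec(100, 80)  # figures from items 3 and 4
print(f"{implied / 1_000_000:.2f} MB/s")     # 100 * 80 * 512 = 4,096,000 bytes/s
```

Under that assumption, 100 IO's per sec at 80 pages each comes to about 4.1 MB per sec, which agrees with the DISKTP figure, so the disk numbers are at least consistent with each other.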

My questions.
1) How can I analyze what is causing this thruput ?
2) How can there be disk IO without direct IO ?

VMS 7.3. Also seen on 6.2 but difficult to confirm.

Wim



Willem Grooters
Honored Contributor

Re: Disk IO without doing IO's

Wim,

Could it be that the program bypasses RMS to access the disk (logical or even physical IO)? It might be scanning the disk for some reason, reading chunks of data this way (fast access). Or accessing particular locations?
I guess you'd need to check the system to locate the program that does these strange things and analyze its device access.
Willem Grooters
OpenVMS Developer & System Manager
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO without doing IO's

Willem,

If I do sda show proc/chan I see :

channel to dsa10
some exe stuff
std in/out
4 bg devices of which 3 are busy
1 has 3 operations, 1 has 433K op, 1 has 744K and the non-busy has 0 operation. This after 23 days uptime. So the BG device can't be blamed ?

How can I check the physical and logical IO ?

Wim

Andy Bustamante
Honored Contributor

Re: Disk IO without doing IO's

What do your paging figures look like?

Does the program use global sections backed by files? In this case VMS will report the I/O as a hard page fault.

If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
Richard White_5
Advisor

Re: Disk IO without doing IO's

Good Morning Wim...

Very interesting problem... Both Andy and Willem have suggested excellent avenues to pursue. You mentioned that the I/O is not always to the same disk-volume, so Andy's suggestion of possible Paging-I/O may very well be valid, especially if there are Page-Files on the same disk(s) that DISKTP was showing the I/O activity.

As for Willem's suggestion of the QIO's possibly bypassing RMS, you might be able to use "SDA> sho proc/rms/ind=xxxx "... If the particular process is using RMS to execute RMS_Services (Open, Get, Put, etc.) to a particular file; then the 6th and 7th screens from the SDA> command will show you some of the pertinent info.

For example the sixth screen will display the Channel-Control-Block Address, the Unit-Control_Block Address of the Disk-Volume, and the Window-Control-Block Address that contains the retrieval pointers to the file.

The seventh screen of "SDA>sho proc/rms" will display the File-Control-Block with File-Header info, and the total number of Reads and Writes (per RMS) are also displayed on this seventh screen. Some of the prelim screens will display the FAB/RAB and XAB RMS-Info. If RMS_Services are not being utilized for the QIO, then a message indicating No_Image/RMS structures are loaded, will be displayed back to SDA>.

Still trying to pursue a method for tracking Phys_IO vs. Log_IO vs. Virt_IO...

Thanx,
whynot3k


Robert Gezelter
Honored Contributor

Re: Disk IO without doing IO's

Wim,

To separate the paging traffic from the regular IO, consider using a virtual disk (LD). Then, the non-paging traffic can be traced and the details analyzed.

- Bob Gezelter, http://www.rlgsc.com
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO without doing IO's

The process is an interface between a program and another process, either on the same machine or on another node of the same cluster. It uses TCP/IP and has nothing on disk. Details unknown.

Andy : no paging activity.

Richard : I currently have no channel visible to anything on dsa2, so show proc/rms can't show me anything. And VPA shows no file activity.

Wim
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO without doing IO's

Rather strange news :

Yesterday between 18 and 19h there was activity by the process going to dsa4. And dsa4 contains only Sybase database files. And the process has nothing to do with that disk but still there are IO's registered on that process ...

A bug in the program that confuses VPA ???

Wim
Willem Grooters
Honored Contributor

Re: Disk IO without doing IO's

Wim,

Given that the process uses BG devices - obviously, since it uses TCPIP - you can find out which ports it uses. Therefore a suggestion, to get at least a clue about what's going on: set up TCPTRACE to listen on the busy ports and dump what it finds. That would only be feasible if these ports are fixed, of course.
Perhaps it will give a clue about what's going on within that process.

Willem Grooters
OpenVMS Developer & System Manager
Richard White_5
Advisor

Re: Disk IO without doing IO's

Good Morning Wim...

Becoming more interesting by the hour... Is it possible that this process/program interface/communicates with an Application Server/Control Process, possibly via mailboxes or global_sections? Or is this a spawned-sub-process or multi-threaded process?

I realize that you are being asked to try multiple different suggestions, but if you are able to find the time during the next episode, I would like to request an output-log-file or screen-capture if possible.

If you could capture and forward the output of the following SDA> commands, (executed 3 or 4 times,) it may jog someone's memory (Andy, Willem, yourself, myself, etc.)

SDA> show proc/ind=xxxx
SDA> show proc/sema
SDA> show proc/proc (or /pst)
SDA> show proc/reg
SDA> show proc/imag
SDA> show proc/chan
SDA> show proc/lock
SDA> sho proc/rms

I refrain from suggesting SDA>show proc/all since I don't think that we need the WSLE's or PTE's at this point...

Thanx,
whynot3k
Cass Witkowski
Trusted Contributor

Re: Disk IO without doing IO's

Direct I/O does not always mean disk I/O. Direct I/O means that the data buffer is in the user's space and that the I/O device transfers data to or from that location.

In most if not all cases it means that DMA is being used in the transfer. The transfer takes less time, so the user process could be swapped out right after the transfer has completed, if necessary.

Buffered I/O means that the data in the user's space is first copied to a buffer in system space. From this system buffer it is then transferred out to the device. Typically this was programmed I/O, with one byte or word of data being transferred at a time. Old DZ11 serial ports come to mind. Ethernet devices use DMA nowadays, so their transfers are typically direct I/O.

Cass
Uwe Zessin
Honored Contributor

Re: Disk IO without doing IO's

It is the device driver writer who decides if the IO is counted against BIO or DIO.
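
One way to picture how a disk can show operations while a process shows neither DIO nor BIO is a toy model in which the device keeps its own operation count and the per-process counters are bumped only when a request is charged to that process. The Python below is a hypothetical illustration, not VMS internals: the class names, the `issue_io` function and the `charged_to` parameter are all invented for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Process:
    name: str
    dirio: int = 0  # per-process direct I/O count
    bufio: int = 0  # per-process buffered I/O count


@dataclass
class Device:
    name: str
    operations: int = 0  # per-device operation count


def issue_io(device: Device, kind: str, charged_to: Optional[Process] = None) -> None:
    """Count one I/O on the device; charge a process only if one is named."""
    if kind not in ("direct", "buffered"):
        raise ValueError(f"unknown I/O kind: {kind}")
    device.operations += 1
    if charged_to is None:
        return  # the device counter moved, no process counter did
    if kind == "direct":
        charged_to.dirio += 1
    else:
        charged_to.bufio += 1


daemon = Process("IPCDAEMON")
disk = Device("DSA2")
issue_io(disk, "direct", charged_to=daemon)  # one I/O charged to the process
for _ in range(100):
    issue_io(disk, "direct")                 # 100 I/Os charged to nobody
print(disk.operations, daemon.dirio, daemon.bufio)  # 101 1 0
```

Whether anything like this is happening on the cluster in question is still open; the model only shows that the two sets of counters need not move together.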
.
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO without doing IO's

But look at 1 and 2 : there is disk IO and NO direct nor buffered IO.

Wim
Cass Witkowski
Trusted Contributor

Re: Disk IO without doing IO's

Is high water marking turned on? Is there a process called, I think, BAD BLOCK running? Is this a shadowed disk member?
Richard White_5
Advisor

Re: Disk IO without doing IO's

Good Morning Wim...

>> Yesterday between 18 and 19h there was activity by the process going to dsa4. ... And the process has nothing to do with that disk but still there are IO's registered on that process ... <<

Exactly which utility is showing the Disk_I/O's being charged or registered to this "forsaken" process, while there are virtually no DIO's or BIO's against it? ($monitor, $show proc, DecPS, etc.)

As Uwe mentioned, it is the device-driver, in particular the Quadword-Mask fields at the front of the FDT-Routines, that help determine whether the QIO's are Buff_I/O's or Dir_I/O's. I do not know of any other type of I/O that is processed by the FDT.

Is it possible to capture some of the process-context, (Image/Process-Sections, Channel-Control-Blocks, User/Kernel-Stack, Call-Frames from SDA ?)

Thanx,
whynot3k

Wim Van den Wyngaert
Honored Contributor

Re: Disk IO without doing IO's

All findings are coming from VPA.

First SDA snapshot at 18:00.

VPA "display top", "Top Volumes by", "Image name" says the problem started at 18:00 sharp. The disks the process IPCDAEMON attacked were dsa1 (label disk1, rate 100 IO/s) and dsa3 (label disk3, rate a little lower than dsa1). Thruput of the image to disk peaked at 7.5 MB/sec.

The snapshot may be taken just before the problem started.

Wim
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO without doing IO's

Second snapshot at 18:05. Problem was still present.

Wim
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO without doing IO's

Richard,

I'm only an SDA instruction repeater. Could you give me the statements for obtaining the data you need ?

Wim
(btw your morning is our evening/night)
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO without doing IO's

I did a show proc/con.
Not a single field changes, except that the current priority sometimes toggles from 9 to 8 and directly back to 9, at exactly 1-minute intervals. The disk IO rate during my 5-minute test was 15 IO's/sec.

Wim
Uwe Zessin
Honored Contributor

Re: Disk IO without doing IO's

Hm, does 'SHOW PROCESS/CONTINUOUS' send any SKASTs to the target process to collect some data? It sounds like the target is receiving a priority boost from somewhere.

From Heisenberg with Love ;-)
.
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO without doing IO's

The disk IO's are also visible in mon disks.

Cass : no high water marking, no BAD BLOCK process. This is a cluster with interbuilding shadowing.

Wim
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO without doing IO's

Did a show proc using lexicals, for every item available via lexicals.

CPUTIM goes up by 1 ms every few seconds. No other data changed. Not sleeping after all.

Wim
Uwe Zessin
Honored Contributor

Re: Disk IO without doing IO's

How can that be? The granularity of JPI$_CPUTIM is 10ms.

http://h71000.www7.hp.com/doc/73final/4527/4527pro_045.html#index_x_573
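
To turn two CPUTIM samples into elapsed CPU time, multiply the difference by the 10 ms unit. A minimal sketch in Python, with a hypothetical function name and made-up sample values:

```python
CPUTIM_TICK_MS = 10  # CPUTIM is reported in 10 ms units


def cpu_ms_between(sample_before: int, sample_after: int) -> int:
    """Return the CPU time in milliseconds consumed between two CPUTIM samples."""
    return (sample_after - sample_before) * CPUTIM_TICK_MS


# Made-up samples: the counter moved by one unit between two readings.
print(cpu_ms_between(1200, 1201))  # 10
```

An increase of one count therefore means 10 ms of CPU, and anything smaller cannot be observed through this item.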
.
Wim Van den Wyngaert
Honored Contributor

Re: Disk IO without doing IO's

Sorry Uwe, you're right. It was per 10 ms.
But I have now run the test over a longer interval.

CPU is sometimes frozen for some minutes, sometimes increases every few seconds.

PRI was reported as 4, 7, 8 and 9.

Did VPA sampling with interval 1 ms for the process. No samples were found when trying to report it.

Wim
Uwe Zessin
Honored Contributor

Re: Disk IO without doing IO's

I would not try to make sense out of the CPU time data. It is possible to construct cases where a process is doing work, but does not get any CPU time charged - if I recall correctly, that only happens at quantum end processing. If your process gives up the CPU 'too early' you won't see an increase.
.