Operating System - OpenVMS

Re: LOGOUT puts 2 processes in infinite loops

 
Jess Goodman
Esteemed Contributor

LOGOUT puts 2 processes in infinite loops

On Friday I logged off all my telnet sessions to VMS and turned off my computer. Today I noticed that two of my processes, on two different Alphas (V7.3-2 and V6.2-1H3), are in kernel-mode infinite loops with the DELPEN process bit set.

- CPU constantly accumulating in kernel mode
- No I/Os going on
- Can't suspend them or change base priority
- no images active
- no locks taken out
- no timers
- only channel 0010 shows (system disk no file)
- CLUE PROCESS/RECALL says last command is LOG
- TNA terminal devices do not exist
- no sockets associated with the processes

Attached is output from PCS trace.

Any possibilities other than rebooting both systems?
I have one, but it's personal.
Jess Goodman
Esteemed Contributor

Re: LOGOUT puts 2 processes in infinite loops

Sorry, here's the attachment
Hoff
Honored Contributor

Re: LOGOUT puts 2 processes in infinite loops

The attachment didn't unpack quite right locally, though I'm not working on an OpenVMS box right now and it may well be a VFC-ism.

Take a look at the process quotas. (You may have to FORMAT the PCB and JIB structures directly, if SDA won't let you at the pieces through its normal commands.) Look specifically at the I/O counts in the PCBs and over in the JIB. See if an I/O is missing -- a lost I/O somewhere, or some lost quota.

If this is a lost quota count, you can choose to bump the quotas to fake the rundown into completion (patching from the console, or via some kernel-mode hackery), or you can use this opportunity to apply the ECOs and then reboot.

Where the I/O went -- if this is a lost I/O -- is a whole 'nother discussion.

As for ECOs to apply: the usual all-mandatory and current UPDATE kits, the current IP kit, and updates for any kernel-mode code you may be using on these two systems. For grins, check for ECOs to the disk drivers and any device firmware.


John Gillings
Honored Contributor

Re: LOGOUT puts 2 processes in infinite loops

Jess,

Another instance of a system that needs a reboot to recover? :-(

You may need to further symbolize the PC samples, in particular the TCPIP$INTERNET_SERVICES calls.

Guessing, the trace looks like you're stuck walking the open channels list, trying to deallocate devices. For some reason one or more devices haven't been removed. Check the sources for IOC$SCAN_IODB and see what structure it scans. Examine it from SDA to try to determine why there's junk left on it.

If (when?) you reboot, make sure you force a crash and submit the dump to HP for analysis.
A crucible of informative mistakes
Jess Goodman
Esteemed Contributor

Re: LOGOUT puts 2 processes in infinite loops

Thanks for the responses so far. Here is SDA>SHOW PROCESS from the VMS 7.3-2 (fully patched) system:

Direct I/O count/limit 189/199
Buffered I/O count/limit 209/199
BUFIO byte count/limit 512000/512000
ASTs remaining 199/199
Timer entries remaining 199/199

Process index: 03F0 Name: _TNA213: Extended PID: 41A7BFF0

Process active channels
Channel CCB Window Status Device accessed
0010 7FF60000 00000000 AX38$DKA0:
Total number of open channels : 1.

So, subtracting count from limit, VMS believes that there are 10 active DIOs and -10 active BIOs. The Availability Manager process quota display is DIO: 10/199, BIO: -10/199.
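The subtraction above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name `in_use` is made up for illustration, and the numbers are the count/limit pairs from the SDA output for the V7.3-2 system.

```python
def in_use(remaining: int, limit: int) -> int:
    """I/Os VMS believes are outstanding: the limit minus the remaining count."""
    return limit - remaining

# Count/limit pairs from the SDA SHOW PROCESS output (V7.3-2 system).
dio_in_use = in_use(189, 199)   # Direct I/O count/limit 189/199
bio_in_use = in_use(209, 199)   # Buffered I/O count/limit 209/199

print(f"DIO: {dio_in_use}/199, BIO: {bio_in_use}/199")
```

A negative result, as for the BIO figure here, means the remaining count has risen above the limit, which is consistent with quota being returned more times than it was charged.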

But there are no active I/O channels. No job quotas are in use other than a small amount of page file.

On the V6.2-1H3 system it is much the same except only 1 DIO in use and only(!) -1 BIO in use.

Is there some kernel mode code hack I can use to zero the active I/O counters?
Dean McGorrill
Valued Contributor

Re: LOGOUT puts 2 processes in infinite loops

Hi Jess,
At a guess, it looks like it's trying to tear down the connection. Interestingly, we see a call to LAN$COMPLETE_XMT_CSMACD_C, which might imply it doesn't think the connection is closed. May I ask what you used to connect to the VMS boxes? Dean
Jess Goodman
Esteemed Contributor

Re: LOGOUT puts 2 processes in infinite loops

I use PowerTerm to connect via TELNET from my Windows 2000 Pro system.
Dean McGorrill
Valued Contributor

Re: LOGOUT puts 2 processes in infinite loops

OK, well, you issued a logout and didn't just abort PowerTerm. I use PowerTerm daily, no problems. From your trace, I see the process attempting to deallocate a device, acquiring and releasing spinlocks, probably really just the IOLOCK8 spinlock. I'm curious how much SMP time shows up in a MONITOR MODES display. At the very least, gather as much info as you can and, as suggested, get a crash dump of the system for HP to look at. Those PC snapshots should help them see where the code is traversing. Dean
Hoff
Honored Contributor

Re: LOGOUT puts 2 processes in infinite loops

>>>But there are no active I/O channels. No job quotas are in use other than a small amount of page file.<<<

And therein lies the problem.

Assuming you have current TCP/IP Services ECOs (and assuming the IP stack is the HP stack) and you have a support contract, force a crash and send it along to HP.

Something definitely looks to be leaking.

>>>Is there some kernel mode code hack I can use to zero the active I/O counters?<<<

I usually roll a specific case, and a little code behind a $cmkrnl call. There's a variation of what you need to do here posted over in a process deletion thread; I tossed a pointer to some kernel-mode code that clears the NODELET bit in the PCB somewhere here in ITRC within the last couple of days. Try a google search with the site:forums1.itrc.hp.com keyword. Look for that, and NODELET and RBH, or such.

I also prefer to test such code somewhere else-node, before setting it loose on a production server. If you're fast and cluster timers are set tolerant and you're on a "continue-able" system, you can halt and bomb core and continue from the console prompt.

Patching the value doesn't mean this problem goes away; if that I/O is eventually "found" or if extra and otherwise lost I/O is built up somewhere, a Bad Thing could happen to the process(es) involved, or to OpenVMS itself.
Jess Goodman
Esteemed Contributor

Re: LOGOUT puts 2 processes in infinite loops

Ok, thanks again for the replies. Since I couldn't find any code to help I wrote my own kernel mode program that clears the DELPEN bit of the PCB$L_STS longword of a given process. I have attached it here in case anyone else might find it useful.
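For readers without the attachment, the core of such a fix is a single bit clear on the status longword. The Python below is only a user-mode illustration of that one step, not the attached program: the mask value and the sample status value are placeholders (assumptions), the real mask comes from the $PCBDEF definitions, and the real operation has to run in kernel mode against the live PCB.

```python
# ASSUMPTION: placeholder bit position for illustration only;
# take the real DELPEN mask from $PCBDEF on the target system.
DELPEN_MASK = 1 << 1

def clear_bits(sts: int, mask: int) -> int:
    """Return the 32-bit status longword with the bits in mask cleared."""
    return sts & ~mask & 0xFFFFFFFF

sts = 0x00040003                 # hypothetical status longword value
new_sts = clear_bits(sts, DELPEN_MASK)
print(f"{sts:08X} -> {new_sts:08X}")
```

Clearing the bit is idempotent, so running it a second time against the same longword leaves the value unchanged.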

After running this program on my systems I could then use SHOW PROCESS commands for the problem processes and eventually I did a SET PROCESS/SUSPEND=KERNEL, which worked; so the processes are no longer burning CPU cycles.

My program should work on VAX/ALPHA VMS 6.2 and above and I would guess only a minor change is necessary for Integrity.