Operating System - OpenVMS

Re: How to remove/stop RWAST process

 
Not applicable

How to remove/stop RWAST process

Hi,

One of my processes is stuck in the RWAST state and I am unable to remove it. Could anyone tell me how to remove or stop this process?

Below is some information about the process:

Output
1)

$ show sys/process=_TNA97:
OpenVMS V7.3-2  on node MEKON8   1-MAR-2008 19:48:39.21  Uptime  172 21:53:45
  Pid    Process Name    State  Pri      I/O       CPU       Page flts  Pages
00009058 _TNA97:         RWAST    6     2660   0 00:00:03.57      2927    232

2)

SDA> show process _TNA97:
Process index: 0058 Name: _TNA97: Extended PID: 00009058
--------------------------------------------------------------------
Process status: 22040003 RES,DELPEN,PHDRES,INTER,ERDACT
status2: 00000001 QUANTUM_RESCHED

PCB address 814B3240 JIB address 81454AC0
PHD address 82AAC000 Swapfile disk address 00000000
KTB vector address 814B352C HWPCB address FFFFFFFF.82AAC080
Callback vector address 00000000 Termination mailbox 0000
Master internal PID 00480058 Subprocess count 0
Creator extended PID 00000000 Creator internal PID 00000000
Previous CPU Id 00000001 Current CPU Id 00000001
Previous ASNSEQ 00000000000052BD Previous ASN 000000000000001A
Initial process priority 4 # open files remaining 996/1000
Delete pending count 1 Direct I/O count/limit 999/1000
UIC [00001,000004] Buffered I/O count/limit 999/1000
Abs time of last event 590C4BEC BUFIO byte count/limit 198400/199232
# of threads 1 ASTs remaining 2995/3000
Swapped copy of LEFC0 00000000 Timer entries remaining 2000/2000
Swapped copy of LEFC1 00000000 Active page table count 0
Global cluster 2 pointer 00000000 Process WS page count 182
Global cluster 3 pointer 00000000 Global WS page count 50
PCB Specific Spinlock 814B3680 Subprocesses in job 0

Process index: 0058 Name: _TNA97: Extended PID: 00009058
--------------------------------------------------------------------

Thread index: 0000
------------------
Current capabilities: System: 0000000C QUORUM,RUN
User: 00000000
Permanent capabilities: System: 0000000C QUORUM,RUN
User: 00000000
Current affinities: 00000000
Permanent affinities: 00000000
Thread status: 22040003
status2: 00000001

KTB address 814B3240 HWPCB address FFFFFFFF.82AAC080
PKTA address 7FFEFF98 Callback vector address 00000000
Internal PID 00480058 Callback error 00000000
Extended PID 00009058 Current CPU id 00000001
State RWAST Flags 00000000
Base priority 4 Current priority 6
Waiting EF cluster 0 Event flag wait mask 00000001
CPU since last quantum 0E42 Mutex count 0
ASTs active NONE

3)

SDA> show process _TNA97: /chann

Process index: 0058 Name: _TNA97: Extended PID: 00009058
--------------------------------------------------------------------


Process active channels
-----------------------

Channel CCB Window Status Device/file accessed
------- --- ------ ------ --------------------
0010 7FF7E000 00000000 Busy MEKON8$DKA300:
0020 7FF7E020 81437B40 MEKON8$DKA0:[VMS$COMMON.SYSEXE]BACKUP.EXE;1
0030 7FF7E040 814BBB40 MEKON8$DKA0:[VMS$COMMON.SYSLIB]LIBOTS.EXE;1 (section file)
0040 7FF7E060 00000000 TNA97:
0050 7FF7E080 814BBAC0 MEKON8$DKA0:[VMS$COMMON.SYSLIB]LIBRTL.EXE;1 (section file)
0060 7FF7E0A0 00000000 TNA97:
0070 7FF7E0C0 814BEC00 MEKON8$DKA0:[VMS$COMMON.SYSLIB]DECC$SHR.EXE;1 (section file)
0080 7FF7E0E0 814BE700 MEKON8$DKA0:[VMS$COMMON.SYSLIB]DPML$SHR.EXE;1 (section file)
0090 7FF7E100 814C3E80 MEKON8$DKA0:[VMS$COMMON.SYSEXE]DCL.EXE;1 (section file)
00A0 7FF7E120 814BBA40 MEKON8$DKA0:[VMS$COMMON.SYSLIB]DCLTABLES.EXE;195 (section file)
00B0 7FF7E140 814BD480 MEKON8$DKA0:[VMS$COMMON.SYSLIB]CMA$TIS_SHR.EXE;1 (section file)
00C0 7FF7E160 81448DC0 MEKON8$DKA0:[VMS$COMMON.SYSLIB]BACKUPSHR.EXE;1
00D0 7FF7E180 814BAE40 MEKON8$DKA0:[VMS$COMMON.SYSLIB]MOUNTSHR.EXE;1 (section file)
00E0 7FF7E1A0 814B9D80 MEKON8$DKA0:[VMS$COMMON.SYSLIB]DISMNTSHR.EXE;1 (section file)
00F0 7FF7E1C0 814BA8C0 MEKON8$DKA0:[VMS$COMMON.SYSLIB]MMESHR.EXE;1 (section file)
0100 7FF7E1E0 00000000 TNA97:
0110 7FF7E200 814C6780 MEKON8$DKA0:[VMS$COMMON.SYSMSG]SHRIMGMSG.EXE;1 (section file)
0120 7FF7E220 814C5DC0 MEKON8$DKA0:[VMS$COMMON.SYSMSG]DECC$MSG.EXE;1 (section file)
0130 7FF7E240 81448E80 MEKON8$DKA0:[VMS$COMMON.SYSMSG]SYSMGTMSG.EXE;1
0140 7FF7E260 00000000 TNA97:
0150 7FF7E280 00000000 MEKON8$DKA300:
0160 7FF7E2A0 00000000 Busy MEKON8$DKA300:

Total number of open channels : 22.

Please let me know if any other information is required.

Note: I connected to the server over TELNET, and _TNA97: is the process for that session. From it I issued a BACKUP command to back up some files on the DKA300 disk.

Regards,
ajaydec
Volker Halle
Honored Contributor

Re: How to remove/stop RWAST process

ajaydec,

there is an I/O outstanding on DKA300:. You have already tried STOP/ID, and the exec-mode rundown is active (ERDACT). The process cannot be deleted while it has any outstanding I/Os. What is the state of this disk?

$ SHOW DEV DKA300:
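
The status longword in the SDA display posted earlier can also be unpacked by hand. The following is a minimal Python sketch, not an official definition: the bit positions are an assumption inferred from this thread alone (22040003 has exactly five bits set, and SDA prints five names, taken here to be in ascending bit order).

```python
# Bit positions are an ASSUMPTION inferred from the SDA display in this thread:
# 22040003 has exactly five bits set and SDA lists five flag names.
ASSUMED_FLAGS = {
    0: "RES",      # resident
    1: "DELPEN",   # delete pending (a STOP/ID has been issued)
    18: "PHDRES",  # process header resident
    25: "INTER",   # interactive process
    29: "ERDACT",  # exec-mode rundown active
}

def decode_status(status: int) -> list[str]:
    """Return a name for every set bit, lowest bit first."""
    names = []
    for bit in range(32):
        if status & (1 << bit):
            # Bits outside the assumed table are reported by number.
            names.append(ASSUMED_FLAGS.get(bit, f"bit{bit}"))
    return names

print(",".join(decode_status(0x22040003)))
```

DELPEN together with ERDACT is the combination described above: the delete request has been accepted, but rundown cannot finish while I/O is still pending.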

If the disk is in mount-verification, you may be able to abort it with $ DISM/ABORT DKA300:

Or you can try an IPC interrupt to cancel mount-verification:

>>> D SIRR C
>>> C
IPC> C DKA300:
IPC> [ctrl/z]

For more details about the IPC interrupt, see the System Manager's Manual.

If all else fails, you will need to reboot.

Volker.
Not applicable

Re: How to remove/stop RWAST process

Hi,

Below is the output of SHOW DEV DKA300 /FULL:


$ show dev dka300 /full

Disk MEKON8$DKA300:, device type RZ28, is online, mounted, file-oriented device,
shareable, available to cluster, error logging is enabled.

Error count 3 Operations completed 783752
Owner process "" Owner UIC [SYSTEM]
Owner process ID 00000000 Dev Prot S:RWPL,O:RWPL,G:R,W
Reference count 5 Default buffer size 512
Total blocks 4110480 Sectors per track 86
Total cylinders 2988 Tracks per cylinder 16
Logical Volume Size 4110336 Expansion Size Limit 4112384

Volume label "V41_CMS" Relative volume number 0
Cluster size 4 Transaction count 5
Free blocks 889208 Maximum files allowed 411033
Extend quantity 5 Mount count 1
Mount status System Cache name "_MEKON8$DKA0:XQPCACHE"
Extent cache size 64 Maximum blocks in extent cache 88920
File ID cache size 64 Blocks in extent cache 83844
Quota cache size 0 Maximum buffers in FCP cache 1824
Volume owner UIC [SYSTEM] Vol Prot S:RWCD,O:RWCD,G:RWCD,W:RWCD

Volume Status: ODS-2, subject to mount verification, file high-water marking,
write-back caching enabled.


Has the disk become corrupted?

Regards,
ajaydec
Not applicable

Re: How to remove/stop RWAST process

Now I am not even able to access the disk.

I ran the following two commands in a new terminal session, and now that session has hung as well:

$ set def dka300:[000000]
$ show def
DKA300:[000000]

It is not returning to the prompt.

Can anyone help me get out of this trouble?
Volker Halle
Honored Contributor

Re: How to remove/stop RWAST process

ajaydec,

what does a simple $ SHOW DEV DKA300: return?

There are 3 errors on this device. Try ANAL/ERR/ELV TRANSLATE/SINCE= to see whether any errors were reported on that disk while BACKUP was running. You may need DECevent V3.4 ($ DIAGNOSE command) to decode those errlog entries.

Volker.

Volker Halle
Honored Contributor

Re: How to remove/stop RWAST process

ajaydec,

that's why I asked for a SHOW DEV DKA300: first...

This may also be a problem with an XQP operation pending:

$ ANAL/SYS
SDA> CLUE XQP/ACTIVE
SDA> SET PROC/ID=58
SDA> SHOW PROC/LOCK

Is the first lock shown a WAITING lock?

Volker.
Not applicable

Re: How to remove/stop RWAST process

Hi Volker,

I have attached the output of the following commands:
$ show dev dka300:
SDA> CLUE XQP/ACTIVE
SDA> SET PROC/ID=58
SDA> SHOW PROC/LOCK /brief

Will dismounting the disk and mounting it again solve the problem?

Regards,
ajaydec
Volker Halle
Honored Contributor

Re: How to remove/stop RWAST process

ajaydec,

the ELV output is useless; it does not even give the device name. You would need to translate errlog.sys with either DECevent or SEA.

You cannot dismount a disk if there are files open on it, and you cannot close those files while I/O operations are pending on them. Some XQP I/O from BACKUP seems to be stuck inside the SCSI driver.
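
The channel listing posted earlier shows which channels still have I/O in flight. As an illustration only, here is a small Python sketch that picks out the Busy rows from a few lines copied from that listing. The helper name `busy_channels` and the assumed column layout (channel, CCB, window, optional status, device) are illustrative and not part of SDA.

```python
# Sample rows copied from the SHOW PROCESS /CHANNEL output in this thread.
SAMPLE = """\
0010 7FF7E000 00000000 Busy MEKON8$DKA300:
0020 7FF7E020 81437B40 MEKON8$DKA0:[VMS$COMMON.SYSEXE]BACKUP.EXE;1
0040 7FF7E060 00000000 TNA97:
0150 7FF7E280 00000000 MEKON8$DKA300:
0160 7FF7E2A0 00000000 Busy MEKON8$DKA300:
"""

def busy_channels(text: str) -> list[tuple[str, str]]:
    """Return (channel, device) pairs for rows whose status column is Busy."""
    result = []
    for line in text.splitlines():
        fields = line.split()
        # ASSUMED row layout: channel, CCB, window, optional status, device/file.
        if len(fields) >= 5 and fields[3] == "Busy":
            result.append((fields[0], fields[4]))
    return result

for channel, device in busy_channels(SAMPLE):
    print(channel, device)
```

Both Busy rows point at MEKON8$DKA300:, which is why the disk cannot be dismounted from under this process.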

Seriously consider rebooting the system to recover. If you want to analyze this problem further later on, force a crash (press the HALT button, then >>> CRASH).

Volker.
Hoff
Honored Contributor

Re: How to remove/stop RWAST process

As part of forcing a crash (while writing the dump file) and subsequently rebooting this box, take the opportunity to inventory and then install the most current mandatory patch kits for OpenVMS Alpha V7.3-2, the non-mandatory kits for the features you use (e.g. networking), the firmware for whatever storage controller(s) are in use here, and the patch kit for whatever version of IP is on this system.

Lost I/O and the resulting wedged processes tend to point to bad hardware, bad device firmware, or to any of various potential kernel-code bugs.

I'd also review the MEKON8$DKA SCSI chain.
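
The "lost I/O" picture can be cross-checked against the quota pairs in the SDA display at the top of the thread. This is a hedged Python sketch: it assumes the first number of each count/limit pair is the quota still available, so that limit minus count is what the process currently has charged against it. The function name `in_use` is illustrative.

```python
# count/limit pairs copied from the SDA SHOW PROCESS display in this thread.
# ASSUMPTION: the first number is the quota still available to the process.
QUOTAS = {
    "Direct I/O": (999, 1000),
    "Buffered I/O": (999, 1000),
    "BUFIO bytes": (198400, 199232),
    "ASTs": (2995, 3000),
}

def in_use(quotas: dict[str, tuple[int, int]]) -> dict[str, int]:
    """Return limit minus remaining count for each quota."""
    return {name: limit - remaining for name, (remaining, limit) in quotas.items()}

for name, used in in_use(QUOTAS).items():
    print(f"{name}: {used} in use")
```

Under that assumption, one direct I/O and one buffered I/O are still charged to the process, which fits the earlier observation that an I/O on DKA300: never completed.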