07-09-2007 02:01 PM
OPCOM cannot be stopped - KILL needed?
IA64 cluster of 2 x RX2620s, running VMS V8.3. I haven't found out why yet, but OPCOM is running in a tight CPU loop. I cannot stop it with STOP/ID or STOP/ID/EXIT=, or even kill it using a bit of Macro that does a $FORCEX. There are no reads outstanding or IOs being clocked; the process is not reading its mailbox (so I've had to write a DCL routine to clear it out, otherwise other processes trying to communicate with OPCOM get a mailbox full error).
I HAVE managed to set the priority down to 0!!
Anyone know how I can kill this process? [Short of running OPCCRASH - I have a steel works attached to this cluster so really don't want to shutdown if I can help it, and next scheduled downtime is a week or so away!!]
Thanks,
PJ
Peejay
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If it can't be done with a VT220, who needs it?
07-09-2007 04:18 PM
Re: OPCOM cannot be stopped - KILL needed?
Can you see what it's doing? Or even what it thinks it's doing? If STOP/ID doesn't help, the process is most likely in an inner mode, or at AST level (which is blocking the $FORCEX AST).
Does SET PROCESS/SUSPEND help? Otherwise, take some CPU samples and examine the instruction streams (though that's not exactly easy on an Integrity). If you're really desperate, you may be able to find something in memory you can change to break out of the CPU loop; otherwise it's reboot time!
On the other hand, if you can SUSPEND the process, or can tolerate it running at priority 0, you may be able to start up another OPCOM process to service the mailbox (that will probably take a manual RUN command to change the process name, and it depends on what, if any, exclusive resources OPCOM is holding).
07-09-2007 04:54 PM
Re: OPCOM cannot be stopped - KILL needed?
No, cannot suspend process, and if I try to create another OPCOM manually it stack dumps with a 'device allocated to another user' error.
Ho hum.
Peejay
07-09-2007 04:56 PM
Re: OPCOM cannot be stopped - KILL needed?
Consider escalating this problem to HP. As far as I remember, there may still be a problem that causes an OPCOM loop, and OpenVMS engineering was working on it the last time I heard.
You can easily obtain PC samples with the PCS$SDA extension:
$ ANAL/SYS
SDA> PCS ! for help
SDA> PCS LOAD
SDA> PCS START TRACE/PID=
...
SDA> PCS STOP TRACE
SDA> PCS SHOW TRACE
SDA> PCS UNLOAD
If you can't stop OPCOM, the loop must be in the image/process rundown code in the operating system itself and may therefore possibly also affect other processes ...
Are you up to the current patch level?
Volker.
07-09-2007 05:34 PM
Re: OPCOM cannot be stopped - KILL needed?
As far as I am aware I am up to date, but will check...
Attached is PCS log, will attempt to log a call tomorrow (it's been too long a day to struggle with logging a support call now...).
Cheers,
PJ
Peejay
07-09-2007 05:48 PM
Re: OPCOM cannot be stopped - KILL needed?
It's looping in LIBRTL!
This is the instruction reported most of the time in your PCS trace:
{ .mib
LIBRTL+001C8740:
cmp4.lt p6, p0 = r8, r0
mov r1 = r51
(p6) br.cond.dptk.few 1FFFFE0 ;;
}
So it 'looks' like a branch!
SDA> SET PROC OPCOM
SDA> SHOW CALL/SUMM
would report the call stack.
As far as I remember, this matches the symptoms engineering is/was working on...
Volker.
07-10-2007 03:25 AM
Re: OPCOM cannot be stopped - KILL needed?
Did you try STOP/ID=.../EXIT=mode?
Start with USER, then SUPER, then EXEC, then KERNEL.
Volker.
07-10-2007 04:40 AM
Re: OPCOM cannot be stopped - KILL needed?
There are kernel-mode tools around which allow clearing the NODELET flag, after which the process can be nuked.
nb: I'm not where I can check an existing OpenVMS OPCOM process PCB right now, to see if this PCB$V_NODELET flag is set for this process.
If the bit _is_ set, here's an example Really Big Hammer for this task:
http://mvb.saic.com/freeware/vmslt00b/vu/stop-i-mean-it-src.txt
This is kernel-mode code and it writes to the process PCB, with all the risks inherent.
Personally, I'd tend to let this process mimic the null process for a week or two, assuming this is a production server and it can be held together, pending a reboot or input from HP. If you need to use the RBH approach, I'd first test it on an OpenVMS I64 box off to the side.
Stephen Hoffman
HoffmanLabs LLC
07-10-2007 04:44 AM
Re: OPCOM cannot be stopped - KILL needed?
Process index: 0011 Name: OPCOM Extended PID: 22000411
--------------------------------------------------------------------
Process status: 00140001 RES,PHDRES,LOGIN
status2: 00000111 QUANTUM_RESCHED,TCB
Volker.
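For anyone reading the status longword above by hand, here is a rough Python sketch (not from the thread, just an illustration) that lists which bits are set in 00140001. The mapping of bit positions to the names RES, PHDRES and LOGIN is an assumption: it is inferred only from the SDA line above, on the guess that SDA prints the names in ascending bit order. The helper names `set_bits` and `decode` are made up for this example.

```python
# Assumed mapping, inferred only from the SDA line "00140001 RES,PHDRES,LOGIN"
# (three bits set, three names listed, taken in ascending bit order).
ASSUMED_FLAGS = {
    0: "RES",
    18: "PHDRES",
    20: "LOGIN",
}


def set_bits(mask):
    """Return the positions of all set bits in a 32-bit mask, lowest first."""
    return [bit for bit in range(32) if mask & (1 << bit)]


def decode(mask, names=ASSUMED_FLAGS):
    """Map each set bit to a name, or to 'bit N' when no name is known."""
    return [names.get(bit, f"bit {bit}") for bit in set_bits(mask)]


status = 0x00140001
print(set_bits(status))  # positions of the bits that are set
print(decode(status))    # names under the assumed mapping
```

The status longword has exactly three bits set and SDA lists exactly three names, none of which is NODELET, which suggests that flag is clear for this OPCOM process.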
07-10-2007 06:00 AM
Re: OPCOM cannot be stopped - KILL needed?
Someone in here had some code to set quotas. One could kick the quotas down and hope the process goes into RWAST, but if it's really in a tight loop that might not work.
Dean