03-05-2007 04:19 AM
MNOWAIT link - undefined symbol
I've downloaded:
http://vmsone.com/~decuslib/vmssig/vms94a/dsj/mnowat.dsj
But can't resolve nor find info on how to
resolve EXE$NAMPID:
$ show system/noproc
OpenVMS V7.3-2 on node AANODE 5-MAR-2007 12:06:40.43 Uptime 23 22:00:47
$ macro mnowait
$ link/sysexe mnowait
%LINK-W-NUDFSYMS, 1 undefined symbol:
%LINK-I-UDFSYM, EXE$NAMPID
%LINK-W-USEUNDEF, undefined symbol EXE$NAMPID referenced
in psect $LINKAGE offset %X000000C0
in module .MAIN. file DVLP$DISK:[AANODE.YOUNGR.WORK]MNOWAIT.OBJ;3
What does it take to link this correctly?
What problem am I trying to solve? A process
stuck in a tight CPU loop that thinks it has
a pending buffered I/O to run down:
Process index: 02EC Name: AUSER Extended PID: 204052EC
--------------------------------------------------------------------
Process status: 02040003 RES,DELPEN,PHDRES,INTER
status2: 00000001 QUANTUM_RESCHED
PCB address 82C8F4C0 JIB address 82ECF0C0
PHD address 87420000 Swapfile disk address 00000000
KTB vector address 82C8F7AC HWPCB address FFFFFFFF.87420080
Callback vector address 00000000 Termination mailbox 003B
Master internal PID 000A02EC Subprocess count 0
Creator extended PID 00000000 Creator internal PID 00000000
Previous CPU Id 00000000 Current CPU Id 00000001
Previous ASNSEQ 000000000008A63C Previous ASN 000000000000004F
Initial process priority 4 # open files remaining 150/150
Delete pending count 0 Direct I/O count/limit 150/150
UIC [02040,000024] Buffered I/O count/limit 149/150
Abs time of last event 01A32F07 BUFIO byte count/limit 100000/100000
# of threads 1 ASTs remaining 250/250
Swapped copy of LEFC0 00000000 Timer entries remaining 10/10
Swapped copy of LEFC1 00000000 Active page table count 0
Global cluster 2 pointer 00000000 Process WS page count 149
I'm trying to avoid a reboot as this is mostly
an annoyance (user is pegging whatever CPU
it is running on). The process on show users/full shows the NTY as disconnected
(perhaps that is the BIOLM mismatch? I've
not seen Multinet not rundown a process
correctly, by the way.)
Reading here:
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1025937
has me going this route. I'm not fond of
going the DELTA route.
Thanks,
Rob
03-05-2007 04:31 AM
Re: MNOWAIT link - undefined symbol
> What does it take to link this correctly?
Link it on a VAX ;-)
This program seems to be written for a VAX. The EXE$NAMPID internal entry point does not exist on OpenVMS Alpha. You would not want to trust such a program, as it will probably crash the system anyway.
The looping process should have one outstanding IO (busy channel): SDA> SHOW PROC/CHAN
In the meantime, set the priority of the looping process to 0; this will minimize the effect on the rest of the system.
Volker.
03-05-2007 04:34 AM
Re: MNOWAIT link - undefined symbol
> Link it on a VAX ;-)
Wow - missed the obvious. One thing I thought
was "2006, must be AlphaVMS discussion."
Who'da thought it?
Yes - one channel open:
Process index: 02EC Name: NEUBAUERLA Extended PID: 204052EC
--------------------------------------------------------------------
Process active channels
-----------------------
Channel CCB Window Status Device/file accessed
------- --- ------ ------ --------------------
0010 7FF6E000 00000000 DSA1300:
Total number of open channels : 1.
So.. short of a reboot, what does it take
to run this process down?
Rob
03-05-2007 04:40 AM
Re: MNOWAIT link - undefined symbol
The replacements for the various routines are in the family:
EXE$NAM_TO_PCB
EXE$CVT*
At least some of these are in the device driver book, and the rest are in the source listings.
I'd look for other changes needed in the MNOWAIT code, too.
DECamds can clear various wedged processes now.
And what's driving the NTY loop? I'd look at that in some detail -- that could be a lost I/O and not a mutex, for instance.
03-05-2007 04:44 AM
Re: MNOWAIT link - undefined symbol
If everything otherwise looks OK with the DSA1300: shadowset, this must be a lost I/O.
Probably time for a reboot...
Volker.
03-05-2007 04:46 AM
Re: MNOWAIT link - undefined symbol
I found EXE$NAM_TO_PCB here:
SYS$LDR> dir/date/size=all pro*.stb
Directory SYS$COMMON:[SYS$LDR]
PROCESS_MANAGEMENT.STB;1
86/105 8-FEB-2006 13:49:23.10
PROCESS_MANAGEMENT_MON.STB;1
93/105 8-FEB-2006 13:49:29.11
I certainly didn't want to muck with the .mar
code, nor am I capable of it, so I kept hunting
for the elusive EXE$NAMPID.
Rob
03-05-2007 04:49 AM
Re: MNOWAIT link - undefined symbol
> if everything looks otherwise o.k. with the DSA1300: shadowset, this must be a lost IO.
> Probably time for a reboot...
I was afraid of that. I hate that. Now
it is a change control, etc.
I read the other discussions... it sure would
be nice if there was a way to BLAST away
processes like this without corrupting the
IO database (whatever)
or bringing the system to a grinding halt.
Rob
03-05-2007 05:12 AM
Re: MNOWAIT link - undefined symbol
This process is looping because of some previous SEVERE internal error (most likely either in the shadowing code or the I/O subsystem).
OpenVMS thus allows you to get notified of this error in a 'friendly' manner, which does not immediately take down the whole system, but just loops this single process. So when you have a chance and a support contract, you could force a system crash and escalate the problem to OpenVMS engineering for analysis.
Would you have preferred OpenVMS to just ignore the error and continue? And then crash sometime later? Or develop some other nasty and unpredictable behaviour due to this problem?
No program can securely 'kill' such a process and 'un-do' the previous malfunction. It might be possible to manually get rid of this process after thorough analysis and risk assessment, but a reboot is cheaper...
Volker.
03-05-2007 05:35 AM
Re: MNOWAIT link - undefined symbol
> this process is looping, because of some
> previous SEVERE internal error (most likely
> either in the shadowing code or the IO
> sub-system).
Hmmm... I have these relevant patches
in place:
DEC AXPVMS VMS732_FIBRE_SCSI V9.0 Patch Install 17-SEP-2006 02:45:23
DEC AXPVMS VMS732_UPDATE V7.0 Patch Install 17-SEP-2006 02:44:58
Perhaps something has been fixed since.
But I'm not so sure how severely broken
that level of UPDATE/FIBRE_SCSI could be
(after all, 7 and 9 go-rounds of patches
there).
> OpenVMS thus allows you to get notified of this error in a 'friendly' manner, which does not immediately take down the whole system, but just loops this single process. So when you have a chance and a support contract, you could force a system crash and escalate the problem to OpenVMS engineering for analysis.
I have both and will do prior to reboot.
Further, here's my take on the "severity"
of such an issue. I've run many processes
up and down; you tell me, if a typical PID
is 2021C985, that's 100000+ processes
that have run by (on a cluster). So I've
got one looping.
> Would you have preferred OpenVMS to just
> ignore the error and continue ? And then
> crash sometimes later ? Or develop some
> other nasty and unpredictable behaviour due
> to this problem ?
No. However, I guess I'm looking forward
to the day when each process runs inside
its own Virtual Machine so I can just down
the Virtual Machine. Don't know which OS,
don't really care at this point. As long
as I don't have to be awake at 2 a.m. for
a 7x24 mission critical application just
to clear a freakin' process.
> No program can securely 'kill' such a
> process and 'un-do' the previous
> malfunction. It might be possible to
> manually get rid of this process after
> thorough analysis and risk assessment, but a
> reboot is cheaper...
A reboot is cheaper?
For many of us perhaps, not for me and
other folks I know (7x24x365 and you
fight for a maintenance window. Yes,
fight).
Rob
03-05-2007 06:01 AM
Re: MNOWAIT link - undefined symbol
I had the fleeting impression we were in the same kind of circumstances (NEVER down), but
>>>
7x24x365
<<<
has me wondering.
Are you on a special 7-year cycle?
Or did you mean 7 * 365 (still leaves the leap-day every 4 years)
or 24 * 7 * 52 (wow! one day every year!)
We usually define our operation as 24 * 365.25 :-)
But seriously, in those circumstances, you REALLY need AM/AMDS, or get at ease with DELTA.
And, rolling reboots should be a piece of cake.
fwiw,
Proost.
Have one on me.
jpe
03-05-2007 06:21 AM
Re: MNOWAIT link - undefined symbol
> 7x24x365
I'm bellyachin. I can probably get
by with a Friday evening thing as I'll
bring one node down (the node with
the spinning process) and
maintain application up.
Fortunately, this node
has none of the interfaces running. I still
have the pain of change control and doing
the work, family interruptions, etc.
But in general, we've got to get past this
thing. I see the attraction of VMWare (et al)
to make up for the weakness of certain OSes.
We're stuck with VMS when the machine gets
wedged. What would be cool is if we could
do these VMs: I'd isolate a "machine" for
all the important interfaces, and users would
run in multiple VMs. If a process got
wedged/hung/spinning, I'd load balance
everyone off that machine, and reboot in
some far off future (maybe go from 8 to
7 machines in the process).
03-05-2007 08:27 AM
Re: MNOWAIT link - undefined symbol
In the interim, OpenVMS Galaxy might be of interest, depending on the particular box.
The upper field of a PID is the cluster system id. If you've got big values in the low bits, you have numbers of processes churning, or a whole lot of reuse of a PCB. A looping process doesn't chew up PID numbers.
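To make the field split concrete, here is a minimal Python sketch that pulls an extended PID apart. The layout is an assumption based on the usual internals descriptions: the low 21 bits hold the process index plus a sequence number, the next 8 bits a node index, and the next 2 bits a node sequence number. The 11-bit index width is also an assumption, chosen because it reproduces the SDA values posted earlier in the thread (process index 02EC, internal PID 000A02EC); on a real system the width depends on MAXPROCESSCNT. `split_epid` and its field names are hypothetical, not a VMS API.

```python
from typing import NamedTuple


class EpidFields(NamedTuple):
    node_seq: int    # assumed: 2-bit node sequence number (bits 29-30)
    node_index: int  # assumed: 8-bit node index (bits 21-28)
    proc_seq: int    # sequence number of the process slot (reuse counter)
    proc_index: int  # process index (slot number)


def split_epid(epid: int, index_bits: int = 11) -> EpidFields:
    """Split an extended PID into fields under the assumed layout."""
    local = epid & 0x1FFFFF        # low 21 bits: index + sequence
    node = (epid >> 21) & 0x3FF    # next 10 bits: node identification
    return EpidFields(
        node_seq=(node >> 8) & 0x3,
        node_index=node & 0xFF,
        proc_seq=local >> index_bits,
        proc_index=local & ((1 << index_bits) - 1),
    )


looping = split_epid(0x204052EC)
print(f"index={looping.proc_index:04X} seq={looping.proc_seq:04X}")
# prints "index=02EC seq=000A"
```

Under the same assumptions (and if that node uses the same index width), the PID 2021C985 quoted earlier decodes to a sequence number of 57 for its slot. That fits the point above: the low bits reflect how often a slot has been reused, not a running count of 100000+ processes.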
03-05-2007 06:12 PM
Re: MNOWAIT link - undefined symbol
All of this is software, and software has bugs, whether you like it or not. Some OSes may have more bugs than others. Some OSes may be harder to diagnose in case of problems than others. Some OSes will spread problems into other areas within the OS without telling you.
I believe OpenVMS is pretty good at detecting problems and preventing them from spreading. OpenVMS also provides extremely good tools and structures to allow you to diagnose a problem such as the one you are seeing. And it even contains tools - such as DELTA - to allow you to 'fix' such a problem with minimal risk and impact, if you - or someone else - have the required OpenVMS internals knowledge to analyze and diagnose the extent of the problem in the running system.
Setting the process priority to 0 limits the impact of this problem to your system and gives you time for diagnosis.
If there is an outstanding IO, one needs to find the IRP (IO Request Packet) somewhere in pool, decode the function bits and find out what might have happened. If this IO is not in any pending lists in the system, it may be possible to clean up the process-related IO data structures and allow this process to successfully run down. I have done this before and I've shown an example in a DECUS presentation some years ago. So it is possible after thorough analysis, but not by running some 'magic tool'.
Volker.
03-07-2007 04:32 AM
Re: MNOWAIT link - undefined symbol
---
If you are willing,
Let's do this exercise here.
If nothing else it helps others.
Where to start?
Rob
03-07-2007 04:49 AM
Re: MNOWAIT link - undefined symbol
There are certainly advantages to VMs, but VMs are not a panacea. Just as processes can get messed up due to bugs, a VM can get messed up. And if a VM gets messed up or itself needs some sort of ECO or maintenance or just a reboot, there's a whole lot more affected.
---
I had about a 45 minute discussion (should have
been 10) with someone that works with VMWare
day to day. Granted, ESX server can get wedged, but it is a rare thing. VMWare
attractiveness and explosion is due to ROI
- very easy to prove (many Windows servers
are spinning 10% CPU on average). It took
me a good 20 minutes to try to get the guy
to understand how exactly it could help me,
as ROI isn't a concern when you're banging
the CPUs.
Something like this:
Take a physical server , make 4, 8 guest
OSes. Now when I have a run-away process,
load balance everyone off that box. Perhaps
use VMWare to tell that OS it is only getting
2% of a physical CPU so the process could
spin to its heart's content. Some future
date (way in the future it need be, or during
the day, whatever) I reboot the problem OS.
Galaxy isn't an option. Fast-forward...
hypervisors aren't so bad now (they can do
things like limit a guest OS to 2% of a physical
CPU). With their 4-5% overhead, hypervisors
aren't a nuisance, with CPUs gaining speed
so quickly. Not unlike the whole TOE
for iSCSI gotcha... lost in the noise.
No more TOE discussions!
Of course my pissy co-worker "but when 10 gig
comes along, it becomes more of an issue."
and round and round we go.
Rob
03-07-2007 11:45 AM
Re: MNOWAIT link - undefined symbol
Don't waste time; reboot the system. Your work is appreciated, but without a reboot this process cannot be killed.
This is a bug, and HP tries to provide fixes for bugs from time to time so that things work better and more smoothly.
thanks
Atul sardana
03-08-2007 06:26 PM
Re: MNOWAIT link - undefined symbol
To find the IRP for the pending IO operation on DSA1300:, you need to start as follows:
$ ANAL/SYS
SDA> READ SYSDEF
SDA> SHOW DEV DSA1300:
I/O data structures
-------------------
DSA1300 RZxx UCB: 8xxxxxxx
...
Search for this address in nonpaged pool (replace 8xxxxxxx with the real UCB address of DSA1300):
SDA> SEARCH @mmg$gl_npagedyn:@MMG$GL_NPAGNEXT 8xxxxxxx
...
Match at FFFFFFFF.8yyyyyyy 8xxxxxxx
...
For every match found, issue the following command:
SDA> EXA 8yyyyyyy-irp$l_ucb;10
and post the data found. We will be looking for an IRP in nonpaged pool, which points back to the UCB of DSA1300. The 3rd longword in this IRP should contain: 000A0240
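A note for readers on why 000A0240 is the value to look for: a minimal Python sketch, assuming the standard header that VMS dynamic data structures carry at offset 8 (a word-sized structure size, then a type byte, then one trailing byte). Under that assumption the longword reads as size 0240 and type 0A. The claim that 0A is the IRP type code (DYN$C_IRP) is an assumption here, so verify it against SYSDEF on your own system. `decode_header_longword` and `looks_like_irp` are hypothetical helpers, not SDA functions.

```python
import struct

ASSUMED_IRP_TYPE = 0x0A  # assumed value of DYN$C_IRP; check SYSDEF


def decode_header_longword(longword: int) -> dict:
    """Unpack the longword at offset 8 of a pool structure (assumed layout)."""
    # Little-endian: 16-bit size, 8-bit type, 8-bit trailing byte.
    raw = struct.pack("<I", longword)
    size, struct_type, last_byte = struct.unpack("<HBB", raw)
    return {"size": size, "type": struct_type, "last_byte": last_byte}


def looks_like_irp(longword: int) -> bool:
    """True when the type byte matches the assumed IRP type code."""
    return decode_header_longword(longword)["type"] == ASSUMED_IRP_TYPE


hdr = decode_header_longword(0x000A0240)
print(f"size={hdr['size']:#06x} type={hdr['type']:#04x}")
# prints "size=0x0240 type=0x0a"
```

A SEARCH match whose third longword does not decode this way is most likely some other structure that merely contains the UCB address, and can be skipped.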
Note that this could be a lengthy process through this forum. And it's not guaranteed to be successful. But as long as you need or want to keep this system up with the looping process, we can continue troubleshooting...
Volker.
03-08-2007 11:51 PM
Re: MNOWAIT link - undefined symbol
Purely Personal Opinion
03-09-2007 01:59 PM
Re: MNOWAIT link - undefined symbol
---
Of course, there is UCB$L_IRP, as well as UCB$L_IOQFL, which may or may not be relevant.
For a SCSI disk device using DKDRIVER (a driver that can handle more than one I/O at a time), UCB$L_IRP is explicitly set (whilst holding the correct synchronisation -- typically the fork lock of the UCB) during I/O setup and completion. If there is more than one I/O in the driver at a time, UCB$L_IRP can point to any of them, so the odds of it pointing to the "lost" I/O on a busy device are slim.
-- Rob (who used to spend a fair amount of time worrying about these things)
03-09-2007 02:35 PM
Re: MNOWAIT link - undefined symbol
I've also seen cases where there is no lost IRP, but where a counter increment-decrement sequence got entangled and a decrement was lost. This case can arise in driver error handling and related quota processing.
I've been known to brute-force this correction, and bomb the count into a semblance of correctness. This mid-flight kernel correction may or may not have a beneficial effect toward the eventual achievement of long-term system stability, as they say. The IRP itself is lost.
If these IRPs are disappearing with any sort of regularity, it's time to have a careful look at the kernel or driver or other related code involved.
If it's not your code, check first for ECOs, then contact the vendor.
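For readers, the lost-count case described above can be illustrated with a toy model. This is a minimal Python sketch, not VMS code: `BufferedIoQuota`, `do_io` and the `buggy_error_path` flag are all hypothetical, and the model only assumes that rundown waits until every charged unit of quota has been returned. Whether the real counter counts up or down, the effect is the same: one missed adjustment on an error path leaves the process looking as if it has an I/O pending forever.

```python
from dataclasses import dataclass, field


@dataclass
class BufferedIoQuota:
    """Toy model of a per-process I/O quota (hypothetical, not VMS code)."""
    limit: int
    remaining: int = field(init=False)

    def __post_init__(self) -> None:
        self.remaining = self.limit

    def charge(self) -> None:
        """An I/O is issued: one unit of quota is taken."""
        if self.remaining == 0:
            raise RuntimeError("quota exhausted")
        self.remaining -= 1

    def credit(self) -> None:
        """An I/O completes: the unit is returned."""
        self.remaining = min(self.limit, self.remaining + 1)

    def can_run_down(self) -> bool:
        """Rundown in this model waits until nothing is outstanding."""
        return self.remaining == self.limit


def do_io(quota: BufferedIoQuota, fails: bool, buggy_error_path: bool) -> None:
    """Issue one I/O; a buggy error path skips the matching credit."""
    quota.charge()
    if fails and buggy_error_path:
        return  # the bug: the error path forgets to return the quota
    quota.credit()


quota = BufferedIoQuota(limit=150)
do_io(quota, fails=False, buggy_error_path=True)  # normal completion
do_io(quota, fails=True, buggy_error_path=True)   # error path loses one unit
print(f"{quota.remaining}/{quota.limit}", quota.can_run_down())
# prints "149/150 False"

quota.remaining = quota.limit  # the "brute-force" correction of the count
print(quota.can_run_down())
# prints "True"
```

The 149/150 here mirrors the buffered I/O count in the SDA output at the top of the thread. In this model, resetting the count is enough for rundown to proceed; on a live system that is only safe once analysis has shown that no real I/O is still outstanding.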