Operating System - HP-UX

Troy Nightingale
Occasional Advisor

problem troubleshooting an application

We have an application with a particular job that fails every evening with the following error:
Service Process [IntgrService_ahphp211] output error [/usr/lib/hpux64/dld.so: Mmap failed for the library : Not enough space. ].
I can't reproduce the error on my Development server....only in Production. Both servers are Itanium, 64 bit, Montecito, 11iv2.
It appears that this job fails most often when the machine is nearly idle.


Any ideas?
Steven E. Protter
Exalted Contributor

Re: problem troubleshooting an application

Shalom,

It appears that a file write or an attempt to swap a process from memory to disk is failing.

swapinfo -tam

I'd like to see if you are pushing limits.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Troy Nightingale
Occasional Advisor

Re: problem troubleshooting an application

Here is the swapinfo output:
# swapinfo -tam
             Mb      Mb      Mb   PCT  START/      Mb
TYPE      AVAIL    USED    FREE  USED   LIMIT RESERVE  PRI  NAME
dev        8192       0    8192    0%       0       -    1  /dev/vg00/lvol2
reserve       -    6706   -6706
memory    48895   11790   37105   24%
total     57087   18496   38591   32%       -       0    -
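
For anyone reading the table above, here is a minimal sketch of how the "total" row can be checked by script. It assumes the whitespace-separated layout shown in this post (values in MB); the function name parse_total is made up for illustration and is not part of any HP-UX tool.

```python
# Sketch: read the "total" row of `swapinfo -tam` output (values in MB).
# The row below is copied from the output posted in this thread.
SWAPINFO_TOTAL = "total 57087 18496 38591 32% - 0 -"


def parse_total(line):
    """Return (avail_mb, used_mb, free_mb) from a swapinfo 'total' row."""
    fields = line.split()
    if not fields or fields[0] != "total":
        raise ValueError("not a swapinfo 'total' row")
    avail, used, free = (int(x) for x in fields[1:4])
    return avail, used, free


avail, used, free = parse_total(SWAPINFO_TOTAL)
pct_used = round(100 * used / avail)
print(f"{free} MB free of {avail} MB ({pct_used}% used)")
```

With the numbers posted here, used plus free adds up to the available total, and roughly 38 GB remains unreserved.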
Dennis Handly
Acclaimed Contributor

Re: problem troubleshooting an application

>It appears that this job fails most often when the machine is nearly idle.

This is hard to believe. You get that error when you are out of swap space. If you are idle, you shouldn't be using much. In fact, your swapinfo says you have 38 GB free.

Of course, you could have a bug in dld; you could try the latest dld patch, PHSS_37947.

Do you have more in the message, where it mentions the name of the shlib in question?
Troy Nightingale
Occasional Advisor

Re: problem troubleshooting an application

I am downloading the suggested patch and will test.
Troy Nightingale
Occasional Advisor

Re: problem troubleshooting an application

Well, I have applied the patches as suggested. We have a new message when it fails now.
Service Process [IntgrService_ahphp211] output error [/usr/lib/hpux64/dld.so: Cannot map data for library: mmap(0x0, 0xb2f0, 0x3, 0x42, 4, 0x0) returns Not enough space. ].

Don Morris_1
Honored Contributor

Re: problem troubleshooting an application

Ok, so dld is mapping the shared library using MAP_PRIVATE (good piece of data compared to the prior output). It isn't using MAP_FIXED, so the problem shouldn't be that another virtual object is in the way.
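
As a rough illustration of how the mmap arguments in the new message break down, here is a small decoding sketch. The bit values in the two tables are assumptions inferred from this thread (0x42 being read as MAP_PRIVATE|MAP_SHLIB with no MAP_FIXED); they are not copied from an HP-UX header, so check <sys/mman.h> on the system before relying on them. The decode helper is hypothetical.

```python
# Sketch: decode mmap(0x0, 0xb2f0, 0x3, 0x42, 4, 0x0) from the dld message.
# ASSUMPTION: these bit values are inferred from the thread, not taken from
# the HP-UX <sys/mman.h>; verify them on the target system.
PROT_BITS = {0x1: "PROT_READ", 0x2: "PROT_WRITE", 0x4: "PROT_EXEC"}
MAP_BITS = {
    0x01: "MAP_SHARED",
    0x02: "MAP_PRIVATE",
    0x04: "MAP_FIXED",
    0x40: "MAP_SHLIB",
}


def decode(value, table):
    """Name every known bit set in value; show unknown bits as hex."""
    names = [name for bit, name in sorted(table.items()) if value & bit]
    leftover = value & ~sum(table)  # bits not covered by the table
    if leftover:
        names.append(hex(leftover))
    return "|".join(names) or "0"


addr, length, prot, flags, fd, offset = 0x0, 0xB2F0, 0x3, 0x42, 4, 0x0
print("length:", length, "bytes")
print("prot:  ", decode(prot, PROT_BITS))
print("flags: ", decode(flags, MAP_BITS))
```

Under those assumed values, the request is a private, read/write mapping of about 45 KB with no fixed address.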

ENOMEM on a MAP_PRIVATE|MAP_SHLIB request stems from:

1) length being too large to architecturally fit in the process space. Length here is 0xb2f0, so not the problem.

2) Process hitting the rlimit (man setrlimit) for the total Address Space. Unlikely, since this defaults to Infinity and only if you change it [which would propagate to children] would it be lower.

3) Swap reservation failed [since this is MAP_PRIVATE] as noted. Already been discussed.

4) Process used mlockall(MCL_FUTURE) and insufficient lockable memory is available. The process would need to be privileged for this to occur, so you should have some idea whether it uses lockable memory. Even if the system is idle, this is a possibility, since most folks are running Oracle and Oracle is a big consumer of lockable memory for the SGA. The swapinfo output doesn't lean that way, though, since memory locking steals from memory swap and you have plenty of memory swap.

5) Error when requesting the file be mapped from the Filesystem to VM. (Unlikely, especially for a shared library unless you've seen other filesystem errors and haven't run a fsck).

6) Insufficient kernel memory for metadata to describe the object. If the system is truly idle, I doubt this is the case.

7) Insufficient free private virtual address space. Since I think hpux64/dld.so [Dennis, feel free to correct me if I'm wrong here] implies this is a 64-bit process, I'd be more than a little surprised if this occurred... you'd have to eat an extremely large amount of private virtual address space to hit this on IPF (in the PB range).
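
Cause 2 in the list above is cheap to rule out. A minimal sketch, assuming a Python interpreter whose standard resource module exposes RLIMIT_AS on the host (that availability on HP-UX is an assumption here); the describe_limit helper is hypothetical. Because limits are inherited by children, it only tells you something if it is run from the same environment that launches the failing job.

```python
# Sketch: report the address-space rlimit of the current process.
# ASSUMPTION: the Python build on the host provides resource.RLIMIT_AS.
import resource


def describe_limit(value):
    """Render an rlimit value as text, treating RLIM_INFINITY as unlimited."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value} bytes"


soft, hard = resource.getrlimit(resource.RLIMIT_AS)
print("RLIMIT_AS soft:", describe_limit(soft))
print("RLIMIT_AS hard:", describe_limit(hard))
```

If both values come back unlimited, the rlimit can be crossed off the list.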

None of these seem terribly likely... so scanning patches -- do you have PHKL_37653 already? And if not, do you have PHKL_35448 and PHKL_35230? It doesn't look quite the same (since there's no use of MAP_FIXED on this particular call) but this could be a variant of QXCR1000581962 where a prior unmapping of the file isn't being recognized as reusable for this new private mapping. That's about the only thing that leaps to mind.

If you already have the PHKL_37653 patch (or are before the other two patches), I think we'd probably need tusc output and perhaps output via pstat_vminfo to get an idea of the process virtual layout. I'm attaching a program you could run against the process in question before the dld / mmap call (assuming it isn't dying just trying to start up, in which case the tusc output alone is likely to be the best we'll get). Compile it as cc +DD64 -o object_dump object_dump.c and run it with the -v -v -pid option; you'll need the high verbosity to get the virtual ranges for each object.
Troy Nightingale
Occasional Advisor

Re: problem troubleshooting an application

Thanks for all your help. I have updated patches, checked numerous kernel parameters, etc. I have now moved this into the vendor's application support area, since the only thing that changed between when things worked and when they didn't was an application upgrade. They have now duplicated the issue and will hopefully implement a solution soon.