
mlock() memory consumption

Pal_5
Advisor


I wrote a simple program (mlock.c) that allocates and mlocks 512 MB of RAM. Here it is:
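#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    size_t len = 512UL * 1024 * 1024;   /* 512 MB */
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    char *addr = mmap(0, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    if (addr == MAP_FAILED) return 1;   /* mapping failed */
    if (mlock(addr, len) != 0) return 1; /* lock the whole mapping into RAM */
    pause();                            /* stay alive so swapinfo can be checked */
    return 0;
}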

Before I ran it, swapinfo showed:
> swapinfo -mat
              Mb      Mb      Mb   PCT  START/      Mb
TYPE       AVAIL    USED    FREE  USED   LIMIT RESERVE  PRI  NAME
dev         8192       0    8192    0%       0       -    1  /dev/vg00/lvol2
reserve        -    2175   -2175
memory     24468    7454   17014   30%
total      32660    9629   23031   29%       -       0    -

Then I started the compiled mlock.c and checked the memory consumption by swapinfo again.

> swapinfo -mat
              Mb      Mb      Mb   PCT  START/      Mb
TYPE       AVAIL    USED    FREE  USED   LIMIT RESERVE  PRI  NAME
dev         8192       0    8192    0%       0       -    1  /dev/vg00/lvol2
reserve        -    2703   -2703
memory     24468    7966   16502   33%
total      32660   10669   21991   33%       -       0    -

So total used increased from 9629 MB to 10669 MB. The delta is 1040 MB, about 1 GB, which is twice the allocated and mlocked memory.
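To see where the extra usage lands, the USED column of the two snapshots can be compared row by row. The following is only a minimal sketch of that arithmetic; the struct and function names are made up for illustration and are not part of swapinfo or any HP-UX API.

```c
#include <stdio.h>

/* USED column of `swapinfo -mat`, in MB (hypothetical helper type). */
struct swap_used {
    long dev;      /* device swap in use */
    long reserve;  /* swap reserved but not yet used */
    long memory;   /* memory (pseudo) swap in use */
};

/* Sum of the three USED entries; this matches the "total" row. */
static long total_used(struct swap_used s) {
    return s.dev + s.reserve + s.memory;
}

/* Print how much each row grew between two snapshots. */
static void print_delta(struct swap_used before, struct swap_used after) {
    printf("reserve: +%ld MB\n", after.reserve - before.reserve);
    printf("memory:  +%ld MB\n", after.memory - before.memory);
    printf("total:   +%ld MB\n", total_used(after) - total_used(before));
}
```

Calling print_delta() with the values above ({0, 2175, 7454} before and {0, 2703, 7966} after) gives +528 MB for reserve, +512 MB for memory and +1040 MB for the total, so both the reserve row and the memory row grew by roughly the size of the locked region.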

Is this OK?
If the memory is locked into RAM, why does the system allocate the same amount for pseudo swapping?



12 REPLIES
Dennis Handly
Acclaimed Contributor

Re: mlock() memory consumption

>why does the system allocate the same amount for pseudo swapping?

It has to use up pseudo-swap because that memory can't be used elsewhere. It seems it also reserves device swap, perhaps because you did a mmap(2) before the mlock(2) and that reserved it.
Pal_5
Advisor

Re: mlock() memory consumption

> Perhaps because you did a mmap(2) before the mlock(2) and that reserved it.


???????
How to mlock anything _before_ allocation? In this case HP-UX _must_ deliver SIGSEGV. But the issue does not depend on mmap().
The same problem has been observed with shmget() also.


Dennis Handly
Acclaimed Contributor

Re: mlock() memory consumption

>How to mlock anything _before_ allocation?

Not sure if you can? mmap(2) has MAP_NORESERVE.

>But the issue does not depend on mmap(). The same problem has been observed with shmget() also.

Right, both get the space, then later mlock(2) it. And since you can use munlock(2) or only lock part of it, it seems the kernel requires swap for the whole thing.
Pal_5
Advisor

Re: mlock() memory consumption

>How to mlock anything _before_ allocation?

You must have _valid_ (mapped) address (pointer in C) for mlock. So mmap must be called _before_ mlock.

I guess the issue is a simple example of HP programmers' laziness.

When you allocate memory in HP-UX (with reservation, for Dennis) by mmap() or otherwise, the free VM is decremented and the reserved count is incremented.

Later, when it is mlocked, free _memory_ is decremented and used _memory_ is incremented again.
In this case physical memory consumption should also be adjusted. But the counter for reserved memory is _not_ decremented; somebody has forgotten to write this part of the code.

So the HP-UX VM handling code is not clever enough, and this is a classic "the left hand doesn't know what the right hand is doing" situation. Quite funny for a 25+ year old operating system.
Pal_5
Advisor

Re: mlock() memory consumption

> In this case physical memory consumption also should be adjusted.

Sorry, not physical but the _reserved_ should be adjusted.

No memory allocation/reservation has taken place in mlock, so the overall figure (total used in swapinfo -t) must stay the same.

Dennis Handly
Acclaimed Contributor

Re: mlock() memory consumption

>I guess the issue is a simple example of HP programmers' laziness.

It depends on the design that was implemented. If you want more details on the design tradeoffs you'll need to wait until Don chimes in or talk to the Response Center.

>Later when it is mlocked then free memory is decremented and used memory is incremented again.

That's correct, this space can't be used for pseudo swap.

>In this case reserved memory consumption also should be adjusted.

(You mean reserved swap or reserved VM.)

>But the counter for reserved memory is not decremented, somebody has forgotten to write this part of code.

No, the reserved swap should be there in case munlock(2) is done. You wouldn't want munlock to fail due to out of swap or to fragment the swap area due to partial munlocks?

>No memory allocation/reservation has been taken place by mlock so total used in swapinfo -t must be the same.

But a chunk of RAM has been consumed/locked and is no longer available.

Have you tried mmap(MAP_NORESERVE)?
Pal_5
Advisor

Re: mlock() memory consumption

>No, the reserved swap should be there in case munlock(2) is done. You wouldn't want munlock to fail due to out of swap or to fragment the swap area due to partial munlocks?

Why? In the case of munlock(), the used memory has to be decreased and the reserved has to be increased at the same time. The total is not changed.

So the accounting/bookkeeping scenario for mlocking 1 GB:
mmap(1GB) -> reserved + 1GB
mlock(1GB) -> memory + 1GB, reserved - 1GB
munlock(1GB) -> memory - 1GB, reserved + 1GB
munmap(1GB) -> reserved - 1GB

Same for shmget/shmctl calls.
mlock and munlock do not change the total VM consumption, i.e. the disk + reserve + memory swap total.
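
The bookkeeping proposed above can be written down as a toy model and compared with what swapinfo actually showed. This is only a sketch: the struct and function names are invented for illustration and say nothing about how the real HP-UX kernel code is organised.

```c
/* Toy counters in GB, mirroring the "reserve" and "memory" rows. */
struct vm_counters {
    long reserved;
    long memory;
};

static long vm_total(struct vm_counters c) {
    return c.reserved + c.memory;
}

/* Proposed scheme: locking moves the charge from reserved to memory. */
static void proposed_mmap(struct vm_counters *c, long gb)    { c->reserved += gb; }
static void proposed_mlock(struct vm_counters *c, long gb)   { c->memory += gb; c->reserved -= gb; }
static void proposed_munlock(struct vm_counters *c, long gb) { c->memory -= gb; c->reserved += gb; }
static void proposed_munmap(struct vm_counters *c, long gb)  { c->reserved -= gb; }

/* Behaviour seen in the swapinfo output: mlock charges memory but the
 * reservation made at mmap time stays in place. */
static void observed_mlock(struct vm_counters *c, long gb)   { c->memory += gb; }
```

In the proposed scheme vm_total() stays at 1 between mmap and munmap of 1 GB, whether or not the region is locked. With observed_mlock() in place of proposed_mlock(), the total goes to 2 while the region is locked, which is the doubling seen in swapinfo.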

> Have you tried mmap(MAP_NORESERVE)?

Why? Do you think Oracle uses NORESERVE when it allocates shared memory for the SGA? Is there any hidden Oracle parameter for "noreserve" when an mlocked SGA is used?

So if I have a system with 16 GB RAM, 4 GB swap and pseudo swap is used then the maximum mlocked SGA can be 10 GB?


Dennis Handly
Acclaimed Contributor

Re: mlock() memory consumption

>In the case of munlock(), the used memory has to be decreased and the reserved has to be increased at the same time.

What if you are out of swap space? You may not have any to reserve. How do you handle VM fragmentation for mlock/munlock?
The simple way to implement these is by not giving up the device swap reservation.

>So if I have a system with 16 GB RAM, ...

Again, you need to contact the RC or wait for Don.
Pal_5
Advisor

Re: mlock() memory consumption

> What if you are out of swap space?

It can also happen if the mlocked memory reservation is counted twice (reserved and memory). In fact, the VM will be exhausted much earlier.
And in this case, do you want to unlock the Oracle SGA to free the memory swap space counted for the mlocked SGA? It sounds funny. And how do you want to achieve this?

> Again, you need to contact the RC or wait for Don.

Contacting RC? And waiting for L2-L3-...-LN level escalations? It is the last option.
Who is Don? When will he be available?

Dennis Handly
Acclaimed Contributor

Re: mlock() memory consumption

>Contacting RC? And waiting for L2-L3-...-LN level escalations? It is the last option.

That's the only way you'll get a definitive answer. (Especially since you are awfully lax with assigning points.)

>Who is Don?

A kernel engineer. I dropped him a pointer.
Don Morris_1
Honored Contributor
Solution

Re: mlock() memory consumption

So, a few comments:

1) Just as a quick note - MAP_NORESERVE doesn't help things any here. The lazy swap reservation gets the swap in the middle of the mlock path but gets treated the same way as the swap reservation in the first place -- if disk swap is available, it will reserve there [even though the lockable stole from swapmem anyway, yes].

2) I'm not going to dig through every design note ever on this to say with 100% certainty why it is the way it is. There are benefits to this design in that it ensures simple and clean backout operation, without risk of resources being unavailable in operations you can't fail for lack of resources (like munlock() or, more interestingly, munmap() or exit(); and before you say "Well, why would you need resources if you're going away?", keep in mind that a single mlock on a shared object which has many other attached references means objects don't always go away on a single exit()). It also simplifies the model with regards to having to track what is swap-backed versus what isn't (private page copies within a file versus file-backed shared pages, etc.).

3) One other factor here that is less obvious is that preserving memory swap for lockable memory clients and most importantly for kernel memory allocations is a higher priority than preserving swap reservation space as a general rule. (If you run out of memory swap and need more kernel space to make progress, you're out of luck - so the system tries to avoid that). This is why memory swap is reserved after all disk/FS swap reservation space is used and why the system will actually migrate memory swap backed reservations to disk/FS backed reservations where it can over time if the swap needs peak and then recede.

4) Harkening back to (1) for a moment, the system is smart enough not to double-book memory swap, however -- as if the object is already using memory swap, there's no real gain in holding more as long as it isn't incorrectly released. If there were a mmap() or shm or whatnot flag for forcing memory swap reservations, you'd see your reservations be only what you need. Since there isn't, of course, you can't see that unless you want to do an experiment where you consume all your disk swap first. Since this wouldn't be reliable in practice -- I wouldn't recommend you bother.

5) So, after all those comments and back on point -- while I can see why it was done this way and wouldn't call it just "laziness" (there's a clear factor of diminishing risk in this sort of complex accounting) -- I would personally agree that this is suboptimal and the system can do better. As such, I'd ask that you do suffer through the Support process enough to get them to log this as an actual customer found issue (it is good to have a real customer on this sort of thing so the trade offs between the risk of changing the behavior and the benefits of resolving the issue can be weighed correctly).
Pal_5
Advisor

Re: mlock() memory consumption

?