Operating System - HP-UX
SOLVED
Ed Loehr
Advisor

Understanding 11.23 memory metrics

I'm trying to understand why we ran out of memory before I thought we would.

We have an Itanium box running 11.23 with 64GB of RAM. It runs multiple heavily used PostgreSQL clusters, which recently hit multiple "out of memory" and "Deferred swap reservation failure" errors. We automatically sample glance's GBL_MEM_UTIL every 60 seconds, and the highest sampled value was 89.1% (i.e., ~7GB of RAM still free). We don't think any process actually requested more than 7GB of RAM, though we could have overlooked something.
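For reference, the sampling is just a glance adviser loop, roughly like the sketch below. The flags and syntax-file format are from memory, so treat them as an approximation and check the glance man page on your system.

# mem.syntax - adviser script that prints one line per sample
print GBL_STATTIME, "  ", GBL_MEM_UTIL, "  ", GBL_MEM_CACHE_UTIL

# run glance in adviser-only mode on a 60-second interval, appending to a log
glance -adviser_only -syntax mem.syntax -j 60 >> /var/tmp/memsample.log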

My question: Why did this occur so long before we got to 100% of RAM utilization according to GBL_MEM_UTIL?

Our dbc_max_pct is 10%, dbc_min_pct is 3%, and GBL_MEM_CACHE_UTIL is constantly at 10% (all the cache is being used).

We've since off-loaded much of the memory demand that was present at the time of the errors. Here's the swapinfo -tam output captured at that point:

$ swapinfo -tam
             Mb      Mb      Mb   PCT  START/      Mb
TYPE      AVAIL    USED    FREE  USED   LIMIT RESERVE  PRI  NAME
dev        4096       0    4096    0%       0       -    1  /dev/vg00/lvol2
reserve        -    4096   -4096
memory    65524   41731   23793   64%
total     69620   45827   23793   66%       -       0    -

TIA.

Ed
9 REPLIES
Bill Hassell
Honored Contributor
Solution

Re: Understanding 11.23 memory metrics

First, your dbc_max_pct is way, way too high for 64GB: it means your buffer cache can grow to over 6GB. The maximum recommended size is 1000 to 1200 MB. Anything more than about 1500 MB and the kernel starts burning a lot of cycles just to manage the large cache. Change dbc_max_pct and dbc_min_pct to 3 and 2.
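A rough sketch of the change with kctune on 11.23 (setting dbc_min_pct first keeps min <= max throughout; check with kctune -v whether the change is dynamic on your box or needs a reboot):

# check the current settings and whether they are dynamic
kctune -v dbc_max_pct
kctune -v dbc_min_pct

# shrink the buffer cache limits: 2% min, 3% max of 64GB
kctune dbc_min_pct=2
kctune dbc_max_pct=3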

About running out of memory: the first thing to verify is that the processes are allowed to grow as large as they need to. maxdsiz and maxdsiz_64 control these fences. Start by increasing maxdsiz to 1700 MB and maxdsiz_64 to perhaps 4GB or so. Then make sure there is no ulimit setting in a startup script that prevents a process from asking for the memory it needs.
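Something along these lines; the values are in bytes, and on 11.23 the 64-bit tunable typically shows up as maxdsiz_64bit rather than maxdsiz_64, so confirm the exact name with kctune first. The 'postgres' account below is just an example.

# 32-bit data segment fence: ~1700 MB (value in bytes)
kctune maxdsiz=1782579200

# 64-bit data segment fence: 4 GB (value in bytes); tunable name may be maxdsiz_64bit
kctune maxdsiz_64bit=4294967296

# verify no startup script clamps the limits for the database owner
# (assuming the database runs as a 'postgres' user)
su - postgres -c "ulimit -a"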


Bill Hassell, sysadmin
Bill Hassell
Honored Contributor

Re: Understanding 11.23 memory metrics

Sorry, I overlooked your total swap area, which is only 4GB. HP-UX needs to reserve (not necessarily use) swap space under several conditions, and swap space is also consumed by memory-mapped files. I would add at least 20GB of swap space and see if the errors go away. Glance will show the reservation area followed by the used area on its bar graphs. You can add the swap space temporarily with swapon (no reboot needed). If all works well, make it permanent with an entry in /etc/fstab; otherwise the space is removed at the next reboot.
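A sketch of what that looks like, assuming a volume group with enough free space; the volume group and logical volume names here are made up:

# create a 20 GB logical volume for swap (size is in MB)
lvcreate -L 20480 -n lvswap2 /dev/vg01

# enable it on the fly, no reboot needed
/usr/sbin/swapon /dev/vg01/lvswap2

# confirm it shows up
swapinfo -tam

# to keep it across reboots, add a matching swap entry to /etc/fstab
# (see fstab(4) for the exact fields); otherwise it disappears at the
# next reboot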


Bill Hassell, sysadmin
James R. Ferguson
Acclaimed Contributor

Re: Understanding 11.23 memory metrics

Hi Ed:

Another way to look at this: without pseudoswap enabled ('swapmem_on' = 1), you would need 64GB of device swap to utilize all of your physical memory; the "memory" line in your 'swapinfo' output shows that you do have it enabled. This is simply the rule of swap "reservation".

With pseudoswap enabled, 75% of your physical memory can be counted toward the kernel's swap reservation requirement.

Hence, (0.75 * 64) + 4 = 52 GB.

Thus, with only 4GB of device swap, you can only reserve ~52GB of process space, wasting about 12GB (64 - 52) of memory.
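As a one-liner sanity check of that arithmetic (75% of RAM counted as pseudoswap, plus the 4GB device area):

# reservable space ~ (0.75 * physical RAM) + device swap, with swapmem_on = 1
echo "0.75 * 64 + 4" | bc        # prints 52.00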

Regards!

...JRF...
Ed Loehr
Advisor

Re: Understanding 11.23 memory metrics

Thanks, Bill, James.

I'd really like to understand a little more about cache sizing strategy.

I recall that disk access is typically 1000 times more expensive than memory access. I hear and accept that a large OS cache causes the kernel to burn a lot of cycles managing the cache.

The relevant piece of hard data I seem to be missing is the performance curve of "disk access time" vs. "cache access time" (where cache access time includes all cache management time) as the size of the cache increases. If the cache management time for a 6GB cache is still lower than the disk access time, then really, really large OS caches would seem to make a lot of sense, right up to the point where CPU bottlenecks become an issue.

Are you saying the sweet spot in *that* curve is 1000-1200 MB? Can anyone point me to any published research on that curve for HP-UX?

TIA.

Ed

Ed Loehr
Advisor

Re: Understanding 11.23 memory metrics

I hear the advice here that I just need huge DB caches and a small OS cache. That's contrary to the advice of the DB gurus; they advocate letting the OS do all the caching on the theory that it's better/faster than application shared memory. Maybe that's just not applicable to 11.23?

TIA.

Ed
A. Clay Stephenson
Acclaimed Contributor

Re: Understanding 11.23 memory metrics

You aren't really going to find any published "sweet spot" data, nor should you trust it if you did. It's one of those "it depends" answers, and it really, really depends upon your hardware (e.g. cache-centric disk arrays) and your OS. I have never used your database software, but with a few exceptions (one that comes to mind instantly is Informix SE, as opposed to their higher-end product Informix OnLine), almost all databases benefit from huge shared memory areas in which the database engine does its own caching, paired with a much smaller OS buffer cache. As a general rule, the "sweet spot" on boxes with large amounts of memory is somewhere in the 800-1600 MiB range, and I would start at 1600 MiB. In almost all cases, you are going to find that a 100-300 MiB deviation from that value will not have a huge impact on performance.

The benefits of larger UNIX buffer caches diminish because much above this range the searches within the buffer cache begin to take more and more time. (One of the benefits of 11.11 and later versions of HP-UX is that the buffer cache searches are hashed and are thus more efficient than their earlier counterparts; as a comparison, the 10.20 sweet spot was much nearer 400 MiB, although again, you had to measure.) You should also note that most disk drives, and certainly almost all disk arrays, have on-board caches which also buffer the I/O, and this tends to diminish the role of very large OS buffer caches as well.

The only way to really determine the sweet spot for your platform, disks, and application is to measure, but the values that have been suggested will be rather close to optimum.
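If you do want to measure it on your own box, the simplest thing is to watch the buffer cache hit ratios under a representative load while you step dbc_max_pct up or down, e.g.:

# one hour of samples at 60-second intervals;
# %rcache and %wcache are the read and write hit ratios
sar -b 60 60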
If it ain't broke, I can fix that.
Ed Loehr
Advisor

Re: Understanding 11.23 memory metrics

Sorry, A. Clay, didn't mean to give you zero points, not sure what happened there. Your answer was helpful. Thanks.
Steven E. Protter
Exalted Contributor

Re: Understanding 11.23 memory metrics

Shalom Ed,

I think you will find that reducing dbc_max_pct to 3 or 4% and giving that memory to the Postgres database's own settings will be helpful.

I've found that, in general, if the database has its own memory settings, like the Oracle SGA, you will do better increasing those and decreasing the OS buffer cache.

Database settings are the finer tool for the job; the OS buffer cache is crude and general.
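For Postgres that would mean something along these lines in postgresql.conf. The figures are placeholders to illustrate the shift rather than a recommendation, and depending on your Postgres version shared_buffers may have to be given as a count of 8kB pages instead of with a unit:

# give the database engine its own large cache
# (make sure the shmmax kernel tunable allows a shared memory segment this big)
shared_buffers = 2GB

# a hint to the planner about the total cache it can expect,
# i.e. shared_buffers plus whatever the OS still caches
effective_cache_size = 8GB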

At this point, even though your Postgres database clusters are heavily used, your system does not seem to be under any stress.

Check vmstat and tools like this one:

http://www.hpux.ws/system.perf.sh
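For example, a quick way to watch for paging pressure:

# twelve samples, five seconds apart; sustained non-zero po (page-outs)
# and a steadily shrinking free column are the real signs of memory pressure
vmstat 5 12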

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
A. Clay Stephenson
Acclaimed Contributor

Re: Understanding 11.23 memory metrics

No problem about the points, Ed, although I did wonder what I could possibly have said to make you mad, since I was nicer in this reply than I am in some others. About the only guys and gals I don't have much patience with are those who are lazy.
If it ain't broke, I can fix that.