HPUX 11.3 swchunk & pseudo swap sanity check.

Honored Contributor

HPUX 11.3 swchunk & pseudo swap sanity check.

I've got a VM with 32 GB of RAM allocated to it. Physical swap is 4 GB. However, I don't see pseudo swap enabled via a kernel parameter as in HPUX 11.11. That parameter seems to be gone, and I guess pseudo swap is just on by default now.

$> swapinfo -tam
dev        4096       0    4096    0%       0       -    1  /dev/vg00/lvol2
reserve       -     796    -796
memory    31163    4741   26422   15%
total     35259    5537   29722   16%       -       0    -

So, the above looks right.
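
As a quick cross-check of the listing above, here is a minimal Python sketch that re-adds the dev, reserve and memory rows and compares them with the total row. The figures are copied from the swapinfo output in this post; treating the "-" in the reserve row as 0 is an assumption made only for the sum.

```python
# Rows copied from the swapinfo -tam output above: (avail, used, free) in MB.
# Assumption: the "-" in the AVAIL column of the reserve row counts as 0.
rows = {
    "dev":     (4096, 0, 4096),
    "reserve": (0, 796, -796),
    "memory":  (31163, 4741, 26422),
}
total_reported = (35259, 5537, 29722)  # the "total" row as printed

# Sum each column across the three rows.
total_computed = tuple(sum(col) for col in zip(*rows.values()))

print("computed:", total_computed)
print("reported:", total_reported)
print("match:", total_computed == total_reported)
```

If the two tuples match, the total line is just the 4 GB of device swap plus the pseudo-swap (memory) row, less what is already reserved.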

In the kernel right now, swchunk is at the default of 2048.

So, according to the man page ...
(Aside: don't you just love the man pages accessible from the SMH kernel tunables param tool with actual topic tips and discussions? It shows me the commands right on screen so I can make a script? MAN, how cool!) ...
So, back to the man page.
It says that max swap would be 2GB*swchunk*1kB -> that's 2TB! Meaning, it will be a while (<10 yrs by Moore's law) before I need to set swchunk larger to get my swap set up again, like I did in 11.11.

So, just a point of order with swchunk being at the default of 2048 - does my swapinfo -tam look correct for a 32 GB server?

I think it's fine, but I'm not used to setting up a server this large with this little bit of swap... and having not seen any of these but our own ones, I'm just looking for a little confidence with some input. I also don't see my comfortable "at least 1.5 times memory" value that I like as output of the total line from swapinfo -tam.

Does that 2TB figure I got from the man pages via calculation look a bit high to anyone? Seems like a lot, especially since the man pages I mentioned even say that setting this unnecessarily high takes up kernel memory for the structure.

Confirmation? Advice? Your own best practices?

Advice/Review appreciated.
We are the people our parents warned us about --Jimmy Buffett
Emil Velez
Honored Contributor

Re: HPUX 11.3 swchunk & pseudo swap sanity check.

swchunk is the size of a chunk of swap space.
maxswapchunks is the maximum number of swchunk-sized pieces the OS can manage. So the max swap you can have is swchunk*maxswapchunks.

With 11.31 and maybe 11.23 these parameters are deprecated or deleted.

The swapinfo looks right since you configured 4 GB of real device swap. If maxswapchunks or swchunk were too small you would not see all of the swap available.

You're not even using the swap yet since you probably have plenty of memory left available.

The amount of swap space really depends on whether your virtual memory usage will be close to or exceeding your physical memory.

Many customers with 32 GB of memory who normally still have 4-5 GB free configure 8 GB of swap space, since they will always have some memory free.
Honored Contributor

Re: HPUX 11.3 swchunk & pseudo swap sanity check.

Emil - you're restating what I used to go by. If you read the new man pages for this parameter for HPUX 11.3, you'll see that what I've posted is exactly what the manual says. And, since it seemed so new to me (and different), that's why I posted the question. And... that's how we got here to this spot today...
We are the people our parents warned us about --Jimmy Buffett
Don Morris_1
Honored Contributor

Re: HPUX 11.3 swchunk & pseudo swap sanity check.

Actually, you're misquoting the man page a little bit. Your formula is right -- but swchunk defaults to (and has a minimum of) 2048, not 1. So that's 2 TB * swchunk, not 2 TB total. By default, v3 supports 4 PB of swap with no modification (and _that's_ what the man page states).
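
To make the arithmetic concrete, here is a minimal Python sketch of that formula. It assumes only what the swchunk man page states: swchunk is counted in 1 KB blocks and the system maximum is 2,147,483,648 swap chunks. The function name is made up for illustration.

```python
KB = 1024
TB = 1024 ** 4
MAX_SWAP_CHUNKS = 2_147_483_648  # 2**31, system maximum quoted from the man page

def max_swap_bytes(swchunk):
    """Total swap the system can manage for a given swchunk (in 1 KB blocks)."""
    return swchunk * KB * MAX_SWAP_CHUNKS

# A swchunk of 1 is below the real minimum; it only shows where "2 TB" comes from.
print(max_swap_bytes(1) // TB, "TB with swchunk = 1")
# The default (and minimum) swchunk of 2048.
print(max_swap_bytes(2048) // TB, "TB with swchunk = 2048")
```

The second figure is 4096 TB, i.e. the 4 PB mentioned above.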

And yes -- that's all accurate.

Regarding the memory consumption part -- yes, it is true [as the man page states] that a larger value of this tunable means a larger cost for the tables it sizes (which are the second level of the swap tracking). However, there's a tradeoff -- if you set this higher, you also need fewer tables by definition -- and the first level table is demand-grown to cover the maximum amount of swap present over the life of the system (this is what maxswapchunks used to control).

So for an example -- take a system which has 4098 MB of swap. The default value of swchunk gives a table coverage at the second level of 2 MB (2048 * 1 KB), so the swap subsystem needs 2049 of these tables. Each table costs 8 bytes per entry, so each table costs 2048*8 bytes or 16,384 bytes. The total table cost is 33,570,816 bytes. [Please don't assume those numbers will never change, of course....]. Additionally, that means the first level table must be able to address all 2049 entries, and the first level entry cost is 64 bytes. So that's an additional 131,136 bytes. Total cost at the default is 33,701,952 bytes.

Now -- what if swchunk were 32768 instead (32 MB per table)? Now we need 128.0625 entries... which means 129 entries (since a partial entry isn't allowed). Each table spans the 32768 range, so each table is (32768*8) in cost or 262,144 bytes. For 129 tables, that's a cost of 33,816,576 bytes. Adding in the top-level table cost (129*64) gives 33,824,832 bytes. This represents an increased cost of 122,880 bytes.

Suppose, however, that this had not required partial tables (say the total swap was 4096 MB, which requires exactly 128 tables with a swchunk of 32768 or 2048 tables with the default swchunk). That gives table costs of (128*262144 = 33554432) and (2048*16384 = 33554432), which isn't surprising: you're representing the same exact space (no partial tables), so you have the same cost. The first level cost, however, is (64*128 = 8192) with swchunk of 32768 and (64*2048 = 131072) with the default.

Now, you've saved 122,880 bytes with the higher value of swchunk. The key to the most savings is to have all your disk and/or FS swap be a strict multiple of swchunk so there are no partial tables, enabling maximum savings at the first level and keeping your costs the same for the second level.
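
The two scenarios above can be re-run with a small Python sketch. It uses only the per-entry sizes quoted in this post (8 bytes at the second level, 64 bytes at the first), which may differ between releases; the function name is hypothetical and this is an illustration of the arithmetic, not of the kernel's actual data structures.

```python
ENTRY_BYTES_L2 = 8    # second-level cost per entry, as quoted above
ENTRY_BYTES_L1 = 64   # first-level cost per entry, as quoted above

def swap_table_cost(total_swap_kb, swchunk):
    """Return (tables, second_level_bytes, first_level_bytes)."""
    # Integer ceiling division: a partial table still costs a whole table.
    tables = -(-total_swap_kb // swchunk)
    second = tables * swchunk * ENTRY_BYTES_L2
    first = tables * ENTRY_BYTES_L1
    return tables, second, first

for swap_mb in (4098, 4096):
    for swchunk in (2048, 32768):
        tables, second, first = swap_table_cost(swap_mb * 1024, swchunk)
        print(f"{swap_mb} MB swap, swchunk={swchunk}: {tables} tables, "
              f"L2={second} L1={first} total={second + first}")
```

For the 4098 MB case this reproduces the totals of 33,701,952 and 33,824,832 bytes; for the aligned 4096 MB case the second-level cost is identical and only the first-level cost differs (131,072 versus 8,192 bytes).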

There's more to it than just this -- second level entries don't cross devices/filesystems, for example - so you can get partial tables even if the total swap is "aligned" with swchunk... this is just the basics.
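
A rough Python sketch of that per-device effect, assuming the simplest possible model: each device is rounded up to whole tables on its own. The two-device split is a made-up example and the function name is hypothetical.

```python
def tables_needed(device_sizes_kb, swchunk):
    """Count second-level tables when each device is rounded up separately."""
    # Integer ceiling division per device, then summed.
    return sum(-(-size // swchunk) for size in device_sizes_kb)

MB = 1024  # KB per MB

one_device = [4096 * MB]
two_devices = [2064 * MB, 2032 * MB]  # hypothetical split, same 4096 MB total

for swchunk in (32768, 2048):
    print(f"swchunk={swchunk}: "
          f"one device -> {tables_needed(one_device, swchunk)} tables, "
          f"two devices -> {tables_needed(two_devices, swchunk)} tables")
```

With swchunk at 32768 the split costs one extra 32 MB table even though the total is aligned, while at the default of 2048 both layouts need the same number of tables.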

In practice and in most cases -- this is a lot of work and planning for very little benefit. The higher levels of swchunk are more likely to result in partial tables and the partial costs are likely to outweigh the savings at the first level. Hence the defaults are to favor smaller second level tables (which gives more likelihood of not wasting space due to non-aligned swap areas on a per-disk/per-FS basis), and since this still enables a lot of swap on v3, it generally isn't worth changing.

All that said, I can't really comment on how much swap you should configure for your 32 GB server. It really depends on what you plan to run on it that isn't file backed (as that doesn't require swap reservations). If your workload fits comfortably in 35 GB or so of virtual address space, you're fine.

Don Morris_1
Honored Contributor

Re: HPUX 11.3 swchunk & pseudo swap sanity check.

Sorry, that's a 98304 byte savings in the 4096 MB case. Must have had a stale stack entry in my calculator I mistakenly called up or something.
Honored Contributor

Re: HPUX 11.3 swchunk & pseudo swap sanity check.

Good summary - thanks Don!

Except, of course, when you say I misquoted:

From the man page:

Second Paragraph of the Description:

swchunk controls the size in physical disk blocks (which are defined
as 1 KB) for each chunk. The total bytes of swap space manageable by
the system is swchunk * 1 KB * 2,147,483,648 (the system maximum
number of swap chunks in the swap table). Note that the minimum (or
default) value of swchunk therefore allows 4,096 TB of swap space.

And again in another section:

When Should the Value of This Tunable Be Lowered?
If the amount of swap space mappable by the system is much larger than
the total amount of swap space which is attached (or going to be
attached) to the system, which is calculable by multiplying
2,147,483,648 * swchunk* 1 KB, then kernel memory usage can be reduced
by lowering swchunk to fit the actual swap space.

Yep, there it is, just like I said.

Of course, it looks like my math is bad as well, because 2G*swchunk*1K is NOT 2TB, well, only if swchunk is "1" - and it's not.
But I believe that when I wrote it, I was thinking that it was 2TB per chunk, and I left the "per chunk!" piece of my statement off. Which just made a mess of the math. :-)

Anyways, I got what you're telling me, and I sure appreciate your great advice. Looks to me like I'm just gonna leave it at the default.

Many thanks,

We are the people our parents warned us about --Jimmy Buffett
Honored Contributor

Re: HPUX 11.3 swchunk & pseudo swap sanity check.

Thanks Don!
We are the people our parents warned us about --Jimmy Buffett
chris huys_4
Honored Contributor

Re: HPUX 11.3 swchunk & pseudo swap sanity check.

Hi John,

I'm not sure how, in the event of a crash, a system with a 4 GB swap device will be able to save even a selective crash dump, let alone a full one (check crashconf -v).

So I would still always add a second device, separate from lvol2, that doubles as dump and swap device and can hold a full crash dump (35 GB + 1 GB for the kernel), and have it act as the primary dump device instead of vg00/lvol.

A separate dump/swap device also helps the system boot faster after the crash, because it automatically pushes the startup script "saving crashdump from swapdevice to /var/adm/crash" into "background" mode. That will not be the case if vg00/lvol2 remains the primary swap device: the boot would instead wait for the crash dump to be completely saved before going on to the next startup script.