Operating System - Tru64 Unix

Shared memory allocation

 
Mark Poeschl_2
Honored Contributor

Shared memory allocation

We are migrating our production environment to two GS1280s each with 8 CPUs and 2 GB of memory attached to each CPU - uniformly spread out. As I recall, it was a pretty hard HP recommendation with large databases running on the Galaxy class machines to pre-allocate and spread your shared memory segments amongst the RADs using the 'rad_gh_region[n]' kernel parameters. For the GS1280 it looks like each CPU is its own RAD, vs. each QBB as in the Galaxies. At least that's what the output of 'vmstat -R' shows me. Given the reduced latency in the GS1280's interconnect scheme, does it still make sense to use 'rad_gh_region[n]' parameters to pre-allocate and spread out shared memory segments? Our application is a single large database (not Oracle) with 1500+ simultaneous users during peak times.
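
For concreteness, this is the sort of stanza I mean. The sizes here are invented purely for illustration, and the attribute spelling should be checked against sys_attrs_vm(5) before use:

    vm:
        rad_gh_regions[0] = 1024    # MB of granularity-hint memory pre-allocated on RAD 0
        rad_gh_regions[1] = 1024    # and so on, one entry per RAD, to spread the segments
        rad_gh_regions[2] = 1024
        rad_gh_regions[3] = 1024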

Any thoughts or better yet hands-on experience with this will be much appreciated!
8 REPLIES
Ivan Ferreira
Honored Contributor

Re: Shared memory allocation

I think that big pages are more efficient than granularity hints. We use big pages (the systems are not too big: GS160, 8 CPUs, 16 GB RAM). For a detailed description, check the System Configuration and Tuning manual.
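
If I remember the attribute name correctly, turning big pages on is just one vm entry in /etc/sysconfigtab plus a reboot. This is a minimal sketch; check the tuning manual for the per-object-type threshold attributes before relying on it:

    vm:
        vm_bigpg_enabled = 1    # enable variable (big) page sizes at next boot
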
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Hein van den Heuvel
Honored Contributor

Re: Shared memory allocation

I agree with Ivan. However, there have been some support concerns with big pages, so you may want to do a solid test drive, and be sure to use the latest patch kits.

The RAD-ness of GH memory allocation was IMHO secondary to the big-page 4MB translation buffer page size. I also always thought of GH as a clear mechanism to set aside memory for a specific (buffer cache) purpose, but it was inflexible at that.

With big pages you get the 4MB granularity with flexibility, and it applies to data as well as code, hopefully reducing DTB misses as well as ITB misses.

If you have a single big database, with all processes touching all data, there is little you can do with NUMA locality anyway, and as indicated, the relative latencies on the GS1280 really do not make it critical to think about.

The only time I would consider exploiting NUMA tools like runon -R for the GS1280 is when the application has multiple distinct big process groups, maybe a production database as well as a development and training database. For such circumstances I would consider running one group on a selected set of adjacent RADs, and the other application(s) on the remaining RADs, reducing the average memory latency for both.

For example, for a 16 p GS1280 the RADs might be labeled as:

00 01 02 03
04 05 06 07
08 09 10 11
12 13 14 15

The latencies from cell 0 could be roughly:

082 141 177 149
133 169 216 176
176 215 250 216
150 184 228 192

So if you pick 00,01,02,03,04,05,07,08,12,13,15 for one set of apps, and give 06,09,10,11,14 to another set of apps, then the large set will never need a 5-hop path, and the small set only 2 hops at the most.
[Ok, I have not calculated all the paths, but something along those lines, no?]
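
A rough sketch of how that split could be applied at startup. I am spelling the RAD flag as runon -r and assuming it accepts a RAD list; double-check both against man runon, and the startup script names are of course placeholders:

    # big set of adjacent RADs for the production database
    runon -r 0,1,2,3,4,5,7,8,12,13,15 /sbin/init.d/startdb_prod

    # leftover RADs for the development/training instances
    runon -r 6,9,10,11,14 /sbin/init.d/startdb_dev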

fwiw,
Hein.

Hein van den Heuvel
Honored Contributor

Re: Shared memory allocation


Ok, hop-counts are of course 0 based. What I called 5 hops was just 4, but still.

I happened to stumble onto the 2002 NUMA paper.
It is good to re-read that:
It uses the same numbering, but the picture is much nicer than the cryptic matrix I provided.

http://h30097.www3.hp.com/docs/base_doc/DOCUMENTATION/V51_HTML/NUMA/TITLE.HTM

btw.. they also made a counting error/oversight:

" Three hops away if the resources are in RADs 5, 10, and 13"

... RAD 14 is also 3 hops away, and RAD 12 is 4 hops.


In the kernel there is a hop-count table, I believe from every RAD to every other RAD, to help the scheduler algorithm (leash).

hth,
Hein.



Mark Poeschl_2
Honored Contributor

Re: Shared memory allocation

Thanks all for your input. Last time I checked with our database vendor they were pretty hard-over against use of big pages, but I'm guessing that was just because it was new to them at the time. I'll try that route again.

If the vendor won't go along with big pages, would it be reasonable to fall back to the (admittedly less flexible) rad_gh_regions approach?
Hein van den Heuvel
Honored Contributor

Re: Shared memory allocation

[since no one else has chimed in]

Yes, GH is desirable if big pages cannot be used.

There are several other topics in this forum with discussions in this area:
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=846477

- Be sure to set a high minimum threshold, to avoid unintended shmget calls allocating from it.

- Be sure to make GH allocation failures non-fatal, in case the need changes (in a few years) when you are not around or able to adjust the current size (see the sketch after this list).

- Check the usage with cda/crash, or my tool as described in this topic:
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=644238
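
Pulling the first two points together, the stanza I have in mind looks roughly like this. I am reciting gh_min_seg_size and gh_fail_if_no_mem from memory, so confirm the names and units against sys_attrs_vm(5); the sizes are placeholders:

    vm:
        rad_gh_regions[0] = 1024      # placeholder: MB of GH memory per RAD
        rad_gh_regions[1] = 1024
        gh_min_seg_size = 8388608     # only shmget calls at least this large draw from GH memory
        gh_fail_if_no_mem = 0         # fall back to regular memory rather than failing shmget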

hth,
Hein.
Adam Garsha
Valued Contributor

Re: Shared memory allocation

When I worked at XXXX and the GS1280s were still called "Marvels", big pages, granularity hints, and attempts to use RAD GH regions all proved unfruitful during benchmarking. In fact, they caused system instability.

That said, I submit that little investment in AlphaServer benchmarking has likely occurred at XXXX for some time, so there may be something here that is useful but that no other XXXX customer uses.

I believe that, by default, shared memory will be distributed across the RADs, but I can't remember for sure.

As you ramp up during testing and go-live, keep an eye on your QBB memory usage using collect -s m. Sometimes you'll see one QBB (one that is next to an I/O hose or something) quickly consumed (~0 free) while others are not. If you do, approach HP for solutions.
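
Something along these lines would capture that during the ramp-up (collect's flags as I remember them, and the log path is just an example; verify with man collect):

    # sample the memory subsystem every 60 seconds and keep the output for review
    collect -s m -i 60 >> /var/adm/collect_mem.out
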
Mark Poeschl_2
Honored Contributor

Re: Shared memory allocation

Thanks to all for your comments.

Adam, I think you might be getting the GS1280 (Marvel) machines mixed up with the prior class of machines. The GS1280 does away with the concept of the QBB entirely. And good luck in your post-XXXX endeavors ;-)
Adam Garsha
Valued Contributor

Re: Shared memory allocation

What I said was for the Marvels, but I shouldn't have used the term QBB. Replace QBB with RAD.