Operating System - HP-UX

Re: DBA Claim: Increasing SGA will result in more CPU Activity. True or False?

 
SOLVED
Alzhy
Honored Contributor

DBA Claim: Increasing SGA will result in more CPU Activity. True or False?

There's been great debate in our organisation about how large an SGA is needed / beneficial for a DB instance that is officially OLTP but from time to time behaves like DSS (with unpredictable, very large queries and reporting schemes).

There is also great debate about how large an SGA can be made and still be beneficial rather than useless, given that some of us pointed to "recipes" out there that simply say "the bigger the SGA, the better".

But one thing that caught my attention is a claim from a DBA who says that increasing the SGA will mean more CPU demands. Is this true?

I thought I/O from memory is less CPU intensive than I/O from disk?


Thanks
Hakuna Matata.
13 REPLIES
Oviwan
Honored Contributor

Re: DBA Claim: Increasing SGA will result in more CPU Activity. True or False?

Hey

If you increase the SGA, then the DB can read more data directly from memory, and this means more CPU demand.

You can take a snapshot of the DB and have a look at the buffer hit ratio. If the value is >90%, the DB is reading mostly from memory. It's best for the DB to have a lot of memory.
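
For example, here is one common way to read that ratio straight from the instance (a sketch against the standard v$sysstat view, using the classic hit-ratio formula):

-- buffer cache hit ratio from cumulative instance statistics
SELECT ROUND(100 * (1 - phy.value / (db.value + con.value)), 2) AS hit_pct
  FROM v$sysstat db, v$sysstat con, v$sysstat phy
 WHERE db.name = 'db block gets'
   AND con.name = 'consistent gets'
   AND phy.name = 'physical reads';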

We had a problem with disk I/O; after we increased the SGA (buffer cache), it calmed down.

Regards
A. Clay Stephenson
Acclaimed Contributor

Re: DBA Claim: Increasing SGA will result in more CPU Activity. True or False?

Yes, it's true, but that's generally a good thing. Once the data have been found, the CPU can go back to work: the faster data can be loaded, the less time the CPU spends waiting on I/O -- and the busier the CPU is.

Now there is some overhead in finding data in large caches (whether that is a database SGA or a buffer cache), but normally these algorithms are quite efficient and the overhead is well worth the cost, because buffer cache searches are still less expensive than going to disk for the data.

When frequent disk I/O is needed, the CPU spends much more time twiddling its thumbs, and so the CPU load goes down -- but so does throughput.

I really don't care how busy any single performance component is; I really care about overall throughput.

In any event, the correct approach is to gather metrics and to vary the size of the SGA until an optimum setting results. By far, the most important thing to avoid is growing the SGA to the point that paging occurs because the overhead from swapping will be at least 100x worse than any performance gains you might get from going to the disk less often.
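
As a starting point for those metrics, here is a quick sketch of how to see what the SGA currently holds (v$sga is a standard view; sizes converted to megabytes):

-- current SGA component sizes, in megabytes
SELECT name, ROUND(value / 1024 / 1024) AS mb
  FROM v$sga;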
If it ain't broke, I can fix that.
TwoProc
Honored Contributor
Solution

Re: DBA Claim: Increasing SGA will result in more CPU Activity. True or False?

Nelson,

I used to follow that advice - that a too-large SGA just costs CPU time and energy. However, *at times* I now disagree with the theory: given the BIG caveat that the running code is ALREADY properly tuned, adding more to the SGA ~may~ yield HUGE performance gains if the problem is that you're I/O bound.

The old issue was that it cost more CPU to maintain a large SGA, so it was best not to make it large. I first heard and believed this when typical CPUs for RISC systems ran at 50-75 MHz, and back then, at those speeds, it may even have been true. You've got to ask yourself, though, whether this rule is still relevant when the types, speeds and efficiencies of hardware have changed so much. I'm not saying the rule did or didn't apply back then or even now, but after this much time and change you must, at a minimum, question the theory again.

However, consider that a lean-memory database server must - at peak loads - maintain busy log buffers, a busy buffer cache for the data area, plus a busy shared pool. A "lean" memory system therefore adds ALL of the CPU load of forcing dirty buffers to disk and pulling other data back into cache because of memory limitations - plus the disk I/O itself. On the other hand, if the server had plenty of cache handy during peak loads and didn't have to rotate data to/from storage, kick off DB writers (or I/O slaves), and then once again query memory, check for locks and latches, and maintain dirty flags during peak periods ... consider how much LESS the CPU might have to do.

I believe an error in judgement is possible when one merely looks at the buffer hit ratio being over some number (90%, 95%, 97%, etc.). What one must do is examine the amount of I/O being done at those levels. If, for instance, at a 90% hit ratio you've got very little disk I/O for your nice big XP12000 to handle, and you've got no performance issues - then you're probably fine! The same could be said of even a 75% hit ratio, right? In other words, if the 75% hit ratio - for whatever reason - generates only 100 I/O requests per second, and your system is capable of 10,000 I/O requests per second, you probably don't need to fix anything in the buffer pool (db_cache_size). If you have a performance issue, it's probably not there.

The previous section merely states the issue for the converse case - it's truly the same. Suppose you've got a nice big fat 97% hit ratio - yet you're still doing 10,000 I/Os per second (a round number chosen just to make the math easy).

The BIG POINT to realize here is that the 3 missing percent (100 - 97) - at peak times - represents almost 100 PERCENT OF YOUR TOTAL I/O. This is why it is useless to rely totally on the hit ratio. Why? Because if you could suddenly attain a 98% ratio, you'd probably cut total system I/O by 1/3 (disregarding redo logs)! Your total number of I/Os per second should drop to around 6,700 (from 10,000). This means you'd have 1/3 less of an I/O problem, and while it might increase the load on the CPU a bit, it might be worth it.

Suppose at the peak of the day (or the multiple peaks of the day) you've got some spare CPU and you're near max on I/O - then shifting demand to the CPU side makes sense. I contend that it may even require LESS CPU at peak to manage a larger cache than a smaller one. Why? You wouldn't have the CPU load of switching and maintaining items in and out of the buffer cache. Wouldn't it be easier just to set flags on a busy area in cache as needed than to repeatedly shuffle it out to disk, drag it back, and mark it up to be moved and removed from cache - forcing it back and forth at peak times? It might be (it really depends on the type and amount of work). You wouldn't know unless you measure. I've certainly observed this very thing myself in some test cases: AT PEAK TIMES, a bigger SGA actually consumes LESS CPU than a smaller one! At off-peak times it would probably consume more CPU to maintain the large SGA. But what do you normally have to tune for? Answer: peak periods! Normally no one gets too beat up about increasing performance at non-peak periods, though it can happen - like a long-running, off-peak accounting cycle run that takes too long, for example.

Suppose you've determined that you want to increase your buffer hit ratio at peak. How do you accomplish this? Well, first off, eliminate unnecessary disk I/O via tuning and timing: tune everything that MUST run at the peak period, and move everything that doesn't have to run at peak to off-peak periods.

Secondly, *test and examine* whether your performance issues decrease as you increase the SGA size. Which area should get the increase?
It could be that your shared pool isn't big enough. Do you see a lot of code being reloaded at peak periods? Thankfully, if this is the case it's easy to fix, since this is usually a small area.
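
One quick way to spot those reloads (a sketch against the standard v$librarycache view; a RELOADS count that keeps climbing suggests the shared pool is undersized):

-- library cache reloads: code aged out of the shared pool and re-parsed
SELECT namespace, pins, reloads,
       ROUND(100 * reloads / NULLIF(pins, 0), 2) AS reload_pct
  FROM v$librarycache;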

What about your sorts on disk - is your sort_area_size large enough for individual processes? Remember: this can be changed per session for certain processes if needed; it does not have to be changed for the whole database instance and all live connections. That is appropriate when just a few programs push your sorts-on-disk numbers too high and the problem is not universal.
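
A sketch of both halves of that check (standard v$sysstat statistic names and a per-session override; the 10 MB figure is purely illustrative):

-- how many sorts spill to disk versus staying in memory?
SELECT name, value
  FROM v$sysstat
 WHERE name IN ('sorts (memory)', 'sorts (disk)');

-- raise the sort area for one heavy session only, not the whole instance
ALTER SESSION SET sort_area_size = 10485760;  -- 10 MB, illustrative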

Lastly (last because it takes so much more to get more) - can you increase the size of the buffer pool (db_cache_size)? If you can push the hit ratio up another percent or more, it may be possible to seriously increase your total system throughput at peak demand times by dramatically reducing total I/O.
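
Rather than guessing, Oracle 9i and later can estimate the effect for you (a sketch against the standard v$db_cache_advice view, which projects physical reads for a range of candidate cache sizes):

-- estimated physical reads at various buffer cache sizes
SELECT size_for_estimate AS cache_mb,
       estd_physical_read_factor,
       estd_physical_reads
  FROM v$db_cache_advice
 WHERE name = 'DEFAULT'
   AND advice_status = 'ON';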

So, consider your type and total I/Os at peak against your hit ratios TOGETHER before deciding whether an increase in size would benefit your database service - not merely your hit ratio or total I/Os alone. Even a high hit ratio (like 98%) could possibly benefit from more buffer cache (db_cache_size) when the total I/O load is high for your storage system (or even for a storage sub-component, like an interface card). If you could get your hit ratio to 99%, you could conceivably cut most of your total disk I/O at peak in half - you'd now have half the I/O throughput problem. Now consider what's cheaper: a) more memory for your server, which could cut total I/O at peak in half, or b) making storage server throughput TWICE as fast? In most cases, the answer is a) more memory. As far as costs go, keep in mind that HP charges no additional maintenance fees on contract support for RAM.

Going sideways off topic a bit - even though it's not covered by your question, I'll add two more things that are often overlooked and can trip up a busy server that is otherwise configured properly for the amount of I/O your storage system can handle. There are lots of others, of course; these are just two I find are often missed - one because of guidelines, and the other because of defaults.

1) Redo log buffers too small. If this is the case, your supposedly efficient database could be slogging along just because this one area is undersized. Increasing it is a very minor memory issue, and improvements are cheap and easy to notice if it is causing a problem. The prototype comments in an Oracle init.ora file say:
# The log buffer can be a megabyte or more but given the commit frequency
# in an OLTP environment, little benefit is achieved above 5M. It must be
# a multiple of redo block size, normally 512 bytes.
For some databases, however, I'd have to disagree - I've seen huge gains on some systems well above 5M. Interpreting this "guideline" as a hard-and-fast rule just doesn't cut it, as your system may be different. Keep increasing this number in your testing until your throughput stops improving. Like many things, it follows the law of diminishing returns - quit increasing when you are happy with the performance gain for the incremental increase in memory use.
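
A sketch of the usual symptom check (standard v$sysstat statistics; values that keep climbing under load suggest the log buffer is too small):

-- how often sessions had to wait for space in the redo log buffer
SELECT name, value
  FROM v$sysstat
 WHERE name IN ('redo buffer allocation retries', 'redo log space requests');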

2) On your busiest tables at peak periods, make sure the tables aren't at the default INITRANS of 2. Increasing this on the select few busiest tables ~and their indexes~ can drastically increase total system throughput. Don't forget the indexes! What works for me (and it would probably vary for you) is "12" for my busiest tables and indexes, and "6" for the medium-busy stuff. I find that having a bit too much here is much less costly than having too little. This does not mean you should increase this number on all tables and indexes everywhere; test for yourself to determine which numbers work best on which tables and indexes at your site.
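
A sketch of the change itself (the table and index names here are hypothetical; note that raising INITRANS on an existing segment only affects newly formatted blocks, so indexes are usually rebuilt):

-- raise INITRANS on a hot table and its index (illustrative names)
ALTER TABLE orders INITRANS 12;
ALTER INDEX orders_pk REBUILD INITRANS 12;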


Regarding your original question:

The problem here can be summarized by acknowledging that some DBAs mistake these excellent guidelines for standing up a server to be hard-and-fast rules that can never be questioned. Testing and analysis of results will provide the correct configuration answers - not blindly following basic guidelines that have, over time, become confused with absolute rulesets for database server configuration.

Lastly, I'd love to say I've never made the mistakes mentioned here myself, but truthfully - I can't. :-) So take it easy on the DBAs; it's a tough job, and these types of mistakes (which on another dimension resemble organizational "group think") are usually made while pursuing the very best interests of the company, which is what you want from employees.

John
We are the people our parents warned us about --Jimmy Buffett
Hein van den Heuvel
Honored Contributor

Re: DBA Claim: Increasing SGA will result in more CPU Activity. True or False?

>> DBA Claim: Increasing SGA will result in more CPU Activity. True or False?

True, with the caveat below. The single reason to increase the SGA is to spend less time waiting for (disk) I/O, and thus get to burn CPU more aggressively doing useful work.

Please realize that for a given query the TOTAL CPU time may actually decrease, as less CPU time is wasted doing the I/O and spinning on latches around buffer access while data is in transit. The CPU usage will just be more dense, so the system will look busier as you get more work done. That assumes there is a more or less infinite amount of work to be done.
If your application only requires a portion of the available CPU, then you will not see more CPU usage (in vmstat, sam, sar, top, OpenView), but 'only' better response time.

>> There's been great debate in our organisation about how large an SGA is needed / beneficial for a DB instance that is officially OLTP but from time to time behaves like DSS (with unpredictable, very large queries and reporting schemes).

I'm with John in a prior reply: the SGA needs to be large enough to drive the I/O down to rates that are readily manageable by your I/O system.
HOTSOS has some nice references on this, but I cannot get to their site just now. Check: http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=554275

In statspack you want to look at the section "Buffer Pool Advisory".
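
For that section to be populated, the advisory needs to be on (a sketch; db_cache_advice is a standard parameter in 9i and later, and the same data can be read directly from v$db_cache_advice, as in the earlier reply):

-- enable buffer cache sizing advice (costs a little CPU and memory)
ALTER SYSTEM SET db_cache_advice = ON;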

>> But one thing that caught my attention is a claim from a DBA who says that increasing the SGA will mean more CPU demands. Is this true?

Yes, for intense (DSS) query activity, with the above caveat.

My own rule of thumb for OLTP-ish work is that once you get to a 50% or better write ratio, adding more SGA is in diminishing-returns territory, as those write I/Os will stay.
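
A sketch of how to read that write ratio from the instance (standard v$sysstat statistic names; the counters are cumulative since startup, so take two samples to get a peak-period rate):

-- rough read/write mix of physical I/O
SELECT rd.value AS physical_reads,
       wr.value AS physical_writes,
       ROUND(100 * wr.value / NULLIF(rd.value + wr.value, 0), 1) AS write_pct
  FROM v$sysstat rd, v$sysstat wr
 WHERE rd.name = 'physical reads'
   AND wr.name = 'physical writes';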

> I thought I/O from memory is less CPU intensive than I/O from disk?

The whole point of having the SGA is NOT to do the I/O at all, but to have the buffer there ready for use.

Of course, tuning being a 'whack-a-mole' game, the next - or better yet, FIRST - step is to reduce the need to look at those buffers at all by optimizing the queries.

Hope this helps,

Hein van den Heuvel
HvdH Performance Consulting
Yogeeraj_1
Honored Contributor

Re: DBA Claim: Increasing SGA will result in more CPU Activity. True or False?

hi,

The major considerations when sizing the SGA are:

a) How much do you want to assign to your buffer cache for maximum performance

b) How big is your shared/Java pool? (a function of how much SQL/PLSQL/Java you run in your database; no magical number for all to use)

c) Do you run shared server (then the large pool is used and will be large -- that is part of the SGA) or dedicated server? In the latter case you need to leave OS memory for dynamic allocations.

d) What else is going on on the machine?

hence, this claim is quite debatable.

If you are running Oracle 10g, you would simply set sga_target and let the database itself figure out its ideal size empirically over time.
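
a sketch of what that looks like (the sizes are illustrative only, not recommendations; v$sga_dynamic_components is a standard 10g view):

-- let 10g auto-tune SGA components within an overall target
ALTER SYSTEM SET sga_max_size = 24G SCOPE=SPFILE;  -- hard ceiling, needs a restart
ALTER SYSTEM SET sga_target = 20G SCOPE=BOTH;

-- watch how the instance redistributes memory over time
SELECT component, ROUND(current_size / 1024 / 1024) AS mb
  FROM v$sga_dynamic_components;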

hope this helps too!

kind regards
yogeeraj

No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)
Bill Hassell
Honored Contributor

Re: DBA Claim: Increasing SGA will result in more CPU Activity. True or False?

More SGA = more CPU...

In general, that is a very good thing. The reason is that disks are incredibly slow compared to memory, even with heavy buffering both in the kernel and in a modern array. Maxing out the CPU cycles is a good thing simply because the performance will be higher -- to a point. I have seen SGAs in the 8 to 12 GB region with very high performance, but the content and layout of the SGA are also important.

A similar situation exists for the HP-UX buffer cache. Starting with the introduction of the dynamic buffer cache at 10.xx, an ultra-large (multi-GB) cache was generally undesirable, since the kernel time needed to search for items would negate the gains made by eliminating I/O -- thus the 500-1500 MB recommendation for the DBC. But with IA64 and the 11.23 changes to the DBC code, the DBC shows significant improvement up into the 5 to 10 GB range. And 11.31 promises even more improvements for handling a large DBC.

The same goes for Oracle. A good hashing algorithm can make a large SGA excellent at handling cached records, but allocating massive space to a temp sort area may not improve anything. And of course, major versions may vary significantly in how they handle a large SGA.

So the answer is: It depends. ;-)


Bill Hassell, sysadmin
Eric Antunes
Honored Contributor

Re: DBA Claim: Increasing SGA will result in more CPU Activity. True or False?

Hi Nelson,

It is true, but what's the problem with that?

There is no holy formula for this, but it is always a good idea to size the SGA in line with the following:

select sum(sharable_mem) from v$db_object_cache;

+

select sum(sharable_mem) from v$sqlarea where executions > 5;

+

select sum(250 * users_opening) from v$sqlarea;

+

20-30% overhead

=

610 MB (using my production database as an example)
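
For convenience, the three queries can be rolled into a single statement (a sketch; the 1.25 factor is simply the midpoint of the 20-30% overhead above):

-- sum the three components and add ~25% overhead, result in MB
SELECT ROUND(1.25 * SUM(bytes) / 1024 / 1024) AS suggested_mb
  FROM (SELECT SUM(sharable_mem) AS bytes FROM v$db_object_cache
        UNION ALL
        SELECT SUM(sharable_mem) FROM v$sqlarea WHERE executions > 5
        UNION ALL
        SELECT SUM(250 * users_opening) FROM v$sqlarea);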

Best Regards,

Eric Antunes
Each and every day is a good day to learn.
Alzhy
Honored Contributor

Re: DBA Claim: Increasing SGA will result in more CPU Activity. True or False?

Thanks for your inputs so far, but here's our actual dilemma: these servers each host a database averaging 2 TB in size, with 128 GB of memory per server and growing (Java-oriented/WebLogic middle tier).

Current SGA is only at 24GB.

During the day the DB mostly behaves in an OLTP-ish fashion, whilst from time to time (unpredictably) we'll have a DSS-like query or job that runs.

At night it's purely batch, and memory demands on the system are only about a third of that.

So we really have a "hybrid" DB environment here, and I am advocating the biggest cache we can give it, so that the online DSS jobs' hunger for cache can be satisfied.

Hakuna Matata.
Steven E. Protter
Exalted Contributor

Re: DBA Claim: Increasing SGA will result in more CPU Activity. True or False?

Shalom Nelson,

To the original question: even if what the DBAs say is true, why does it matter? DBAs need to be concerned with response time and the performance of the database. If they use a little more CPU but increasing the SGA improves performance, then what's the big deal?

As Bill said, it's a good thing. If the cache hit ratio is already high, then increasing the SGA may not provide a huge performance increase.

If you have the memory, why not try your idea? In the end, what is needed to prove or disprove your theory is a test, not our input.

Also: Bill Hassell related to me that in situations with HP-UX 11.23 and (I assume the next part) a database on a filesystem, increasing the buffer cache may improve performance. (The buffer cache can't help the performance of a DB on raw disk.)

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com