Oracle 10g and turning off/on hyperthreading while db is up

Hein van den Heuvel · ‎10-20-2010

Alzhy ->>Let us know how HT works out and if it is a GOOD thing for Oracle Databases (considering most oracle processes are not highly threaded processes).

Yes, concrete application based feedback is much welcomed, even if no two applications are alike.

Please note that the thread process comment IMHO is misleading/confusing. HT co-thread processors are full, independent, freely schedule-able entities.
They are available for any runnable thread, whether as only thread from fresh process or for a multi-threaded process.
You do NOT need multi-threaded applications to benefit from HT.
You do need lots of concurrently running threads any which way.
You also need a good bit of main memory stall s (= cache misses) to create 'micro-idle-time' to flip the threads.

TwoProc>> change the HT on and off every 60 seconds during a heavy load (40% cpu load)

Hmmm, I don't think that is a well constructed test for performance evaluation.
HT works best to create more total CPU throughput under high load... well over 50%, more like 80% (Such as tpc benchmarks :-)
Let's face it. If there is less than 50% CPU load then the OS can do best to not schedule anything on cpu where the co-cpu is already scheduled. In this case that would mean to frequently want to run many more then 24 cpus, 40+ or 60+

Turn up the volume to 11 (out of 10 :-).
For example, let's say that your suggested 250 user Mercury test creates 70% load on 24 non-HT CPU's. I expect those to use 40% or more CPU when switch to 48-HT... the price of more concurrency.
But now go to 400 users, approaching 100% CPU on 24-non. Typically response times will tank. Switch on HT and you'll find the system using 60 - 70% cpu... out of 48 with deteriorated but manageable response times.
And you may find you can add an other 50 users or so before approaching 90% cpu (out of 48) with acceptable response times.

The total throughput gain... in what would be overload situation for 24 CPU's is not unlikely to be 10% - 20%.
But it will not help didly-squat when using just 24 cpu's and may even hurt some, notably when there is lots of latch contention. (Cpu's actively spinning to wait for a memory flag to clear).

Hope this helps some
Hein van den Heuvel
HvdH Performance Consulting

Twoproc>> I could easily believe that it's possible that Oracle knows which procs are real and which ones are HT, and I could just as well believe that it has no idea. And, I think that the truth is that it's probably the latter.

The latter. Each thread in an HT enviroment is as real or as unreal as the next.
They can not be distinguished operationally other then by number.

Alzhy · ‎10-21-2010

Alzhy ->>Let us know how HT works out and if it is a GOOD thing for Oracle Databases (considering most oracle processes are not highly threaded processes).

Please note that the thread process comment IMHO is misleading/confusing. HT co-thread processors are full, independent, freely schedule-able entities.

Herr Hein, yours truly did not meant to "mislead" (as you allege) that highly threaded processes are a fit for HT processors. What I meant was since Oracle processes are not highly threaded - each Oracle fore/background process will have more access to "full independnet freely schedulable" processing entities with HT Threadin "ON". Now whether a "thread" processing entity has more "oomph" for Oracle (and some) processes pn REAL cores or THREAD processing entities is what I really am after as based on our experience there are certain processes that crunch logic much faster on REAL CPU cores.

But I do agree - there will be processing scenarios wherein the workload will have more processing capacity on HT Processors -- HT being a scheme of duping the OS into thinking there are more real CPUs that there are physical cores.

I have friends who work both for INTEL (HT-leaning) and AMD (historically insistent on real CORES). It is usually fun to have them both over a couple of b33rs. There were so many b33rs that flowed I could no longer remember what HT in Intel's approach or AMDs core persistence is all about.

But in the end -- it really all "depends Migz.

Hakuna Matata.

TwoProc · ‎10-21-2010

Hein, it's not a performance test to switch every 60 seconds (in fact the load is just there to generate some form of Oracle work, most any work would do). It's a durability test. I want to know if switching HT on and off is safe. I figure if I flip it off an on an insane amount for an extended period of time, and I've got no corruption issues, then its good to go.

Re: 250 users, not a magical number - it's all I've got for load testing licenses. It's the max I can push with the tool without spending lots more $$$, and I believe I can do a good job of load testing with that level of sim users. I can buy temporary user days, but I don't feel I need it for the goals of this particular test scenario.

As for getting the load higher than 40-50%, we do that by reducing and/or removing the timing waits in the scripts. No problem with that. The reason is that we simply cannot come up with a good Mercury test that simulates a room fool of folks doing their jobs. Even when we calculate the waits from sample users, from real user interactions, the load we then simulate WAY overloads the servers more than Mercury load runner does. The code being run is correct, the timing, even though averaged, is almost meaningless, either from empirical evidence, or from samples. When used, these loads are much higher than what actual users generate. I believe part of the blame is the Hawthorne effect, and the other part is that screen event logs and averages still give averages that are practically meaningless because the std deviation of number of items handled, the number of xxxx and xxxx and xxxx would have to be calculated, evaluated for lead/lag influences, stripped of coincident data with Durbin Watson tests, degrees of freedom for the above exploded... etc.

And, what I'd end up with is something like the average number of children per family in the US is 1.5. 1.5 is almost useless to simulate, you really need to simulate 0,1, and children, and use gaming to see if your distributions match real #'s.

We don't have that much time or $$$ resources to do that, when I can see the "real scenario" running right now from my monitors, and I can just tweek timing wait percentages until I feel I'm close enough to make value judgements - and I don't feel it's that difficult to do if looking at this process flow in its various mutations of its basic form for 12 years to judge its ability to satisfice a problem solving challenge... in other words, experience helps.

Anyways, thank you all for your valuable and esteemed input. Very gracious of all of you, thanks very much!