Re: Hyper-Threading better performance ?

Mario Dhaenens · ‎07-06-2009

Hello,

Has anyone enabled hyper-threading on I64 servers with OpenVMS 8.3-1H1?
When does it give better performance?

P.S.
To change the hyper-threading state, use the EFI command cpuconfig threads off or cpuconfig threads on, followed in either case by the reset command.

/Toine

Hoff · ‎07-06-2009

If HP thought the benefit were generally applicable or if HP was comfortable with the performance trade-offs here in general, then HT would be enabled by default.

Barring unusual prescience here, the best available answer is unfortunately "it depends".

Per HP: "The effect that hyperthreads have on performance depends heavily on the application mix that is running. HP recommends that you start with hyperthreads turned off and experiment later. Two CPUs that share a core when hyperthreading is enabled are referred to as cothreads. The SHOW CPU/BRIEF and SHOW CPU/FULL commands now provide information about cothreads. For example:..."

As for related reading:

http://labs.hoffmanlabs.com/node/608
http://labs.hoffmanlabs.com/node/917

If the application code stalls extensively due to cache misses and has multiple processes in the scheduler compute queue, then HT might help. If the code is taking page faults or other operations or doesn't have a stuffed COM queue, then HT might not help or might potentially slow the code.

It'd be nice if there was a "qualify" tool here to see if HT is worth the bother, but as yet that's not available. But then I'm old enough to remember the "it depends" that was the VAX-11/782 ASMP configuration, too.

John Gillings · ‎07-06-2009

From the description, it appears that wins from hyper threading depend on L3 cache misses.

There's an SDA plugin to measure them:

SDA PRF profiling collects cache miss events.

PRF START PROFILE - start miss event profiling
[/BUFFER=n] - size of trace buffer
[/CPU=n] - list of CPU's to run profiling (defau
[/CACHE={L1,L2,L3}] - trace instruction and data cache load
[/INDEX=pid] - PID of process to run profiling (defa
[/TLB] - trace instruction and data TLB misses
[/THRESHOLD] - threshold value for overflow counter
[/MODE=(K,E,S,U)] - allows PC sampling for one or more sp

There's one more piece of data you need to determine which events are actual L3 misses. PRF INFO will tell you the latency in cycles for L1, L2 and L3. Any latency you see in PRF SHOW PROFILE that is greater than the L3 latency reported by PRF INFO is an L3 cache miss. A cache miss may or may not be an issue; it just indicates that you had to go to main memory for the data. It doesn't say that the CPU stalled because of the miss.
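To make the comparison above concrete, here is a minimal sketch in Python. It assumes you have already copied the latency values (in cycles) out of PRF SHOW PROFILE by hand and read the L3 latency from PRF INFO; it does not parse PRF output, and the function names and sample numbers are made up for illustration.

```python
def l3_misses(latencies, l3_latency):
    """Return the samples whose load latency exceeds the L3 latency.

    latencies  -- latency values in cycles, transcribed by hand (assumption)
    l3_latency -- the L3 latency in cycles, as read from PRF INFO
    """
    return [cycles for cycles in latencies if cycles > l3_latency]


def l3_miss_fraction(latencies, l3_latency):
    """Fraction of samples that had to go to main memory (0.0 if no samples)."""
    if not latencies:
        return 0.0
    return len(l3_misses(latencies, l3_latency)) / len(latencies)


if __name__ == "__main__":
    # Made-up numbers for illustration only, not real PRF output.
    l3_latency = 21
    samples = [1, 5, 14, 21, 210, 180]
    print(l3_misses(samples, l3_latency))
    print(l3_miss_fraction(samples, l3_latency))
```

A high fraction only says that many loads went to main memory. As noted above, it does not by itself show that the CPU stalled on them.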

A crucible of informative mistakes

Hein van den Heuvel · ‎07-06-2009

'It depends'. But you knew that.

As Hoff says, you need to have a significant COMputable thread queue, while it counts.

If there is nothing to schedule on the extra CPUs then it will not help.

Those computable processes must NOT be spinning or serializing (MPsync) but be ready to run independently, on independent chunks of memory (not waiting for the same cache lines).

The HP position is prudent/safe. Maybe excessively so. Basically, they don't want to have any (small) potential to hurt the folks that are not paying too much attention, those that are not reading the release notes, notably because the upside potential is modest.

The folks who do understand the material somewhat, those who know to ask the questions (like yourself) and find the knob, will likely venture out anyway. They are likely to set a reasonable expectation. They know not to expect 2x, or even 1.5x, but hope for 10% - 20% more throughput... and will verify that somehow.

I think there is more potential than HP lets us believe from the low key indicators of its availability. And admittedly the (mostly artificial?) HP OpenVMS Engineering benchmarks did not raise much hope (Burns's work, Greg Jordan and such).
Still, I have good hopes in general and would encourage experiments (week on, week off?).

1) OpenVMS Engineering spent a significant effort trying to make the scheduler do 'the right thing'. It would be an insult not to try to use it ! :-)

2) the HPUX Itanium benchmarks for TPC and SAP all have it enabled. They would do that only for one reason... because it helped them!

Good luck,
Hein van den Heuvel
HvdH Performance Consulting.

Mario Dhaenens · ‎07-07-2009

Hello,

Thanks a lot.

I will turn on hyper-threading for a week, then turn it off, and look at the performance results.

I have many Java-based applications which use many threads and can generate a high compute queue.

/Toine

Mario Dhaenens · ‎07-07-2009

Thanks,

I will keep you informed about the hyper-threading results.

/Toine

Hein van den Heuvel · ‎02-27-2010

>> I will keep you informed about the hyper-threading results.

/Toine (Mario?) did you ever reach a conclusion you can share here? Please?

Thanks!
Hein.

Toine_1 · ‎02-28-2010

Hein,

I have enabled hyper-threading on an rx6600.
On that server I run Oracle 10g, Rdb, SQL Services and many Java-based and Fortran-based applications. (In total 1100 processes.)

Since I enabled hyper-threading the average compute queue has gone down from 3 to 0.5.

So I think when you have many processes waiting for a CPU it is good to enable hyper-threading. Now the compute queue is almost 0.

I have had hyper-threading enabled and running for 5 months and I will not disable it. I will keep running the I64 servers with hyper-threading enabled.

P.S.
I also run a BL860c with hyper-threading enabled and this one is also running OK.

/Toine

Ruslan R. Laishev · ‎03-26-2010

Hello!

I have enabled hyperthreading on an rx7640 with OpenVMS 8.3/8.4FT, and some multithreaded application has started crashing regularly...

Hein van den Heuvel · ‎03-26-2010

Ruslan wrote...

"I have enabled hyperthreading on an rx7640 with OpenVMS 8.3/8.4FT, and some multithreaded application has started crashing regularly..."

Hi Ruslan,
Until proven otherwise, we'll have to assume that your application is broken in some subtle way.

Why not open your own topic and provide some more details.

- what does such crash look like? ACCVIO?

- You write 8.3/8.4FT.
I read that as "I tried under 8.3 with hyperthreads enabled and it crashed. Then I tried under 8.4FT and it also crashed in much the same way." Is that the correct reading?
If so, that should be expected since this is an application bug, so a new OpenVMS version will not fix that! :-) :-)

- Is the SYSGEN parameter MULTITHREAD set > 1? Did that change at the same time?

- What platform: how many CPUs/threads?

- Let's assume a 2 CPU box, 2 core each.
And let's assume CPU 0 has thread 0 and co-thread 2, CPU 1 has thread 1 and co-thread 3.
Now play with start/cpu and stop/cpu to build out a matrix.
I assume you tried
(NO HT) 0 + 1 --> ok
(HT ON) 0 + 1 + 2 + 3 --> application crash
How about
(HT ON) 0 + 1 --> ?
(HT ON) 0 + 2 --> ?
:

Hope this helps to create a better question.
Cheers,
Hein
