Operating System - OpenVMS

Re: Itanium Dual core performance (-v- Alpha SMP)

 
The Brit
Honored Contributor

Itanium Dual core performance (-v- Alpha SMP)

Hi folks,
Does anyone know where I might be able to get information on performance comparisons between Alpha 1GHz processors (4-way ES45s) and Itanium dual-core processors, i.e. those used in BL860c and BL870c blade servers?

I don't run any graphical apps, just the old basic "command line" legacy stuff.

I am trying to get a feel for how many "cores" I will need to replace my current Alphas.

My other question is: how would an Itanium 2 processor (4 core) SMP configuration (OVMS 8.3-1H1) compare to an Alpha 4 processor SMP configuration (OVMS 7.3-2)? (I know this is very sensitive to the type of apps I am running; however, I would appreciate even general comments.)

Any available links to performance comparisons would be appreciated.

thanks.

Dave.
13 REPLIES 13
Hein van den Heuvel
Honored Contributor

Re: Itanium Dual core performance (-v- Alpha SMP)

Equal memory?

As long as the application does not do silly unaligned references, the Integrity should win out by a lot.

The best (OpenVMS) resource for this is probably a recent copy of Greg Jordan's slide deck (Bootcamp, European tour, ...).

I expect you to see a lot more benefit from user code than system code.
What is MONI MODE showing over a typical window? Take half of the user time away and you might be in the ballpark.
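
The rule of thumb above can be written out as a small Python sketch. The mode names and sample percentages here are made up for illustration, not measurements; the only step taken from the suggestion is halving the user-mode time and leaving everything else alone.

```python
def estimate_integrity_busy(mode_percent):
    """Apply the 'take half of the user time away' rule of thumb.

    mode_percent: dict mapping a processor mode name to percent busy,
    for example as read off a MONITOR MODES display. Only the "user"
    entry is halved; every other mode is carried over unchanged.
    """
    estimate = dict(mode_percent)
    estimate["user"] = mode_percent.get("user", 0.0) / 2.0
    return estimate


# Hypothetical sample (percent of one CPU per mode), not real data.
alpha_sample = {"interrupt": 5.0, "kernel": 15.0, "exec": 10.0, "user": 40.0}
integrity_guess = estimate_integrity_busy(alpha_sample)

print("Alpha busy:     %.1f%%" % sum(alpha_sample.values()))
print("Integrity est.: %.1f%%" % sum(integrity_guess.values()))
```

With these invented numbers the busy time drops from 70% to 50%. A workload dominated by kernel or interrupt time would see much less change under this estimate, which is why the MONITOR MODES split matters.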

Earlier ITRC discussions in order of perceived relevance.

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1119052

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1014899

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1134802

and

http://www.techiegroups.com/archive/index.php/t-120579.html


Hope this helps some,
Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting
Hoff
Honored Contributor

Re: Itanium Dual core performance (-v- Alpha SMP)

In terms of the completely unscientific "feel", single-user typical OpenVMS programming using DECterms and such (similar DECwindows and compile, link, run, debug and related activities on both boxes), a dual 900 MHz McKinley-class Integrity rx2600 box (which is older and slower than any of the officially-supported rx2600 Itanium processor configurations) was at least on par with an AlphaStation XP1000 EV67 667 MHz box, and quite often faster. The 900 MHz McKinley processor is a very old and very slow Itanium processor.

Multicore is a cost-reduced and higher-density form of SMP.

It's arguably entirely the apps. Use the HP Test Drive systems or a loaner box, and get a feel for what you might need for performance.

The unaligned references mentioned earlier are a key consideration on Itanium. They're a performance hit as compared with aligned references on Alpha, but they're much slower on Itanium. In years past, it was an oft-cited 1:100 versus 1:1000 performance difference for unrecognized unaligned references on Alpha and Itanium, IIRC.
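
One common source of unaligned references is a record whose fields are packed back to back. As a rough way to spot candidates, here is a minimal Python sketch that flags fields whose offset is not a multiple of their size. The field names are hypothetical, and it assumes "natural alignment" simply means "aligned to the field's own size"; it says nothing about what a particular compiler actually does.

```python
def misaligned_fields(fields, packed):
    """Return names of fields whose offset is not a multiple of their size.

    fields: list of (name, size_in_bytes) in declaration order.
    packed: if True, fields are laid out back to back; if False, each
            field is first padded up to its natural (size) alignment.
    """
    offset = 0
    bad = []
    for name, size in fields:
        if not packed:
            offset += (-offset) % size  # pad up to natural alignment
        if offset % size != 0:
            bad.append(name)
        offset += size
    return bad


# Hypothetical record: a 1-byte flag, a 4-byte count, an 8-byte total.
record = [("flag", 1), ("count", 4), ("total", 8)]

print(misaligned_fields(record, packed=True))   # ['count', 'total']
print(misaligned_fields(record, packed=False))  # []
```

In the packed layout the 4-byte field lands at offset 1 and the 8-byte field at offset 5, so every access to them is an unaligned reference; with padding both sit on natural boundaries.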

Stephen Hoffman
HoffmanLabs LLC
The Brit
Honored Contributor

Re: Itanium Dual core performance (-v- Alpha SMP)

OK,
Thanks so far. The next question is: on a multi-core Itanium machine, does the VMS scheduler assign processes to a "core" or to a "processor"? On a 4-way Alpha there are 4 CPUs, and so 4 processes can occupy the CPUs at any given time.

Now, on a 2-processor, dual-core (i.e. 4 cores) machine, is it able to handle 4 processes per quantum, or only 2?

I realize that if it is 2x as fast, then the 2 processor machine can handle 2x as many processes per unit time as the 4 processor machine; however, I am trying to get a feel for what to expect on a system with a large number of interactive users. On our system, we have large numbers of interactive processes being handled, and this keeps all of the CPUs busy (but not overwhelmingly so, ~50% during the 'busy' period).

The real question is: do I need a replacement system with the same (or greater) number of processors, or just an equal (or greater) number of cores?

Finally,
is there anyone out there who has upgraded from an Alpha SMP machine to an Itanium multi-core machine, and is prepared to share their experiences?

Thanks

Dave.
Volker Halle
Honored Contributor
Solution

Re: Itanium Dual core performance (-v- Alpha SMP)

Dave,

OpenVMS I64 schedules processes per core. The cores are seen as individual 'CPUs' at the OpenVMS level ($ SHOW CPU).

If you turn on HyperThreading, this will double your 'CPUs'. OpenVMS I64 detects hyperthreading and the scheduler tries to take this into account when scheduling.

With hyperthreading turned on, 'CPUs' that share the same core are called Cothread CPUs.
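
To put numbers on that, here is a small Python sketch of the count of 'CPUs' one would expect to see, based only on the description above (one 'CPU' per core, doubled with HyperThreading on). The function name and parameters are made up for illustration.

```python
def visible_cpus(sockets, cores_per_socket, hyperthreading=False):
    """Number of 'CPUs' expected at the OpenVMS level.

    Assumes one 'CPU' per core, and two per core when HyperThreading
    is turned on, as described in the post above.
    """
    threads_per_core = 2 if hyperthreading else 1
    return sockets * cores_per_socket * threads_per_core


print(visible_cpus(4, 1))                       # 4-way Alpha ES45: 4
print(visible_cpus(2, 2))                       # 2 dual-core Itaniums: 4
print(visible_cpus(2, 2, hyperthreading=True))  # same box, HT on: 8
```

So for sizing purposes the number to compare against the 4 Alpha CPUs is the core count, not the socket count.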

Volker.
Tom O'Toole
Respected Contributor

Re: Itanium Dual core performance (-v- Alpha SMP)


Has anybody gotten a throughput improvement using hyperthreading?
Can you imagine if we used PCs to manage our enterprise systems? ... oops.
Robert Brooks_1
Honored Contributor

Re: Itanium Dual core performance (-v- Alpha SMP)

Has anybody gotten a throughput improvement using hyperthreading?

--

Yes, but it is so application-dependent that one must perform their own local tests to see if it will help or hinder performance. It's pretty difficult to give broad recommendations.

-- Rob
The Brit
Honored Contributor

Re: Itanium Dual core performance (-v- Alpha SMP)

Thanks Volker, for your input. That more or less answered our question, and your reply was independently confirmed by my boss, who was discussing it with someone at HP.

Thanks again folks.

Dave.
Ian Miller.
Honored Contributor

Re: Itanium Dual core performance (-v- Alpha SMP)

Guy from BRUDEN-OSSG did some work looking at performance improvements with hyperthreading, and it's very application dependent.

____________________
Purely Personal Opinion
Hoff
Honored Contributor

Re: Itanium Dual core performance (-v- Alpha SMP)

"Cores" are "processors". Nothing more, nothing less. The scheduler knows this.

Threads are implemented differently between x86 and Itanium (and differently within various generations of x86), and current Itanium hyperthreads should be thought of as a mechanism for a fast context switch.

Fast switching and the core contention that can arise may or may not assist some applications, and thus threading is selectively enabled. Threading is disabled by default, at least when last I checked.

When threading is enabled, the scheduler knows about how threads share cores, and knows to schedule a second thread on an already-active core only when it really needs to.
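
The placement idea in that last paragraph can be sketched in a few lines of Python. This is only an illustration of "prefer a fully idle core, share a core only when necessary"; it is not the OpenVMS scheduler, and the CPU numbering and the `cothread_of` map are assumptions made up for the example.

```python
def pick_cpu(busy, cothread_of):
    """Pick an idle logical CPU, preferring one whose whole core is idle.

    busy:        dict of logical CPU id -> True if it is running a process
    cothread_of: dict of logical CPU id -> id of the CPU sharing its core
    Returns a CPU id, or None if every logical CPU is busy.
    """
    idle = [cpu for cpu in sorted(busy) if not busy[cpu]]
    for cpu in idle:
        if not busy[cothread_of[cpu]]:
            return cpu                    # whole core is idle: best choice
    return idle[0] if idle else None      # otherwise share an active core


# Two cores with two threads each: CPUs 0/1 share a core, 2/3 share a core.
cothread_of = {0: 1, 1: 0, 2: 3, 3: 2}
busy = {0: True, 1: False, 2: False, 3: False}

print(pick_cpu(busy, cothread_of))  # 2 (the untouched core)
busy[2] = True
print(pick_cpu(busy, cothread_of))  # 1 (both cores active, so share one)
```

With one process already on CPU 0, the next one goes to the untouched core rather than to CPU 0's cothread; only once both cores are active does a cothread get used.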

Some background reading on SMT and SMP and hyperthreading:

http://64.223.189.234/node/13
http://64.223.189.234/node/608
Tom O'Toole
Respected Contributor

Re: Itanium Dual core performance (-v- Alpha SMP)


When SMT is enabled, VMS shows the hardware as having 8 processors, right? So in monitor/system, the maximum will be 800%, and if we're seeing more than 400% utilization, we are definitely getting a throughput increase. Is it as simple as that?
Can you imagine if we used PCs to manage our enterprise systems? ... oops.
Hoff
Honored Contributor

Re: Itanium Dual core performance (-v- Alpha SMP)

On Itanium, the threading implementation is effectively a fast context switch.

The SMT or hyperthreading or multithreading construct is inherently implemented with less than the resources of a full core. It usually falls somewhere between a fast context switch and something less than the full resources and capabilities of a full core.

If the full resources of a core were available, then the construction and the implementation would be called a core. Not SMT.

SMT is a way to try to use idle units within a core.

OpenVMS I64 presents the Itanium threading model as a core, but the scheduler "knows" it isn't really a full core.

http://64.223.189.234/node/13
Hein van den Heuvel
Honored Contributor

Re: Itanium Dual core performance (-v- Alpha SMP)

>> and if we're seeing more than 400% utilization, we are definitely getting a throughput increase. Is it as simple as that?

Unfortunately NOT.

Multithreading exploits MEMORY STALL TIME for one thread as an opportunity to issue non-memory instructions for another thread, after a very lightweight CPU-internal context switch.

So if we take an oversimplified example of two jobs doing cache-only, CPU-bound work, then there will be no significant memory cycles and a simple fairness algorithm will keep both threads working. To the OS both will look 100% busy, but only one will really be working at a time. The elapsed time for the two jobs in this case will be just as long with hyperthreading on as off, and probably longer due to context switches. So the perception will be that it is the worst of all worlds: takes twice as long and uses twice the CPU. In reality the hyperthreading in the above probably had no significant effect.
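
A toy cycle-by-cycle model makes the example above concrete. This Python sketch is an assumption-laden illustration, not a model of any real Itanium: one core, at most one job issuing per cycle, a simple rotation among ready jobs as the fairness rule, fixed-length memory stalls, and no context-switch cost. All names and parameters are invented for the example.

```python
def finish_times(jobs, work, stall_every, stall_len):
    """Simulate one core; return the cycle at which each job finishes.

    jobs:        number of hardware threads sharing the core (1 or 2)
    work:        issue cycles each job needs (must be > 0)
    stall_every: issue cycles between memory stalls (0 = cache-only job)
    stall_len:   cycles a stalled job waits for memory
    """
    remaining = [work] * jobs
    since_stall = [0] * jobs
    stalled = [0] * jobs
    done_at = [0] * jobs
    cycle, last = 0, -1
    while any(remaining):
        cycle += 1
        ready = [i for i in range(jobs) if remaining[i] and not stalled[i]]
        for i in range(jobs):          # memory waits progress in parallel
            if stalled[i]:
                stalled[i] -= 1
        if not ready:
            continue                   # every job is waiting: core idles
        # Assumed fairness rule: rotate to a ready job other than the last.
        pick = next((i for i in ready if i != last), ready[0])
        last = pick
        remaining[pick] -= 1
        since_stall[pick] += 1
        if remaining[pick] == 0:
            done_at[pick] = cycle
        elif stall_every and since_stall[pick] == stall_every:
            since_stall[pick] = 0
            stalled[pick] = stall_len
    return done_at


def compare(work, stall_every, stall_len):
    """Return ((elapsed, charged) threading off, (elapsed, charged) on)."""
    one = finish_times(1, work, stall_every, stall_len)[0]
    off = (2 * one, 2 * one)           # two jobs run one after the other
    both = finish_times(2, work, stall_every, stall_len)
    on = (max(both), sum(both))        # each 'CPU' looks busy until done
    return off, on


print("cache-only:", compare(1000, 0, 0))
print("stalling:  ", compare(1000, 10, 10))
```

In the cache-only case the elapsed time is 2000 cycles either way, while the busy time charged across the two 'CPUs' nearly doubles with threading on. In the stalling case the elapsed time drops with threading on, because one job issues while the other waits on memory. In both cases the charged CPU time says little about throughput, which is the point being made above.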

The one, and only true, measure is end-user application-level performance, such as wall-clock time for a given business load/job.

fwiw,
Hein.
Tom O'Toole
Respected Contributor

Re: Itanium Dual core performance (-v- Alpha SMP)


OK, that's kind of weird. It updates both processes' accounting data structures, I guess because it doesn't distinguish which one is active (perhaps because it would take too long)? Thanks for the info, this is pretty cool.
Can you imagine if we used PCs to manage our enterprise systems? ... oops.