
SOLVED
Gustiuc
Occasional Contributor

simulation server

Hello, I am a PhD student and I run a CFD (computational fluid dynamics) code on an HP server with six processors (0-5).

The top command gives me:
Load averages: 1.68, 1.70, 1.75

CPU LOAD USER NICE SYS IDLE BLOCK SWAIT INTR SSYS
0 1.77 …
1 1.86
2 1.66
3 1.74
4 1.83
5 1.22
Avg 1.68 1.0% 98.1% 1% 0.0% 0.0% 0.0% 0.0% 0.0%

Is the server overloaded? I reduced the number of simulations that are running, and the time for one simulation to complete stays the same.

Thank you,
Mihai
Fabian Briseño
Esteemed Contributor

Re: simulation server

Hello Gustiuc.
What OS (and version) are you using?

How did you arrive at the conclusion that the server is overloaded?

Since when has it been running slowly?

Knowledge is power.
Dennis Handly
Acclaimed Contributor

Re: simulation server

>I reduced the number of simulations that are running, and the time for one simulation to complete stays the same.

This implies either you aren't overloaded, or you are very overloaded. What about the CPU states lines?
Ah, your average is 98.1% nice. You have the machine pegged.

What is your machine model?
Ninad_1
Honored Contributor

Re: simulation server

Hi,

Can you also post your OS version along with output of
sar -u 5 10
sar -q 5 10
vmstat 5 10

Regards,
Ninad
Gustiuc
Occasional Contributor

Re: simulation server

We are four (sometimes more) PhD students who use the same type of code, and we would like to know whether our use of the server is optimal.

The HP server runs (I hope I understood correctly) 64-bit HP-UX.

The outputs look like:

% sar -u 5 10

HP-UX n4k1 B.11.11 U 9000/800 02/17/07

13:00:02 %usr %sys %wio %idle
13:00:07 99 1 0 0
13:00:12 100 0 0 0
13:00:17 100 0 0 0
13:00:22 100 0 0 0
13:00:27 100 0 0 0
13:00:32 100 0 0 0
13:00:37 100 0 0 0
13:00:42 98 2 0 0
13:00:47 100 0 0 0
13:00:52 100 0 0 0

Average 100 0 0 0
% sar -q 5 10

HP-UX n4k1 B.11.11 U 9000/800 02/17/07

13:06:58 runq-sz %runocc swpq-sz %swpocc
13:07:03 1,2 67 0,0 0
13:07:08 1,2 67 0,0 0
13:07:13 1,2 67 0,0 0
13:07:18 1,1 73 0,0 0
13:07:23 1,0 83 0,0 0
13:07:28 1,2 70 0,0 0
13:07:33 1,3 67 0,0 0
13:07:38 1,2 67 0,0 0
13:07:43 1,2 67 0,0 0
13:07:48 1,2 67 0,0 0

Average 1,2 69 0,0 0

% vmstat 5 10
procs memory page faults cpu
r b w avm free re at pi po fr de sr in sy cs us sy id
11 0 0 1502813 51383 9 3 1 0 0 0 29 1331 845 334 86 1 13
11 0 0 1502813 51312 10 2 0 0 0 0 0 1169 517 260 100 0 0
11 0 0 1504760 51307 3 0 0 0 0 0 0 1168 372 244 100 0 0
11 0 0 1504760 50683 0 0 0 0 0 0 0 1208 457 382 98 2 0
11 0 0 1503562 50707 1 0 1 0 0 0 0 1201 2849 331 99 1 0
11 0 0 1503562 50707 0 0 0 0 0 0 0 1187 1077 258 100 0 0
11 0 0 1501629 50707 0 0 0 0 0 0 0 1172 550 237 100 0 0
11 0 0 1501629 50691 78 25 3 0 0 0 0 1176 1762 281 98 2 0
11 0 0 1503624 50691 25 7 1 0 0 0 0 1184 717 238 100 0 0
11 0 0 1503624 50691 8 1 0 0 0 0 0 1180 382 233 100 0 0

Bill Hassell
Honored Contributor

Re: simulation server

There is no way to overload the computer. Your processes are all running as fast as they can. Since these processes are compute-intensive, you can run as many at the same time as you have CPUs in the system. Each process will complete independently in virtually the same time as if just one process were running. Like all timesharing systems, if you run more processes than you have CPUs, then compute time will be distributed among the processes.

As with all compute-bound processes, the best way to increase performance is to profile your program code to see where most of the time is being spent. If you are solving differential equations, look very carefully at the loops that approximate the solution, and also at the convergence criterion: setting the error limit too low can massively increase run times. Look also at the computations in those loops -- avoid division at all costs by changing it to multiplication, and make sure cute tricks such as ERR=ERR*ERR are not being used in place of ERR=ABS(ERR). Remove any constant assignments from within loops. There are many other techniques to dramatically improve performance using numerical methods designed for computers rather than pure math; Donald Knuth has written books and papers on the subject.
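A minimal sketch of the two loop fixes Bill names -- hoisting a constant assignment out of the loop and replacing per-iteration division with a precomputed reciprocal multiply. The functions and data here are hypothetical illustrations, not the poster's CFD code:

```python
def relax_naive(u, dx):
    """Constant recomputed inside the loop, plus a division per iteration."""
    out = []
    for i in range(1, len(u) - 1):
        coeff = 0.5                                   # constant assignment inside the loop
        out.append(coeff * (u[i - 1] + u[i + 1]) / (dx * dx))  # division every pass
    return out

def relax_tuned(u, dx):
    """Same arithmetic: constant hoisted, one division done outside the loop."""
    coeff = 0.5
    inv_dx2 = 1.0 / (dx * dx)                         # single division, hoisted
    return [coeff * (u[i - 1] + u[i + 1]) * inv_dx2   # multiply is cheaper than divide
            for i in range(1, len(u) - 1)]

u = [0.0, 1.0, 4.0, 9.0, 16.0]
assert relax_naive(u, 0.5) == relax_tuned(u, 0.5)     # identical results, less work per pass
```

On 1980s-2000s hardware a floating-point divide cost many times a multiply, which is why this rewrite was a standard first step; a good optimizing compiler may do some of it for you.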


Bill Hassell, sysadmin
Hein van den Heuvel
Honored Contributor

Re: simulation server

Sure looks like your server is totally overloaded, or maybe 'optimally used' depending on your perspective.

But that line about reducing work and the time for a job staying the same worries me.
That could be explained by a poorly executed experiment, or by a 'busy wait' polling loop in the user code, something like:

loop:
  if (result ready)
  then proceed
  else goto loop
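To sketch Hein's point (a hypothetical illustration, not the poster's code): a busy-wait poll burns CPU cycles doing nothing useful, so the machine reports 100% user time even while a job is only "waiting" -- and reducing the number of jobs then changes nothing:

```python
def busy_wait_until(predicate):
    """Spin until predicate() is true; every spin is pure user-mode CPU burn."""
    spins = 0
    while not predicate():
        spins += 1        # this counts as 'work' in top's USER column
    return spins

# Simulate a condition that only becomes true after 100000 polls.
counter = {"n": 0}
def condition():
    counter["n"] += 1
    return counter["n"] > 100_000

spins = busy_wait_until(condition)
assert spins == 100_000   # 100000 wasted loop iterations before the event fired
```

A blocking wait (a semaphore, select(), or a condition variable) would let the kernel put the process to sleep instead, dropping its CPU use to zero while it waits.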

Given 99% user mode, there is really no system tuning you can do. Maybe a chatr on the executable to give it large pages for code and data?
Getting more CPUs would most likely allow you to get more work done, if that is a requirement.

fwiw,
Hein van den Heuvel
HvdH Performance Consulting
Dennis Handly
Acclaimed Contributor
Solution

Re: simulation server

>Bill: There is no way to overload the computer.

I would have to disagree with this. While it is true it won't go up in smoke, it may get slower than expected when the processes compete for resources and spend more and more time switching than doing useful work.

For a simple example, take N CPUs, M jobs each needing R units of CPU time, and all other resources infinite.

If you run 1 job at a time (without threading), the whole batch takes M * R. If you run N at a time (M == N), it takes M * R / N = R. With N + 1 jobs, it may take (N + 1) * R / N if the work is shared round-robin with no switching cost, or it may take N * R / N + R = 2 * R if the last job simply waits for a free CPU. With M == 2*N, the latter schedule gives 2 * N * R / N = 2 * R.

And if the switching cost gets high enough, the real time can be worse than either formula.
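Dennis's arithmetic can be put into a toy model under his own assumptions (N identical CPUs, M identical jobs of length R, zero switching cost; the function names are mine, for illustration):

```python
import math

def makespan_shared(n_cpus, n_jobs, r):
    """All jobs time-share the CPUs evenly (ideal round-robin, no switch cost)."""
    return n_jobs * r / n_cpus if n_jobs > n_cpus else r

def makespan_queued(n_cpus, n_jobs, r):
    """Jobs run to completion; extras wait in FIFO order for a free CPU."""
    return math.ceil(n_jobs / n_cpus) * r

N, R = 6, 10.0
assert makespan_shared(N, N, R) == R                     # M == N: no penalty
assert makespan_shared(N, N + 1, R) == (N + 1) * R / N   # Dennis's first formula
assert makespan_queued(N, N + 1, R) == 2 * R             # last job waits a full round
assert makespan_queued(N, 2 * N, R) == 2 * R             # M == 2*N case
```

Real machines sit between the two: the scheduler time-shares, but each extra runnable process also adds the switching overhead Dennis warns about, so measured times exceed both formulas.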
Bill Hassell
Honored Contributor

Re: simulation server

Dennis is correct...when a program does anything besides pure computing, overhead functions become part of the total runtime. This is why the system CPU percentage is useful -- high system overhead means less time for your processes. When you hit 100% on all CPUs, it is a combination of user and system time. For computationally intensive tasks (fluid dynamics), the process spends almost all of its time computing, so each additional compute-bound process needs an additional CPU to prevent timesharing delays.

You are probably most concerned about wall-clock time (time to complete) for each process, so there will be a point where all CPUs are busy and any additional processes will start slowing completion times. The system is running at full speed, but it can't devote all CPU time to every process, so timesharing overhead becomes noticeable.

As mentioned, the biggest reduction in completion time will come from profiling the code. Typical compute-bound programs spend about 80% of their time in just 20% of the code. Optimizing this code will greatly improve application performance.
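As a hypothetical illustration of that profiling step (Python's cProfile here stands in for whatever profiler the poster's HP-UX compiler toolchain provides; the hot/cold functions are invented):

```python
import cProfile
import io
import pstats

def hot_inner(n):
    """The '20% of the code' where the time actually goes."""
    s = 0.0
    for i in range(1, n):
        s += 1.0 / i
    return s

def cold_setup():
    """Cheap bookkeeping that looks important but costs nothing."""
    return list(range(10))

def simulation():
    cold_setup()
    return hot_inner(200_000)

profiler = cProfile.Profile()
result = profiler.runcall(simulation)          # run once under the profiler

buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()

assert "hot_inner" in report                   # the hotspot surfaces in the top entries
assert result > 0
```

The point is the workflow, not the tool: measure first, confirm where the 80% actually is, and only then apply the loop-level optimizations discussed above.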


Bill Hassell, sysadmin