Inconsistent Performance with DL785 using MPI

David B Hart
Occasional Visitor

We are using a DL785 G6 system with eight 6-core Opteron processors (48 cores) and 256 GB of total memory. We are running a computationally and memory-intensive model that uses MPI, and we are testing how well that model scales. The OS is Red Hat Enterprise Linux 5.5.

Running on, for example, 12 cores produces inconsistent performance: run times vary greatly depending on which processors get assigned to the tasks. Running on increasing numbers of cores yields diminishing returns.

As a single-threaded test, we ran 47 simultaneous instances of a program that calculates PI to 100,000,000 digits, and the results were strange as well. Each instance performed consistently (in seconds per digit calculated) while all 47 were running. However, 28 instances ran at speed X, while the remaining 19 ran at only 50% to 75% of the speed of the fastest ones. Memory availability was not an issue for this test (the program's memory footprint was small). After some instances terminated, the remaining ones visibly sped up; whether they actually migrated to other cores or simply ran faster where they were is unknown.
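
A per-core comparison like the one described above can be made repeatable by pinning each process to a fixed core, so that scheduler migration is ruled out. The following is only a minimal sketch, assuming Linux and Python 3. The loop is an arbitrary CPU-bound stand-in, not the PI program used in the test, and the names `worker` and `measure` are hypothetical.

```python
import multiprocessing as mp
import os
import time


def worker(cpu, queue, iterations):
    """Pin this process to one CPU, time a fixed workload, report the result."""
    os.sched_setaffinity(0, {cpu})  # Linux-only; pid 0 means the calling process
    start = time.perf_counter()
    total = 0
    for i in range(iterations):  # stand-in workload (assumption, not the PI code)
        total += i * i
    queue.put((cpu, time.perf_counter() - start))


def measure(cpus, iterations=2_000_000):
    """Run one pinned process per CPU concurrently; return {cpu: seconds}."""
    ctx = mp.get_context("fork")  # explicit start method; assumes a Unix-like OS
    queue = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(cpu, queue, iterations))
             for cpu in cpus]
    for p in procs:
        p.start()
    results = dict(queue.get() for _ in procs)  # one result per process
    for p in procs:
        p.join()
    return results


if __name__ == "__main__":
    cpus = sorted(os.sched_getaffinity(0))  # CPUs this process may run on
    results = measure(cpus)
    fastest = min(results.values())
    for cpu in cpus:
        secs = results[cpu]
        print(f"cpu {cpu:3d}: {secs:.3f} s ({fastest / secs:.0%} of fastest)")
```

Because every process stays on its assigned core, any change in speed after other processes exit cannot be due to migration.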

Does anyone know why the same program running on each core would perform significantly slower on some cores when all processors are identical?
Jan Soska
Honored Contributor

Re: Inconsistent Performance with DL785 using MPI

Hello David,
We use a DL785 G5 (32 cores, 128 GB RAM) as an MS SQL data warehouse.
Server performance is great, but our workload is a different kind of task from yours.