Advantage EX

Expanding the envelope for LS-DYNA blade off simulations

When a fan or compressor blade fails in an airplane jet engine, it’s a potentially deadly event. Failed blades release high-energy fragments that can perforate the engine case, damage fuel tanks, and cause catastrophic failures. Because of this extreme danger, the Federal Aviation Administration requires that engine cases be capable of containing blade fragments. That makes “fan blade off containment” a critical design requirement for the aerospace industry.

Improving fan blade off containment simulation is also a compute challenge for users of the Cray XC supercomputer and the finite element application LS-DYNA®. So a team from Cray and Livermore Software Technology Corporation (LSTC) recently got together to study how to achieve these improvements.

Fan blade off containment simulation is technically challenging and computationally intensive. For example, a large 80-million-element simulation using LS-DYNA version R7.1.2 takes more than a month to complete. But ideally, time-to-solution on this type of simulation should be less than a day.

First, the team needed to improve the most computationally expensive part of the simulation: surface-to-surface erosion contact. To optimize it, they used the CrayPAT analysis tool to identify the most time-consuming subroutines. The resulting improvements included a 30% reduction in the memory required for storing erosion contact surfaces, removal of redundant erosion calculations, and faster exterior surface calculation.

Next, they needed to test the improvements. To do this, they first compared the performance of LS-DYNA R8.0.0 with that of the earlier LS-DYNA R7.1.2 on a medium-size model of 26.5 million nodal points and 24 million solid elements. Then they analyzed the MPI communication patterns and the load balance among the MPI processes. Based on these results, they identified the compute bottlenecks and made code changes to improve performance. Finally, they carried out the 80-million-element simulation using the enhanced LS-DYNA version R8.0.0 on the XC system.

So what kind of results did they see?

For the comparison test, the team focused on simulation performance during the collision phase (where the released blade collides with the engine case and the other two rotating blades), modeling these processes using surface-to-surface erosion contact. The team found that at 256 cores, LS-DYNA R8.0.0 was only about 1.9 times faster than LS-DYNA R7.1.2 in total elapsed time. But as the core count increased, so did the speedup. At 2,048 cores, version R8.0.0 was about 2.7 times faster.
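One way to read these two figures: if r(p) is the ratio of R7.1.2 elapsed time to R8.0.0 elapsed time at p cores, then the extra scaling R8.0.0 achieves over R7.1.2 between two core counts is r(high) / r(low). The minimal sketch below works that out from the rounded numbers quoted above. The function name is a hypothetical helper for illustration, not part of LS-DYNA or any Cray tool.

```python
def relative_scaling_gain(speedup_low: float, speedup_high: float) -> float:
    """Ratio of the new-vs-old speedup at the higher core count to the
    new-vs-old speedup at the lower core count.

    With T_new(p) = T_old(p) / r(p), it follows that
    T_new(low) / T_new(high) = [T_old(low) / T_old(high)] * r(high) / r(low),
    so this ratio is how much better the new version scales than the old one.
    """
    return speedup_high / speedup_low


# Rounded figures quoted in the text: ~1.9x at 256 cores, ~2.7x at 2,048 cores.
gain = relative_scaling_gain(1.9, 2.7)
print(f"R8.0.0 scaled about {gain:.2f}x better than R7.1.2 from 256 to 2,048 cores")
```

Because the inputs are rounded, the result (roughly 1.4) is only an approximate indication that R8.0.0 benefited more from the added cores than R7.1.2 did.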

The team then broke down the LS-DYNA wall time by function to determine where the speedup came from. Their analysis showed that the speedup came from the “contact” and “miscellaneous” functions, a direct outcome of the code optimizations.

Next, the team used Profiler, the Cray MPI profiling tool, to determine how much time was spent on MPI communication and which MPI calls dominated MPI time. Profiler revealed that total MPI time decreased as the core count increased from 256 to 1,024, but the trend reversed between 1,024 and 2,048 cores. This result showed that MPI time in the simulation is dominated by MPI synchronization time, which indicates load imbalance. Because of this load imbalance, the parallel scaling of the fan blade off simulation for the medium-size model is limited to 1,024 cores. However, for larger models, which are desired for future simulations, fan blade off simulations can scale to over 16,000 cores.
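To illustrate why synchronization time points to load imbalance, here is a minimal sketch. It assumes a simplified model in which every MPI process waits at a synchronization point until the slowest process arrives. The per-rank times are hypothetical values chosen for illustration, not measurements from the study, and this is not how Profiler itself computes its report.

```python
from statistics import mean


def imbalance_stats(compute_times):
    """Summarize load imbalance for one synchronization step.

    compute_times: per-rank compute time in seconds (hypothetical input).
    Each rank is assumed to idle until the slowest rank finishes.
    """
    slowest = max(compute_times)
    average = mean(compute_times)
    waits = [slowest - t for t in compute_times]  # idle time per rank
    return {
        "imbalance_factor": slowest / average,   # 1.0 means perfectly balanced
        "mean_wait": mean(waits),                # average idle seconds per rank
        "wait_fraction": mean(waits) / slowest,  # share of the step spent idle
    }


# Hypothetical examples: four evenly loaded ranks vs. one overloaded rank.
balanced = imbalance_stats([10.0, 10.0, 10.0, 10.0])
skewed = imbalance_stats([4.0, 4.0, 4.0, 12.0])
print(balanced)
print(skewed)
```

In the skewed case, the total work is the same as in the balanced case, yet the step takes longer and half of the average rank's time is spent waiting. Under this simplified model, adding cores without fixing the imbalance increases the waiting share rather than reducing elapsed time.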

For their final test, the team ran a large model of the fan blade off simulation using 82 million nodal points and 80.6 million solid elements. With the earlier LS-DYNA version, the large model took more than a month using 16,384 cores on the Cray XC system. Using the optimized LS-DYNA R8.0.0, the simulation took only 21 hours, about 34 times faster than R7.1.2.
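As a quick sanity check on that factor, the sketch below assumes that “more than a month” means roughly 30 days. That 30-day figure is an assumption made here for the arithmetic; the source does not give the exact R7.1.2 run time.

```python
HOURS_PER_DAY = 24


def speedup(old_hours: float, new_hours: float) -> float:
    """Return how many times faster the new run is than the old run."""
    return old_hours / new_hours


assumed_old_days = 30  # assumption: "more than a month" taken as about 30 days
old_hours = assumed_old_days * HOURS_PER_DAY
new_hours = 21         # R8.0.0 run time quoted in the text

factor = speedup(old_hours, new_hours)
print(f"{old_hours} h / {new_hours} h = {factor:.1f}x")
```

Under that assumption, the ratio comes out close to the quoted 34 times. A longer R7.1.2 run would imply a larger factor.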

This blog originally published on and has been updated and published here on HPE’s Advantage EX blog.

Advantage EX Experts
Hewlett Packard Enterprise

About the Author


Our team of Hewlett Packard Enterprise Advantage EX experts helps you dive deep into high performance computing and supercomputing topics.
