Boosting HPC workload performance: Containerizing HPC applications with MPI libraries
Containerizing HPC applications has become a standard part of modern HPC workload management, especially for MPI-based applications. This blog post explores the idea of HPC containerization in more detail.
By A. Yuvaraja, Apps Modernization Delivery Head, RPS, HPE Services; and Priyank Rupareliya, Technical Consultant, RPS, HPE Services
Parallel computing has become increasingly essential in the world of scientific research and high-performance computing. The Message Passing Interface (MPI) is a widely used standard for parallel programming, allowing researchers and developers to harness the power of distributed computing.
In this blog post, we'll provide a perspective on containerizing MPI-based applications and deploying a cluster with Docker Swarm to run complex, multi-node workloads. We'll also discuss the advantages of containerization and how to run a performance benchmark comparison between containerized and non-containerized MPI applications.
Containerizing HPC applications with suitable MPI libraries
What is MPI and what are some of its alternatives?
Message Passing Interface is a standardized and portable message-passing system designed to facilitate communication and coordination among processes in parallel and distributed computing environments. It is commonly used in high-performance computing (HPC) and supercomputing systems to enable efficient communication between nodes in a cluster or across a network. MPI allows processes to exchange data and synchronize their execution in a parallel application.
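To make the model concrete, here is a minimal sketch of an MPI program in C (the file name hello_mpi.c is just an illustration). Each process learns its rank and the total number of processes, then prints a message:

```c
/* hello_mpi.c - minimal MPI sketch (assumes an MPI implementation
 * such as MPICH or Open MPI is installed). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start the MPI runtime         */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id (rank)      */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes     */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();                         /* shut the MPI runtime down     */
    return 0;
}
```

With MPICH or Open MPI installed, this would typically be compiled with `mpicc hello_mpi.c -o hello_mpi` and launched with `mpirun -np 4 ./hello_mpi`, which starts four cooperating processes that may sit on one node or be spread across a cluster.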
How MPI libraries communicate with different OS components and hardware
While MPI is a widely used and well-established standard for parallel and distributed computing, there are some alternative programming models and libraries, such as the following:
- OpenMP: Open Multi-Processing is a shared-memory parallel programming model. It simplifies parallelization by adding directives to existing code to specify parallel regions. OpenMP is suitable for multi-core processors and is often used in tasks that involve parallelism within a single node.
- CUDA: This is a parallel computing platform and application programming interface (API) developed by NVIDIA. It is used for GPU programming and is particularly well-suited for tasks that can be highly parallelized, such as graphics processing, scientific simulations, and deep learning.
- OpenCL: This is an open standard for parallel programming of heterogeneous systems, which can include CPUs, GPUs, and other accelerators. It offers a way to write programs that can take advantage of a variety of computing devices.
- Pthreads: POSIX Threads (Pthreads) is a standard for thread programming in Unix-like operating systems. It allows programmers to create and manage multiple threads within a single process, enabling shared-memory parallelism.
- Charm++: This is a parallel programming model and runtime system that focuses on adaptive and migratable parallelism. It is designed for irregular and dynamic applications and can scale efficiently across distributed systems.
- UPC and UPC++: Unified Parallel C (UPC) and UPC++ are extensions of the C and C++ programming languages, respectively, designed for high-performance parallel computing. They provide a shared-memory-like programming model for distributed memory systems.
However, it is important to note that most of these alternatives focus on parallelism within a single host rather than communication spanning different hosts, which makes MPI the most suitable choice for HPC clusters. Many HPC applications use a combination of these models and libraries to achieve the desired level of parallelism and performance. In this blog, we will focus purely on containerizing HPC applications that run on an MPI implementation. As a quick contrast with MPI's distributed model, the short OpenMP sketch below shows shared-memory parallelism confined to a single node.
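The following snippet is a minimal illustration of that single-node, shared-memory model (the file name omp_sum.c is hypothetical). All threads share one address space, which is exactly what keeps OpenMP within a single host:

```c
/* omp_sum.c - shared-memory parallelism with OpenMP (single node only).
 * Compile with: gcc -fopenmp omp_sum.c -o omp_sum */
#include <omp.h>
#include <stdio.h>

int main(void)
{
    const int N = 1000000;
    double sum = 0.0;

    /* Threads share the same memory; the reduction clause combines
     * each thread's partial sum safely. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        sum += (double)i;
    }

    printf("sum = %.0f (up to %d threads)\n", sum, omp_get_max_threads());
    return 0;
}
```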
The need for containerizing MPI-based HPC applications
MPI applications are generally used in the context of HPC because they are compute intensive. Training AI models, running chemical simulations, forecasting weather, and processing large amounts of data are just a few examples.
Containerization is the first step toward running such massive workloads on any platform of choice. It simplifies deploying the applications on platforms such as Kubernetes, Docker Swarm, EKS, or Rancher, which can scale up and down depending on the workload. This can reduce costs, run across different host operating systems, and allow flexibility in the choice of hardware.
Steps to prepare a lean base image with the correct MPI implementation, which could be used to containerize one or more applications
Different MPI Implementations and our choice
There are several different implementations of MPI available, each with its own features, optimizations, and compatibility with different hardware and software environments. Some of the most well-known MPI implementations include:
- Open MPI: An open-source implementation that supports various network interconnects and is widely used in both research and industry.
- MPICH: Another open-source implementation known for its high performance and portability across different platforms.
- Intel MPI: A commercial implementation optimized for Intel processors, offering excellent performance on Intel-based clusters.
| Aspect | Open MPI | MPICH | Intel MPI | MVAPICH2 |
|---|---|---|---|---|
| Open source | Yes | Yes | No | Yes |
| Community support | Strong | Strong | Limited | Strong |
| Portability | High | High | High | High |
| Vendor-optimized | No | No | Yes | No |
| GPU support | Yes | Yes | Yes | Yes |
| CPU support (AMD) | Yes | Yes | Yes | Yes |
| CPU support (Intel) | Yes | Yes | Yes | Yes |
| Scalability | Good | Good | Excellent | Excellent |
For our purpose, we will be using MPICH since it is well maintained and up to date with the latest MPI standards.
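When building the base image, it can be useful to confirm which MPI implementation and standard level actually ended up inside the container. The small check below is a sketch (the file name mpi_version.c is illustrative); MPI_Get_library_version and MPI_Get_version are part of the MPI standard, so the same code works under MPICH, Open MPI, or Intel MPI:

```c
/* mpi_version.c - report which MPI implementation and standard level
 * the (container) environment provides. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    char version[MPI_MAX_LIBRARY_VERSION_STRING];
    int len, std_major, std_minor;

    MPI_Init(&argc, &argv);
    MPI_Get_library_version(version, &len);     /* implementation string   */
    MPI_Get_version(&std_major, &std_minor);    /* supported MPI standard  */

    printf("Library : %s\n", version);
    printf("Standard: MPI %d.%d\n", std_major, std_minor);

    MPI_Finalize();
    return 0;
}
```

Running this once inside the image is a quick sanity check that the intended MPICH build (rather than some other system-provided MPI) is the one your application will link against.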
Advantages of containerizing MPI applications
Containerizing MPI-based applications offers several advantages compared to not containerizing them. These advantages stem from the isolation and portability that containers provide. Here are some of the key benefits:
- Portability: Containers encapsulate the application and all its dependencies, ensuring that it runs consistently across different environments. This portability is especially valuable in heterogeneous HPC clusters and cloud computing, where the underlying infrastructure may vary. Without containers, adapting MPI applications to different environments can be complex and time-consuming.
- Dependency management: Containerization allows you to package all required libraries, dependencies, and runtime environments alongside the MPI application. This eliminates conflicts between different versions of libraries and makes it easier to manage dependencies. Users don't need to worry about installing specific libraries on the host system.
- Ease of deployment: Containerized MPI applications can be deployed quickly and easily. You can create a container image on a development machine and then deploy it on various computing clusters or cloud instances without worrying about configuring the environment each time. This reduces deployment errors and accelerates the development-to-production cycle.
- Resource management: Containers can help manage resource allocation more effectively. You can specify resource limits and constraints within the container, allowing for better control over CPU, memory, and network usage. This is important in multi-tenant or shared computing environments.
- Scaling and load balancing: Container orchestration platforms like Kubernetes and Docker Swarm make it easier to scale and load balance MPI applications. You can dynamically adjust the number of containers to match the workload, ensuring efficient resource utilization.
While containerizing MPI-based applications offers numerous advantages, it's essential to consider factors such as container orchestration, networking, and storage configurations to optimize performance and resource utilization in HPC environments. Properly designed and managed containers can simplify the deployment and scaling of MPI applications while maintaining compatibility and consistency across different systems.
Best practices for containerizing MPI applications
Containerizing an MPI-based application with Docker involves some specific considerations to ensure that the application can run efficiently and scale properly within the containerized environment. Here are some general steps and best practices to follow:
- Link your MPI libraries dynamically: Certain HPC systems provide their own MPI implementations tuned for the provisioned hardware. Compare the performance of your chosen MPI implementation with the natively available MPI library, and dynamically link the native library to your application if required.
- Select a suitable base image: Choose a base image that matches the operating system and environment required for your MPI application. Common choices include CentOS, Ubuntu, or specialized HPC base images.
- Install an MPI implementation: Install the MPI library (e.g., MPICH, Open MPI, Intel MPI) inside the container. Ensure that the MPI version and configuration match your application's requirements.
- Set up SSH or alternative communication: MPI applications often rely on SSH for communication between nodes. You may need to configure SSH inside the container or explore alternative communication methods, such as using MPI-specific communication libraries for container environments.
- Configure networking: Configure network settings to enable communication between containers on different hosts if your MPI application spans multiple containers or nodes. Consider using container orchestration tools like Kubernetes for MPI applications that need dynamic scaling.
- Test locally: Test the MPI application within a single container on a local system to ensure that it runs correctly and efficiently. Use a small number of processes initially for testing.
- Scaling and load balancing: If your application involves scaling to multiple containers or nodes, implement appropriate scaling and load balancing mechanisms. Concepts around container orchestration for HPC are still being researched, but if you're purely experimenting without performance considerations, a master-worker style setup on top of Kubernetes or Docker Swarm can prove effective.
Master-worker setup in a Kubernetes environment for running a workload that uses MPI under the hood.
- Benchmarking: Multiple tools are available for automatically measuring the performance of the MPI library while it's being used; Marmot is one such tool. For manual benchmarking, the preferred tool in the community is the OSU micro-benchmark suite (osu-benchmarks). A minimal ping-pong latency sketch in the same spirit follows after this list.
- Cleanup and optimization: Remove unnecessary files and dependencies to optimize the size of the container image. Smaller images are faster to deploy and require less storage space.
- Security considerations: Ensure that security best practices are followed, including restricting container privileges, using non-root users, and applying security patches regularly.
- Testing across environments: Test the containerized MPI application on different hardware and environments to ensure portability and compatibility.
- Deployment automation: Implement automation scripts or tools to facilitate the deployment and management of containerized MPI applications, especially in production environments.
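To make the benchmarking step more concrete, here is a minimal sketch of the kind of ping-pong latency test that tools such as osu_latency perform, written directly against the MPI API (file and binary names are illustrative). Running it inside and outside the container on the same two nodes gives a rough containerized-versus-bare-metal comparison:

```c
/* pingpong.c - minimal round-trip latency measurement between two ranks.
 * Run with exactly two processes, e.g. mpirun -np 2 ./pingpong */
#include <mpi.h>
#include <stdio.h>

#define REPS      1000
#define MSG_BYTES 8

int main(int argc, char **argv)
{
    int rank, size;
    char buf[MSG_BYTES] = {0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size != 2) {                        /* this sketch expects 2 ranks */
        if (rank == 0) fprintf(stderr, "run with -np 2\n");
        MPI_Finalize();
        return 1;
    }

    MPI_Barrier(MPI_COMM_WORLD);            /* start both ranks together   */
    double t0 = MPI_Wtime();

    for (int i = 0; i < REPS; i++) {
        if (rank == 0) {
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }

    double t1 = MPI_Wtime();
    if (rank == 0)
        printf("avg round-trip latency: %.2f us\n",
               (t1 - t0) / REPS * 1e6);

    MPI_Finalize();
    return 0;
}
```

Rank 0 reports the average round-trip time for small messages; real benchmark suites sweep message sizes, warm up the network, and repeat runs far more carefully, but the principle of the measurement is the same.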
By following these steps and best practices, you can effectively containerize your MPI-based applications with Docker or any other containerization platform of choice, making them more portable, scalable, and manageable in various computing environments.
Hewlett Packard Enterprise empowers customers with high-performance computing requirements through a comprehensive suite of services encompassing migration, modernization, profiling, analysis, and optimization of HPC applications. Our offerings include services like configuring the right server environment to facilitate low-latency MPI communication, identifying and mitigating performance bottlenecks within MPI programs, containerization of legacy MPI applications, and a detailed profiling of customers' existing performance. These capabilities represent just a subset of the extensive array of services we provide to meet the diverse needs of enterprises.
Learn more about IT consulting services from HPE and how we help you accelerate edge-to-cloud transformation and optimize your IT operations.
Read about HPE Advisory and Professional Services: expert advice and implementation to take your digital transformation to the next level.
A. Yuvaraja is Apps Modernization Delivery Head, Global Professional Services, HPE Services. Yuvaraja has spent years immersing himself in digital transformation initiatives for businesses of all sizes. He is passionate about helping organizations optimize their technology infrastructure, streamline processes, and achieve goals through seamless application transformation.
Priyank Rupareliya is a technical consultant at HPE. He is passionate about programming and architecting scalable applications. He is an avid open-source contributor and has worked on projects involving configuration management, full-stack web apps, the AWS cloud, and DevOps.
Services Experts
Hewlett Packard Enterprise
twitter.com/HPE_Services
linkedin.com/showcase/hpe-services/
hpe.com/services