Servers & Systems: The Right Compute
UliPlechschmidt

Meet the vibrant community driving the Lustre file system

It doesn't matter how big your organization is or what mission or business objective you pursue. If you're using modeling and simulation, artificial intelligence, or high performance data analytics, HPE has the best parallel storage for you. You can start wherever you want, then go to wherever you need – without limits.

As a recent study from Hyperion Research found, Lustre® is the most widely deployed parallel file system in high performance computing (HPC). But unlike most other file systems, Lustre is not owned by a single company but by a vibrant open source community.

Open Scalable File Systems, Inc. (OpenSFS) is the nonprofit organization dedicated to the success of the Lustre file system. OpenSFS was founded in 2010 to advance Lustre development, ensuring it remains vendor-neutral, open, and free. Cray – now part of the Hewlett Packard Enterprise family – was a founding member.

We are honored and proud to contribute to the Lustre community on multiple levels.

First, we contribute code to the community. In fact, as shared at the recent Lustre Administrators and Developers Workshop (LAD'21), HPE contributed more code commits to the current release of Lustre (2.14) than all other contributing organizations combined, with the exception of Whamcloud (now part of DDN), which traditionally contributes most of the code. In alphabetical order, the contributing organizations to the 2.14 release were Aeon Computing, Amazon, CEA, Fujitsu, Google, HPE, Intel, Lawrence Livermore National Laboratory, Mellanox, Oracle, Oak Ridge National Laboratory, Seagate, SUSE, Uber, and Whamcloud.

The graphic below shows the history of the two Lustre developer teams that, alongside contributing user organizations, have delivered most of the development and testing since Lustre was launched in 2003. Over time, the logos on the business cards have changed. The mission has not.

HPE-HPC-Lustre1.png

Secondly, HPE invests massively in the testing of Lustre community releases in order to make Lustre reliable "at scale." Our Lustre research and development (R&D) team does this testing in our own internal labs, and also together with leadership sites, implementing and debugging large-scale file systems during the customer acceptance and early production phases. It is important to note that we push the resulting patches upstream into the "master" code branch of Lustre so that the whole community can benefit.

Here are two specific examples of what we mean by "Lustre at scale":

The real heroes in this groundbreaking work are our customers who work hand in hand with HPE's Lustre R&D team during the acceptance and early production phases in order to push the capabilities of HPC storage in general – and Lustre in particular – beyond what was thought possible before.

Thirdly, HPE sponsors the annual events of the Lustre community and shares best practices with the community during those events.

There are two key community events each year:

  • The Lustre User Group (LUG) in spring in the USA
  • The Lustre Administrators and Developers Workshop (LAD) in fall in France

At the Lustre User Group 2021 (LUG 2021), which was sponsored by both HPE and DDN, the Argonne Leadership Computing Facility shared, in a presentation titled "Lustre at scale," lessons learned from the deployment and acceptance of its two identical Lustre file systems, "Grand" and "Eagle." Both are built on Cray ClusterStor E1000 Storage Systems with 100 petabytes of usable storage capacity in ten racks each. If you want to learn more about Lustre at scale, you can view the recording of the presentation here.

At LAD 2021, HPE contributed the following two best-practice talks:

  • Troubleshooting LNet Multi-Rail Networks – link to presentation recording here
  • An Aged Cluster File System: Problems and Solutions – link to presentation recording here

Both annual community events are a great opportunity for worldwide Lustre users, administrators, and developers to gather and exchange their experiences, developments, tools, best practices and more.

Lustre leadership and positioning demonstrated

When it comes to designing, deploying, and supporting Lustre-based storage systems at scale, nobody has more experience and expertise than HPE. The example of the 1 terabyte per second Cray ClusterStor system of the Blue Waters supercomputer at the National Center for Supercomputing Applications confirms this. In production now for over eight years, it has served high-speed data for over 40 billion core hours of jobs submitted by thousands of researchers and scientists across the United States of America. And still counting!

We sometimes get the question: Do you have a simple at-a-glance view of the positioning of your Lustre-based storage systems vs. your IBM Spectrum Scale-based storage systems? We're using this graphic to answer that question.

HPE-HPC-Lustre2.png

Do any of these HPC and AI storage challenges sound familiar? Take action sooner rather than later

We can provide the right HPC/AI storage solution for organizations of all sizes, across all business and mission areas. If you are currently facing one or more of the HPC/AI storage challenges below, do not wait any longer.

  • Job pipeline congestion due to input/output (I/O) bottlenecks, leading to missed deadlines and top-talent attrition
  • High operational cost of multiple "storage islands" due to the scalability limitations of current NFS-based file storage
  • Exploding costs for fast file storage at the expense of GPU and/or CPU compute nodes or other critical business initiatives

Contact your HPE representative today for more information.

Learn more now

Business paper: Spend less on HPC/AI storage and more on CPU/GPU compute

Web: New HPC storage for a new HPC era


Uli Plechschmidt
Hewlett Packard Enterprise

twitter.com/hpe_hpc
linkedin.com/showcase/hpe-ai/
hpe.com/us/en/solutions/hpc

About the Author

UliPlechschmidt

Uli leads the product marketing function for high performance computing (HPC) storage. He joined HPE in January 2020 as part of the Cray acquisition. Prior to Cray, Uli held leadership roles in marketing, sales enablement, and sales at Seagate, Brocade Communications, and IBM.