
Driving the AI/ML industry forward through open source, open stacks, and open communities

Evan Sparks, Chief Product Officer for Artificial Intelligence, HPE
Isha Ghodgaonkar, Machine Learning Developer Advocate for Determined AI, HPE
Hayden Barnes, Senior Open Source Community Manager, HPE

Machine learning space speeds up, opens up

Artificial intelligence (AI) and machine learning (ML) have been around in some form for decades, with early beginnings traceable to developments in the 1940s. As modern ML evolved, many development stacks were closed and proprietary, but an open source ML revolution was coming.

As the need for AI-powered solutions escalated dramatically in the 2000s, the ML space exploded, with new AI/ML software and services developed by startups, legacy tech companies, open source projects, and chip and cloud vendors.

Today there is a wide range of AI/ML tools to choose from, which on the surface appears to be a positive. In practice, this proliferation of options, services, and platforms has left organizations wondering which tools to use.

The first step in any ML journey is choosing a development stack, and a key decision is whether to go closed or open. Choosing a closed AI/ML stack is risky because any proprietary solution carries an inherent gamble on its long-term sustainability. In addition, proprietary solutions:

  • Raise questions about compatibility with other AI/ML tools
  • Can create interoperability issues with other components of the customer's AI/ML stack
  • Typically cost more than open solutions
  • Lock customers in to a single vendor
  • Reduce the customer's control over their own stack

At Hewlett Packard Enterprise, we believe a better, more flexible, adaptable, and cost-efficient way to pursue AI/ML development is to use open source software (OSS).

Considerations to ponder

Early website developers turned to the LAMP stack to build Web 1.0. As the canonical open source LAMP stack emerged, it dramatically increased productivity and the speed of website development, and the web exploded. The same principle applies to AI/ML today: OSS not only speeds AI/ML design and development, it also makes AI/ML accessible to everyone.

Let’s also consider that most organizations today are becoming increasingly data-driven, generating massive amounts of data and using it to make more and better business decisions. Every data-driven workload, from data science pipelines to deep learning models and large language models, is extremely compute-intensive, and training ML models requires a great deal of time and resources.

As the considerations and choices for successful ML mount, all indicators point toward full-stack thinking, from infrastructure to software to models. Success in ML requires openness, with natural stacks emerging from an ecosystem of collaborative partners and the open source AI/ML community. This community includes more than 40,000 highly engaged data scientists, data engineers, and C-suite executives focused on AI/ML at cutting-edge companies around the world:

  • Top Fortune 500 aerospace/defense companies
  • Top consulting firms
  • Major entertainment companies
  • Major automobile companies
  • 35+ venture capital firms
  • 25 financial services organizations
  • 28 biotech/pharmaceutical companies
  • 14 retail organizations
  • 100+ internet companies
  • 300+ technology companies (software, hardware, and services)
  • 5 international airlines

Delivering what ML customers need

This full-stack thinking led HPE to develop the HPE Machine Learning Development Environment (MLDE), built on the open source Determined AI platform. The goal of HPE MLDE is to deliver a cohesive experience to customers, with all the pieces of the ML workflow logically grouped together and running on high-performance NVIDIA GPU-powered infrastructure. For example, Transformer Engine is an open source library for using FP8 precision on NVIDIA Hopper architecture GPUs; transformer models enable generative AI, and full-stack support brings performance and energy-efficiency benefits to the large language model frameworks that incorporate these simple APIs. The entire ML environment can be hosted on your on-premises hardware, on dedicated HPE hardware, or delivered via the HPE GreenLake edge-to-cloud platform. HPE MLDE is optimized to reduce the time and resources needed for AI/ML development.

The importance of an ecosystem…

At HPE, our business strategy is to support our customers’ open AI/ML development stacks. One way we offer support is by producing and maintaining Determined AI upstream of the HPE MLDE. This way, our customers can access and leverage best-in-class ML tooling for distributed training, rather than building their own tools from scratch—saving significant time and resources.
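As an illustrative sketch of what that tooling looks like in practice, Determined expresses distributed training declaratively in an experiment configuration file. The experiment name, entrypoint, and hyperparameter values below are hypothetical placeholders, but the field structure follows Determined's experiment config format:

```yaml
# Hypothetical Determined AI experiment config (names and values are
# placeholders; field structure follows Determined's config schema).
name: image-classifier-distributed
entrypoint: model_def:ImageClassifierTrial  # assumed user-defined trial class
resources:
  slots_per_trial: 8        # train each trial across 8 GPUs; Determined
                            # coordinates the distributed backend
hyperparameters:
  global_batch_size: 512
  learning_rate: 0.001
searcher:
  name: single              # one trial with fixed hyperparameters
  metric: validation_loss
  smaller_is_better: true
```

Submitting a config like this through Determined's CLI (`det experiment create`) schedules the trial on the cluster; scaling training out becomes a one-line change to `slots_per_trial` rather than custom distributed-training code.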

Another critical component of our strategy is delivering a broad, deep ecosystem of tools for AI and analytics via HPE Ezmeral Unified Analytics. This managed, open source data analytics solution provides curated, best-of-breed analytic software tools such as Apache Spark and Airflow to develop and deploy applications across hybrid environments.

In addition, HPE offers tools like Pachyderm (now part of HPE) to provide a fundamental layer for AI/ML data versioning and lineage. Working in partnership with open source tools companies, HPE is also developing solutions for inference and model deployment, optimization, and monitoring.

Leveraging years of experience in the AI and ML space, HPE and NVIDIA are helping customers understand this complex landscape and identify which tools can solve their data science problems. And by working with forward-thinking companies delivering AI at scale, we can offer customers a roadmap for solving ML problems with the right technology and tools.

…and standards

As we continue to focus on open source full-stack solutions, we see an increasing need for an industry-wide push toward a governance model that encourages open standards and interoperability. By adopting open standards, supported by a new industry governance model, we guard against a repeat of the proprietary conflicts of the 1990s.

In an effort to rally the OSS community to collaborate and drive open software tools for ML, the AI Infrastructure Alliance (AIIA) was established. The goal of AIIA is to create a robust collaboration environment for companies and communities in the AI/ML space. Knowing that one ML stack will not meet every organization’s needs, the AIIA encourages collaboration between projects and companies to develop best practices for AI/ML, foster openness, and work toward universal standards to share data between AI/ML applications.

To help customers understand how things fit together, AIIA has published a map of the AI/ML infrastructure ecosystem. This blueprint is used by many companies to create their own stacks, custom-tailored with the tools that meet their unique needs.

Governance and standards are still being developed by the AIIA's members. Once the right governance model is in place, standards can be set across the industry, best-of-breed solutions will be interoperable across stacks, customers will enjoy greater freedom of choice, and competition will proceed on merit.

A glimpse into the future of ML problem-solving

HPE, in collaboration with NVIDIA, has developed a new 20-node supercomputer called Champollion, leveraging best-of-breed technology from both companies, including HPE Cray systems. With roughly 200 NVIDIA A100 Tensor Core GPUs and NVIDIA Quantum InfiniBand networking, Champollion delivers scalability beyond what most customers can achieve with on-site systems or in the cloud, enabling them to solve their biggest AI/ML problems.

Learn more at NVIDIA GTC

Join this simulive session on March 22 at 4:00 p.m. ET for an in-depth discussion on the HPE MLDE, hosted by Evan Sparks, Chief Product Officer for Artificial Intelligence at HPE, and Isha Ghodgaonkar, Machine Learning Developer Advocate for Determined AI at HPE. Take a deep dive into the topics we touched on here and discover how your organization can use the HPE MLDE to drive your ML projects forward, faster.

Join HPE, along with other AI developers and innovators, at NVIDIA GTC, March 20–23, 2023. Register for the conference today.

Check out all the HPE sessions at GTC.

For more information about HPE and NVIDIA solutions for AI, please visit our partnership page.

About the Author

EIC_Alliances

HPE Alliance Partners