HPE Blog, UK, Ireland, Middle East & Africa

HPE Data Science Community Sessions ~ Volume 1

HPE Data Science Community

 

Our new HPE Data Science Community is already flourishing. So far, we have held two in-person meetups in the UK and a further two virtual sessions for our global audience. Most recently, I hosted Master Technologist and Machine Learning Engineer Jordan Nanos, as well as Dr Jamie Dborin, co-founder of Machine Learning company TitanML. Here’s what they covered:

Building a machine learning platform

 

Jordan Nanos looked at building a high-performance Machine Learning (ML) platform and how to combine the correct elements to ensure success. He first explained HPE’s ML history, before answering two crucial questions:

  • What is an ML Platform?
  • How can you go about building one today?

It’s been over a year since Generative AI captured the world’s imagination with the launch of ChatGPT. Jordan explained how HPE helps organisations understand what this means, and how it is driving the need for organisations to develop effective ML platforms.

He then went on to outline the three main components of an effective ML platform, looking at:

  • Infrastructure
  • Platform Software
  • Models

He explained what it takes to deliver an ML platform, exploring the concept as a layered cake, with infrastructure services, platform services and model services each making up a layer. Jordan explained how important each of these layers is, and the pitfalls of not using a similar structure.
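To make the cake analogy a little more concrete, here is a minimal sketch of the three layers in Python. The layer names follow the talk; the components listed under each are hypothetical examples, not a prescribed HPE stack.

```python
# A minimal sketch of the three-layer ML platform "cake".
# Layer names follow the talk; the components are hypothetical examples.
ml_platform = {
    "infrastructure_services": [
        "GPU/CPU compute",           # accelerated hardware for training and inference
        "high-throughput storage",
        "container orchestration",   # e.g. Kubernetes
    ],
    "platform_services": [
        "experiment tracking",
        "distributed training scheduler",
        "data pipeline and versioning",
    ],
    "model_services": [
        "model registry",
        "inference serving",
        "monitoring and drift detection",
    ],
}

# Each layer rests on the one below it: model services assume the platform
# layer exists, which in turn assumes the infrastructure layer.
for layer, components in ml_platform.items():
    print(layer, "->", ", ".join(components))
```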

Platform for LLMs

 

Taking a deeper dive into the three aspects of an ML platform means looking at data, model development and training, and then deployment. This approach ensures best value and leads to successful deployments. Jordan highlighted just how complicated this can be if you adopt a DIY model. With that in mind, he finished by presenting an example platform for LLMs: using the HPE ML Development Environment for model development and optimisation, adding data processing and management functions with HPE ML Data Management software, and then using third-party companies to help with model deployment and monitoring.

Challenges and opportunities of self-hosted LLMs

 

Dr Jamie Dborin, CSO of TitanML, then explained how software tooling can make it easier to self-host language models. He looked at three elements:

  • Why self-host? Looking at the reasons to deviate from OpenAI and other API-based deployment alternatives.
  • The pros and cons of a self-hosted system, including the challenges involved.
  • What does a self-hosted stack look like?

Jamie kicked off by looking at Retrieval Augmented Generation (RAG), the latest thinking when it comes to working with language models. The goal is to connect a ‘knowledge store’ to a language model, grounding it in your data in a way that you control and that makes any answer explainable. He showed what a successful retrieval model might look like, then weighed the pros and cons of building a self-hosted system, explaining why it may be more successful than building on top of an OpenAI stack.
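To make the RAG idea concrete, here is a minimal, self-contained sketch in Python. The word-overlap retrieval and templated ‘generation’ are toy stand-ins for a real embedding model and LLM; nothing here represents TitanML’s or HPE’s actual implementation.

```python
# Minimal Retrieval Augmented Generation (RAG) sketch.
# Toy retrieval: score documents by word overlap with the query.
# A real system would use vector embeddings and an actual LLM.

knowledge_store = [
    "HPE Machine Learning Development Environment supports distributed training.",
    "Self-hosted language models keep sensitive data inside your organisation.",
    "Continuous batching improves GPU utilisation during LLM inference.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents sharing the most words with the query."""
    q = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return ranked[:k]

def generate(query: str, context: list[str]) -> str:
    """Stand-in for an LLM call: a real system would send this prompt to a model."""
    prompt = f"Answer using only this context:\n{chr(10).join(context)}\n\nQuestion: {query}"
    return prompt  # an actual deployment would return the model's completion

question = "Why self-host a language model?"
print(generate(question, retrieve(question, knowledge_store)))
```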

Jamie then weighed these trade-offs in more detail. Self-hosting can come at a lower cost than a ‘one size fits all’ API model because of the reduced requirements. He also looked at how self-hosting can improve performance by breaking large tasks into smaller ones handled by specialised models, which gives you more choice about how you implement LLMs in your environment. The biggest benefit, though, is that you remain in complete control of your data: the improved privacy and security mean commercially sensitive information, Intellectual Property and your copyright don’t leave your organisation, cross any borders or get handled by a third party.

With so many benefits, Jamie took a balanced view by looking at the challenges that can make self-hosting daunting. There is a level of complexity in getting the foundations in place, as well as in adding extra components to your AI platform architecture, and all of this needs to be managed; it is often perceived as more difficult than it really is. It’s not as straightforward as simply calling an API, yet for many organisations self-hosting is emerging as the preferred option, because the far-reaching benefits around privacy, data loss prevention, scalability and increased control make it worth the effort. The Titan Takeoff Inference Server, TitanML’s flagship product, goes a long way towards making self-hosting as easy as calling an API. Echoing the structure Jordan outlined in the previous presentation, Jamie set out the three essential elements required to build a self-hosting stack:

  1. Something to process data - a data pipeline and management system
  2. Something for model training and fine tuning
  3. Something for deployment and monitoring

He looked at how HPE’s offering of a development environment and data management capability combines with TitanML’s model deployment and monitoring element. Once you have a model you want to put into production, the next challenge is deploying it, and Jamie explained in more detail what TitanML’s Titan Takeoff Inference Server can offer.
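As a rough sketch of how those three elements might wire together, here is a skeleton in Python. Every function name below is a hypothetical placeholder, not a real HPE or TitanML API.

```python
# Skeleton of the three-element self-hosting stack described in the talk.
# Each function is a hypothetical placeholder, not a vendor API.

def process_data(raw_records: list[str]) -> list[str]:
    """Element 1 - data pipeline: clean and prepare the corpus."""
    return [r.strip().lower() for r in raw_records if r.strip()]

def fine_tune(base_model: str, corpus: list[str]) -> str:
    """Element 2 - training/fine-tuning: adapt a base model to the corpus.
    Here we just return a tag; a real step would launch a training job."""
    return f"{base_model}-finetuned-on-{len(corpus)}-docs"

def deploy(model_id: str) -> str:
    """Element 3 - deployment/monitoring: expose the model behind an endpoint."""
    return f"http://localhost:8000/models/{model_id}"

corpus = process_data(["  Contract clause A ", "", "Policy document B"])
print("Serving at:", deploy(fine_tune("base-llm", corpus)))
```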

Takeoff combines many inference optimisation tricks, including quantisation and continuous batching, and has many benefits for enterprise use cases; a toy sketch of quantisation follows the list below. Jamie looked at three of these benefits:

  • Easy declarative interface - making it easy to spin up multiple models on a single device, and useful for coordinating complicated applications.
  • Optimised inference - making the best use of GPU resources. Takeoff is designed for fast inference of models with minimum hardware requirements for maximum performance.
  • Flexible hosting - making sure it seamlessly integrates with common tools for building self-hosted stacks. Takeoff is optimised for NVIDIA GPUs, CPUs and AMD GPUs. It also works with a range of GPU cloud providers.
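As promised above, here is a toy illustration of what quantisation does: shrinking weights from 32-bit floats to 8-bit integers at the cost of a small round-trip error. This is a minimal sketch of the general technique, assuming NumPy; it says nothing about how Takeoff actually implements it.

```python
# Toy int8 weight quantisation: the core idea behind one of the
# optimisation tricks mentioned above. Real inference servers use far
# more sophisticated schemes; this only shows the basic round-trip.
import numpy as np

weights = np.random.randn(4, 4).astype(np.float32)   # stand-in model weights

scale = np.abs(weights).max() / 127.0                # map the float range onto int8
quantised = np.round(weights / scale).astype(np.int8)
dequantised = quantised.astype(np.float32) * scale   # approximate reconstruction

print("max round-trip error:", np.abs(weights - dequantised).max())
print("memory: float32 =", weights.nbytes, "bytes; int8 =", quantised.nbytes, "bytes")
```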

Packaged into Titan RAG engines as a single service, it is one of the easiest ways to get started with RAG applications.

Find out more about HPE’s AI solutions here, and see how we can help you on your AI journey.

If you’d like to know more about TitanML’s offering, visit their website here.

And a final ask: we’d love to know what you would like to hear about next in this HPE Data Science Community! If there’s a topic you’d like discussed in more detail, do let us know. We’re keen to tailor this community to ensure we’re being as informative and useful as possible. Get in touch.


Matt Armstrong-Barnes
Hewlett Packard Enterprise

twitter.com/HPE_UKI
linkedin.com/company/hewlett-packard-enterprise
hpe.com/uk

About the Author

Mattab

Matt is Chief Technologist for Artificial Intelligence in the UK&I and has a passion for helping customers understand how AI can be part of a wider digital transformation initiative.