Performance-enhanced deep learning models for the edge
To ensure your AI solution meets accuracy and performance requirements, you have to consider multiple factors when deploying production AI models. Read what HPE's Kenneth Leach and Deci's Sefi Kligler have to say about how HPE is working with partners like Deci to provide the expertise, people, and technology to achieve a performance-enhanced solution and accelerate business outcomes with AI.
What matters when deploying AI at the edge?
We’ve all heard the stories: AI projects get stuck at the “last mile” and never make it into production, and the reasons why are not always discussed. Although many factors contribute to the success or failure of an AI solution, deep learning model accuracy is broadly recognized as an important success criterion. However, accuracy is not the only factor that matters. After they are initially trained for the intended application, deep learning models often have compute requirements that conflict with other requirements of the production solution, such as cost-effectiveness and power efficiency. For example, many video-based computer vision models must deliver fast inference results on high-temporal-resolution data.
Though trained to the desired accuracy, deep learning models developed offline in a data center may not be able to provide insights fast enough on the smaller, low-power systems available at the edge. In this blog, I’ll discuss the AI inference metrics that are often critical to transforming an AI project into a successful, performance-enhanced AI solution. I’ll also show how HPE, Deci, and Intel partnered to solve a real production problem.
Reducing latency, resources, and complexity
Latency, resource requirements, and deployment complexity can all hinder putting a trained AI model into production, even one that delivers the required accuracy for the problem.
Let’s look at an example using object detection. Object detection models recognize and classify objects in images that match a trained set of categories, for example, people, cats, dogs, laptops, bottles, computer components, and traffic signs. State-of-the-art (SOTA) models such as EfficientDet, YOLOv5, RetinaNet, DETR, and SSD can achieve high levels of accuracy, which often contributes to a successful AI solution. However, poor runtime performance, such as high latency or insufficient throughput, might prevent successful deployment.
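To make this concrete, here is a minimal PyTorch/torchvision sketch of running object detection on a single frame. The pretrained RetinaNet stands in for any of the SOTA models above, and the input and confidence threshold are illustrative.

```python
# Minimal object-detection sketch: a pretrained RetinaNet stands in for
# any of the SOTA models above; the input and threshold are illustrative.
import torch
import torchvision

model = torchvision.models.detection.retinanet_resnet50_fpn(pretrained=True)
model.eval()  # inference mode

# One 640x480 RGB frame with values in [0, 1] (replace with a real image).
image = torch.rand(3, 480, 640)

with torch.no_grad():
    predictions = model([image])[0]  # the model accepts a list of images

# Keep detections above a confidence threshold.
keep = predictions["scores"] > 0.5
print(predictions["labels"][keep], predictions["boxes"][keep])
```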
In fact, HPE and Deci recently worked with a computer vision solution provider that was struggling to meet exactly these performance requirements. The goal was a production object detection solution based on YOLOv5 to identify objects in city streets, but the model proved incapable of meeting the required video frame rate when deployed to the target inference hardware. In this case, the customer was deploying the solution on edge-specific hardware: ruggedized HPE Edgeline servers with power-efficient Intel® Xeon® processors.
While the YOLOv5 model is considered by many to be SOTA in terms of accuracy and latency, it was built to run on GPUs. HPE partnered with Deci to optimize the customer’s model using Deci’s Automated Neural Architecture Construction technology (AutoNAC™), which ultimately produced a performance-enhanced AI solution that exceeded all production requirements. AutoNAC™ is designed to solve exactly this problem, enhancing deep learning model performance on a range of HPE platforms.
Here's how it works. The challenge was to improve the model’s performance on Intel® Xeon® Scalable Processors, which Deci accomplished while adhering to strict memory and accuracy constraints. One way to increase throughput is to decrease inference latency, the time it takes for the model to provide a result for a given image. Deci’s AutoNAC™ engine delivers algorithm-level model optimizations for any target hardware, based on a proprietary neural architecture search (NAS) engine. As input, it needs a baseline model, the dataset used to train that model, and access to the target inference hardware platform to monitor model performance. It then identifies and removes bottlenecks in the model architecture and redesigns a hardware-optimized neural network with higher accuracy, higher throughput, lower latency, a smaller model size, or a smaller memory footprint than the original.
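Both latency and throughput are easy to measure before committing to deployment hardware. Here is a rough timing sketch in plain PyTorch; the model and frame size are placeholders, and meaningful numbers require running on the actual target system.

```python
# Rough latency/throughput measurement for a candidate detection model.
# The model and frame size are placeholders; run on the target hardware.
import time
import torch
import torchvision

model = torchvision.models.detection.retinanet_resnet50_fpn(pretrained=True)
model.eval()
frame = torch.rand(3, 640, 640)  # placeholder video frame (C, H, W)

with torch.no_grad():
    model([frame])  # warm-up pass so one-time setup cost isn't timed

    runs = 20
    start = time.perf_counter()
    for _ in range(runs):
        model([frame])
    elapsed = time.perf_counter() - start

latency_ms = elapsed / runs * 1000
print(f"mean latency: {latency_ms:.1f} ms -> ~{1000 / latency_ms:.1f} FPS")
```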
The table below, AutoNAC Optimized Performance Results, shows initial test results for the unoptimized TensorFlow-based YOLOv5 model: latency of 900 ms on HPE Edgeline and mean average precision (mAP) of 0.63. The AutoNAC™ solution optimized the runtime performance of the model and reduced the latency roughly 12-fold, to 70 ms. Using Deci's SuperGradients open-source training library, the optimized model's accuracy (mAP) also improved by 25% compared to the original. Once optimization was complete, the model was integrated into the image processing container with just a few lines of code using Infery, Deci's runtime engine. Final testing verified that the optimized model, when deployed on the target hardware, met the frames-per-second requirement without optimizing any other part of the inference pipeline. Now that is enhanced performance!
AutoNAC Optimized Performance Results

| Model | Latency (HPE Edgeline) | mAP |
| --- | --- | --- |
| Baseline YOLOv5 (TensorFlow) | 900 ms | 0.63 |
| AutoNAC™-optimized | 70 ms | ~0.79 (+25%) |
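For reference, the integration step looks something like the sketch below. It assumes Infery's documented load-and-predict pattern; the file name, argument values, and input shape here are illustrative, so check Deci's documentation for the exact signature.

```python
# Hedged sketch of the "few lines of code" integration with Infery,
# Deci's runtime engine. Argument names follow Infery's documented
# load/predict pattern but are assumptions here; see Deci's docs.
import numpy as np
import infery

model = infery.load(model_path="optimized_yolo.onnx",
                    framework_type="onnx",
                    inference_hardware="cpu")

frame = np.random.rand(1, 3, 640, 640).astype(np.float32)  # preprocessed frame
detections = model.predict(frame)
```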
Considerations for model deployment
Understanding accuracy and performance requirements throughout the AI solution development cycle is critical to delivering a timely, performance-enhanced solution. Accuracy, runtime performance (latency and throughput), and the deployment environment are all important factors in a successful AI solution, and all of them should be evaluated before models are deployed.
For example, no matter how accurate a model is, if it doesn’t meet inference performance requirements, it might not succeed in production. Likewise, no matter how quickly a model can generate inference results, if it is not accurate enough, it will not meet the success criteria of the overall solution. Why do these factors matter when deploying AI into production environments, and what tools can be used to enhance performance?
- Model design, accuracy, and performance
Deep learning models are growing larger and more complex, driven in part by the need to reach higher accuracy and enable more advanced use cases. However, as model sizes increase, it becomes more challenging to deploy these advanced models onto edge devices; a quick size check like the one below can flag this early. An efficient model design process that considers the inference environment and deployment hardware early on can yield a much smaller model that better meets the resource constraints of edge devices while still achieving the desired accuracy and performance. Once a model is deployed into production, its accuracy can change over time. Understanding how these factors affect accuracy will help your solution continue to meet requirements throughout its lifecycle; Deci's SuperGradients open-source training library can be used to retrain models and maintain accuracy over the lifetime of a solution.
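As a simple illustration, this sketch counts parameters and estimates the weight footprint of a detection model before targeting an edge device; the model choice is arbitrary.

```python
# Quick size check for a candidate model before targeting an edge device.
import torchvision

model = torchvision.models.detection.retinanet_resnet50_fpn(pretrained=False)
n_params = sum(p.numel() for p in model.parameters())
size_mb = n_params * 4 / 1024**2  # float32 weights, 4 bytes each
print(f"{n_params / 1e6:.1f}M parameters = ~{size_mb:.0f} MB of weights")
```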
- Deployment hardware and resource requirements
In many cases, AI models are trained and tested on large high performance computing (HPC) clusters with significant compute capacity and accelerated processors, such as GPUs, that might not be available in the edge deployment environment. For example, edge environments may require less compute-intensive processors because of constraints on space, cost, power, or cooling. Choosing hardware designed for performance in edge environments is key to successful deployment. HPE Edgeline and HPE ProLiant Gen10 Plus platforms provide an open, standards-based, high-performance, low-latency system for the most demanding use cases, powered by third-generation Intel® Xeon® Scalable Processors.
- Time to decision
Ensuring that a production model meets latency requirements is also critical to a successful AI solution: a late prediction may be as useless as an incorrect one. To enable real-time decisions close to where data is generated, edge deployments often have strict latency requirements, which translate directly into a per-frame time budget (see the quick check below). To successfully put your model into production, you may have to improve its inference latency through optimization and retraining. Deci’s AutoNAC™ engine delivers such optimizations for any target hardware.
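Framing the requirement as a per-frame budget makes the pass/fail criterion explicit. In this sketch the 10 FPS requirement is an assumption for illustration; the 70 ms latency matches the optimized result discussed above.

```python
# Translate a frame-rate requirement into a per-frame latency budget.
# The 10 FPS requirement is an assumption for illustration; 70 ms is the
# optimized latency discussed above.
required_fps = 10
budget_ms = 1000 / required_fps   # 100 ms available per frame
measured_latency_ms = 70          # measured on the target hardware

if measured_latency_ms <= budget_ms:
    print(f"OK: {measured_latency_ms} ms fits the {budget_ms:.0f} ms budget")
else:
    print("Too slow: optimize the model or revisit the hardware choice")
```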
- Deployment environment
AI model deployment can be complex and hard to manage over time because of version control, deployment automation, and continuous integration/continuous deployment (CI/CD) development cycles. Effective visualization, monitoring, and CI capabilities are a requirement for MLOps pipelines. These pipelines allow organizations to catch problems before they become production issues, maintain confidence in model behavior, correct for model drift, and deliver updated models seamlessly over time (a minimal drift check is sketched below). HPE Ezmeral ML Ops provides pre-packaged tools to operationalize AI workflows at every stage of the AI lifecycle, from pilot to production, giving you DevOps-like speed and agility.
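As one small example of such monitoring, the sketch below tracks a rolling accuracy metric and flags the model for retraining when it drifts below a threshold; the window size and 0.55 threshold are illustrative.

```python
# Minimal drift-monitoring sketch: watch a rolling mAP from periodic spot
# checks and flag the model for retraining when it falls below a threshold.
# The window size and 0.55 threshold are illustrative.
from collections import deque

WINDOW, THRESHOLD = 50, 0.55
recent_map = deque(maxlen=WINDOW)

def record_and_check(map_score: float) -> bool:
    """Record a new mAP measurement; return True if retraining is needed."""
    recent_map.append(map_score)
    rolling = sum(recent_map) / len(recent_map)
    return len(recent_map) == WINDOW and rolling < THRESHOLD

if record_and_check(0.52):
    print("Model drift detected: trigger the retraining pipeline")
```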
Ready to move forward?
Multiple factors should be considered when deploying production AI models to ensure the AI solution meets accuracy and performance requirements. HPE provides the expertise, people, technology, and partners to achieve a performance-enhanced solution and accelerate business outcomes with AI.
Find more information in this solution brief.
Questions? Please contact HPE at AIAdvance@hpe.com.
Meet our Tech Insights Experts bloggers
Kenneth Leach, AI Technologist & Solution Architect, HPE. Kenneth has worked within HPE server and systems engineering teams since 2006, specializing in scalable systems, HPC, edge computing, IoT, and AI solutions. He has created numerous solutions in emerging technologies during his time at HPE. He holds a B.A. in Computer Science from The University of Texas at Austin.
Sefi Kligler, VP of AI, Deci. Sefi holds an M.Sc. in Math and Computer Science from the Weizmann Institute of Science. His thesis focused on super-resolution and image degradation estimation using deep learning. He has also worked in the field of multiple-view geometry, developing a helmet for aviation.
Hewlett Packard Enterprise
twitter.com/HPE_AI
linkedin.com/showcase/hpe-ai/
hpe.com/us/en/solutions/artificial-intelligence.html