Servers & Systems: The Right Compute

Intelligent data pipeline implementation for predictive maintenance

Keep up with the pace of data growth, get insights faster, and deliver more effective business outcomes. Find out how predictive analytics and automated cloud-native data pipelines from HPE can help.

Predictive analytics is a critical application area for many organizations. With it, a business can estimate the lifespan, wear and tear, and performance of a component and predict its remaining useful life. If you can accurately predict when a component (e.g., an engine or a piece of machinery) will fail, you can schedule maintenance downtime in advance to avoid sudden failures, reduce maintenance costs, plan spare-parts inventory more effectively, and systematically optimize business decision making.

In the past, organizations relied on schedule-based maintenance. This approach focuses on reducing failures, but it often results in replacing machines that could have remained in service longer, because the schedule assumes worst-case lifetimes for critical pieces of machinery.

Predicting a failure event is a complex challenge that is rarely solved with a single software component, especially for near-real-time decision making. It requires an end-to-end data pipeline solution that analyzes data holistically and supports informed decisions.

A modular end-to-end data pipeline framework is essential for supporting predictive analytics and attaining the best business outcomes. Data and analytics platforms are evolving with faster data ingestion, advanced data processing techniques, and modernized analytics tools with real-time business decision capabilities.

But developing an efficient, easily maintainable data pipeline poses several challenges, including:

  • Building a decoupled microservices-based data pipeline framework with the right technologies and tools
  • Accommodating the requirements of heterogeneous workloads such as batch, near-real-time, and interactive analytics on a single framework
  • Gluing together functional modules in the data pipeline to handle heterogeneous data sources and rapidly growing data volumes to get insights
  • Choosing and configuring the right set of hardware resources to attain the best performance and scale
  • Building a robust, full-stack pipeline with flexibility and modularity for each tier such as data ingestion, data processing, and data store
  • Avoiding vendor lock-in

Organizations find implementing a data pipeline complex and error-prone. It's difficult to select technologies and tools that are proven to work together, along with the right set of hardware configurations.

How HPE solves the data pipeline challenge

HPE addresses all these challenges. We've built an end-to-end, microservices-based data pipeline framework and tested it for a predictive analytics use case. The data pipeline architecture for the predictive analytics solution consists of several layers, including data ingestion, data processing, data store, data lake, and data visualization, along with AI/ML.

Our framework is fully automated with Helm charts and designed with flexibility and agility in mind. Each functional layer is a module that can be used individually and scaled out or in independently; each layer can also communicate and collaborate with the other layers to perform integrated functions seamlessly. The following diagram depicts the layers required to build a data pipeline for a predictive analytics solution, from data ingestion through data storage to data visualization.
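The blog doesn't publish the framework's code, but the decoupling idea behind the layers can be sketched in a few lines of Python: each layer is an independent stage with a common interface, so any one of them can be swapped out or removed without touching the others. The stage names and record type below are illustrative assumptions, not part of the HPE framework.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Hypothetical record type: one sensor reading from a monitored machine.
@dataclass
class Reading:
    machine_id: str
    temperature_c: float

# Each layer is an independent, swappable function: Iterable -> Iterable.
Stage = Callable[[Iterable[Reading]], Iterable[Reading]]

def ingest(source: Iterable[Reading]) -> Iterable[Reading]:
    # Ingestion layer: a real pipeline would read from Kafka, files, etc.
    yield from source

def clean(readings: Iterable[Reading]) -> Iterable[Reading]:
    # Processing layer: drop obviously invalid sensor values.
    for r in readings:
        if -50.0 < r.temperature_c < 200.0:
            yield r

def run_pipeline(source: Iterable[Reading], stages: List[Stage]) -> List[Reading]:
    data: Iterable[Reading] = source
    for stage in stages:  # stages compose; any one can be replaced independently
        data = stage(data)
    return list(data)

raw = [Reading("m1", 72.5), Reading("m1", 999.0), Reading("m2", 65.1)]
result = run_pipeline(raw, [ingest, clean])
print(len(result))  # the invalid 999.0 reading has been filtered out
```

Because every stage shares one interface, adding a persistence or visualization layer is just appending another function to the list, which mirrors the plug-and-play modularity the framework claims.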


Key features:

  • Simplified, scalable, flexible infrastructure for ease of management and optimal performance
  • Demonstrated intelligence of the data pipeline workflow and deployment through AI/ML: ingest historical data from the enterprise data lake, perform exploratory data analysis and preprocess the data, explore algorithms, train and test models, compare them, select the most appropriate model, and make it deployment-ready
  • Seamless integration to build and deploy the model into production for real-time prediction
  • Persist the model prediction data into the database for future model performance evaluation and audit purposes
  • Use of multiple nodes during training and testing of the ML model in the building phase to take advantage of the hardware resources for faster results
  • No vendor lock-in at any layer in the data pipeline framework
  • Can be deployed as a whole, or individual modules can be plugged into an existing layer through integration
  • Supports heterogeneous and multi-tenant workloads
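The model-building loop described above (train and test candidate models, compare them, select the best one for deployment) can be sketched as follows. The blog does not name the algorithms or features used; the synthetic sensor data, candidate models, and remaining-useful-life target here are illustrative assumptions only.

```python
# A minimal sketch of the explore/train/test/compare/select workflow on
# synthetic data, using scikit-learn. All features, models, and the RUL
# target are hypothetical stand-ins, not HPE's actual pipeline.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 500
# Hypothetical features: operating hours, vibration level, temperature (C).
X = rng.uniform([0, 0.1, 40], [10_000, 5.0, 120], size=(n, 3))
# Synthetic remaining-useful-life target: degrades with hours and vibration.
y = 12_000 - X[:, 0] - 800 * X[:, 1] + rng.normal(0, 200, n)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Explore candidate algorithms, then train and test each one.
candidates = {
    "linear": LinearRegression(),
    "forest": RandomForestRegressor(n_estimators=50, random_state=0),
}
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = mean_absolute_error(y_test, model.predict(X_test))

# Compare the models and select the one with the lowest error; in production
# the winner would be serialized (e.g., with joblib) and deployed for
# real-time prediction, with its outputs persisted for later auditing.
best = min(scores, key=scores.get)
print(best, round(scores[best], 1))
```

Persisting each deployed model's predictions, as the feature list notes, is what later makes it possible to evaluate the model against actual failures and trigger retraining.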

Our framework is designed with the principles of modularity and flexibility at every stage of the data pipeline. The platform supports data ingestion, data processing, data persistence, data visualization, and a Hadoop data store, along with ML development and deployment for implementing a predictive analytics solution. It helps organizations rapidly design and deploy a data pipeline, and it gives customers the option to pick and choose which modules to deploy based on their own environment, data strategy, and business needs. It is fully automated with Helm charts and integrates with every layer of the data pipeline. The whole pipeline can be deployed in minutes with security, data persistence, and node-affinity features.

Interested in learning more? Read my colleague Jing Li's blog "Intelligent data pipeline implementation builds business capabilities" for the business perspective behind the technology. And then contact us for more information!

Meet Compute Experts blogger Bhuvaneshwari Guddad, Solution Architect

Bhuvana works in the HPE GreenLake Lighthouse and Enterprise Solutions organization and is responsible for creating solution reference architectures and configurations for both bare-metal and containerized big data and analytics workloads.




Compute Experts
Hewlett Packard Enterprise
