HPE Ezmeral: Uncut

Fraud Analytics on HPE Elastic Platform for Analytics (EPA)

We live in a digital world that has created both opportunities and threats. Banking, financial services, and insurance (BFSI) organizations are on the eve of a major transformation driven by the latest technologies. At the same time, cybercriminals are becoming more sophisticated and are leveraging technology for their own benefit. With fraud becoming more prevalent across the BFSI segment, organizations are finding it more challenging to implement efficient systems for detecting and preventing fraud.

Traditionally, organizations relied on conventional methods such as rule-based systems (manual processes) and statistical methods to detect patterns and prevent fraudulent incidents. However, these systems have their limitations and are not powerful enough in today’s technologically advanced world.

Nowadays, we harness big data and apply analytics to build a powerful model that predicts fraudulent incidents in real time. This approach enhances fraud detection capabilities and gives a new dimension to fraud detection techniques. Building such a solution requires a powerful infrastructure with the right mix of compute and storage. It must scale seamlessly to handle massive amounts of data, process it, and predict outcomes in real time.

Some challenges customers face while building a fraud analytics solution include the following:

  • What infrastructure configuration is flexible, scalable, easy to manage, and performance optimized?
  • What Hadoop ecosystem components do customers need to build an end-to-end data pipeline that supports seamless integration?
  • How do customers ensure their solution supports a heterogeneous platform?
  • Is data persistence supported for both newly ingested data and the predicted data?
  • Which machine learning model would be ideal?

HPE’s solution addresses all of the above challenges and helps organizations detect and prevent fraud. It consists of an optimized infrastructure that is scalable, distributed, and flexible. It can capture data and process it to predict events.

The HPE solution is based on HPE Elastic Platform for Big Data Analytics (EPA), HPE Synergy, and HPE Ezmeral Data Fabric software.

HPE Elastic Platform for Big Data Analytics (EPA) is designed with disaggregated compute and storage blocks, delivering flexibility and scalability to support big data and analytics workloads. HPE EPA is composed of modular infrastructure building blocks, each of them optimized for storage density, heavy computation, networking, or standard compute activities. By combining the right mix of building blocks, customers can build infrastructures optimized for their workloads.

HPE Composable Infrastructure is a compute tier in the HPE EPA solution that brings several advantages. Its multi-tenancy capability provides easy resource management to mix and match a wide range of workloads and applications. Heterogeneous deployment, reduced network complexity, fluid resource pools, and software-defined intelligence allow for dynamic resource allocation and reorganization in minutes.

HPE Ezmeral Data Fabric software ingests, stores, and manages data on a vast scale to make it readily available to new computation techniques and tools. The HPE Ezmeral Data Fabric is the industry-leading data platform for AI and analytics.

Below is the high-level solution architecture to build an end-to-end data pipeline for effective fraud detection and prevention:

Figure 1. High-level Solution Architecture

The diagram depicts the end-to-end data pipeline to implement a fraud analytics solution with various stages as described below:

  • Data ingestion from diverse data sources using a publish-subscribe model (Kafka API) and the HPE Ezmeral Data Fabric event store.
  • Data transformation/processing using Spark, which takes the raw data, extracts the features, processes them, and outputs the prediction to the persistent data store.
  • Data persistence using the HPE Ezmeral Data Fabric database or HBase, a distributed, scalable big data store that provides fast, real-time read and write access. It persists the model prediction data, which can then be used along with the historic data already residing in the data store to support insightful event prediction.
  • The enriched data can be visualized using dashboards or visualization applications.
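
The stages above can be sketched in plain Python to make the data flow concrete. This is only a toy illustration, not the HPE implementation: an in-memory queue stands in for the Kafka-style event stream, ordinary functions stand in for the Spark processing step, and a dictionary stands in for the persistent data store. The `Transaction` fields, the feature names, the hand-picked scoring rule, and the `0.5` threshold are all hypothetical; a real deployment would use a trained model.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Transaction:
    """Hypothetical raw event, as it might arrive from the event stream."""
    tx_id: str
    amount: float
    country: str
    home_country: str


def extract_features(tx):
    """Processing stage: turn a raw transaction into numeric features."""
    return {
        "amount": tx.amount,
        "is_foreign": 1.0 if tx.country != tx.home_country else 0.0,
    }


def score(features):
    """Placeholder scorer with hand-picked weights (NOT a trained model)."""
    result = 0.0
    if features["amount"] > 5000:
        result += 0.6
    if features["is_foreign"]:
        result += 0.3
    return result


def run_pipeline(stream, store, threshold=0.5):
    """Drain the stream, score each event, and persist features plus prediction."""
    while stream:
        tx = stream.popleft()              # ingestion: consume the next event
        features = extract_features(tx)    # processing: feature extraction
        fraud_score = score(features)      # prediction
        store[tx.tx_id] = {                # persistence: keyed by transaction id
            **features,
            "score": fraud_score,
            "flagged": fraud_score >= threshold,
        }
    return store


if __name__ == "__main__":
    events = deque([
        Transaction("t1", 120.0, "US", "US"),
        Transaction("t2", 9000.0, "BR", "US"),
    ])
    for tx_id, row in run_pipeline(events, {}).items():
        print(tx_id, "flagged" if row["flagged"] else "ok")
```

Keeping both the extracted features and the prediction in the store mirrors the point made above: the persisted predictions can later be read together with historic data, for example by a dashboard.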

The key benefits of the solution are as follows:

  • Helps to build a strong, end-to-end data pipeline (workflows) to enable seamless flow of data and also a pipeline for machine learning.
  • Supports seamless integration with the Hadoop ecosystem components to build and deploy the model into production for real-time prediction. Examples include: data ingestion using Kafka, streaming using Spark, data persistence using HPE Ezmeral Data Fabric Database and HPE Ezmeral Data Fabric, and data visualization using Zeppelin.
  • Persists both newly ingested data and predicted data.

Summary and key outcomes:

The HPE Elastic Platform for Analytics (EPA) with HPE Ezmeral Data Fabric is designed to be a modular infrastructure foundation that can be scaled in a building block fashion to support fraud analytics solutions and requirements. This platform supports data ingestion, data processing, data persistence, data visualization, and a Hadoop data store, along with machine learning development and deployment. The key outcomes of the HPE solution, based on the configuration and testing performed in the lab using a publicly available dataset, are as follows:

  • Performance scaled linearly while training the model on multiple nodes as the load increased from 250K records to 650K records.
  • Training and validation of more than 250K records completed in seconds.
  • Streaming data performance also scaled linearly as the data size varied from 600 million messages to 1,200 million messages.

Please note: These numbers depend on the configuration, testing, and dataset.
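
One common way to check a linear-scaling claim like the one above is to compare throughput (records per second) at two load levels: if processing time grows in proportion to the load, throughput stays constant and the ratio is close to 1.0. The sketch below assumes that reading of "linear". The record counts are taken from the outcomes above, but the timings are hypothetical placeholders, not measurements from the lab tests, and the function name is made up for illustration.

```python
def scaling_efficiency(base_load, base_seconds, new_load, new_seconds):
    """Ratio of throughput at the new load to throughput at the base load.

    A value near 1.0 means time grew in proportion to load (linear scaling);
    a value well below 1.0 means the system slowed down as load increased.
    """
    base_throughput = base_load / base_seconds
    new_throughput = new_load / new_seconds
    return new_throughput / base_throughput


if __name__ == "__main__":
    # Loads from the text; the 10 s and 26 s timings are HYPOTHETICAL placeholders.
    ratio = scaling_efficiency(250_000, 10.0, 650_000, 26.0)
    print(f"scaling efficiency: {ratio:.2f}")
```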

For more details on the solution, please refer to the following: HPE Reference Architecture for Fraud Analytics on HPE Elastic Platform for Analytics

About the author:

Bhuvaneshwari Guddad

Bhuvana is part of the Element Platform Organization and is responsible for creating Solution Reference Architecture/Configuration for big data and analytics.




Hewlett Packard Enterprise

