Ellen_Friedman

A Boon for Data Engineers: Object Store with HPE Ezmeral Data Fabric

Object storage is very popular, and for good reason. But not all object storage options are equally useful. Some offer surprising advantages.

To understand the differences, consider the impact of object storage from the point of view of one major user group: data engineers. What challenges do they face with large-scale systems, and how does object storage help?

Why object storage makes life easier for data engineers

It’s natural that data engineers are drawn to object storage. The use of objects (easily accessible, immutable blobs of data) fits their needs in several ways:

The biggest benefit for data engineers is simple data access and ease of sharing. Objects are accessed using standard web protocols (HTTP, HTTPS), eliminating the need to install drivers or mount file systems. Each object has a URL, making it easy to access both locally and remotely, which simplifies data sharing and enables data reuse across teams and use cases.
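As a rough illustration of URL-based access, the sketch below builds a path-style URL for an object and shows how a plain HTTP GET could read it using only the Python standard library. The endpoint, bucket, and key are hypothetical, and the sketch assumes the object is readable without credentials; private objects on an S3-compatible store normally require signed requests, which this example does not cover.

```python
from urllib.parse import quote
from urllib.request import urlopen


def object_url(endpoint: str, bucket: str, key: str) -> str:
    """Build a path-style URL of the form <endpoint>/<bucket>/<key>."""
    # Percent-encode unsafe characters, but keep "/" inside the key so it
    # still reads like a path.
    return f"{endpoint.rstrip('/')}/{quote(bucket, safe='')}/{quote(key, safe='/')}"


def fetch_object(url: str, timeout: float = 10.0) -> bytes:
    """Read an object with an ordinary HTTP GET (no driver, no mount)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


# Hypothetical endpoint, bucket, and key, used only for illustration.
url = object_url(
    "https://objectstore.example.com:9000",
    "sensor-data",
    "2024/site a/readings.csv",
)
print(url)
# fetch_object(url) would download the bytes if such an endpoint existed.
```

Because the URL is just a string, it can be passed between teams or tools as-is, which is the sharing benefit described above.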

Object storage is also highly scalable, partly because objects are immutable. Immutability makes it easier to store data across multiple machines, which is key to scalability, a critically important capability in modern systems.
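One way to see why immutability helps is a toy placement function: if an object never changes after it is written, the machines holding its copies can be chosen once from the object's key, and any copy can serve a read without coordinating updates. The sketch below is a generic illustration of that idea, not a description of how HPE Ezmeral Data Fabric or any particular object store places data; the node names and the `replica_nodes` helper are hypothetical.

```python
import hashlib


def replica_nodes(key: str, nodes: list[str], copies: int = 2) -> list[str]:
    """Pick which nodes hold copies of an immutable object, from its key alone."""
    if not nodes:
        raise ValueError("at least one node is required")
    # Hash the key to get a stable starting position in the node list.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    count = min(copies, len(nodes))
    # Take consecutive nodes (wrapping around) so the copies land on
    # distinct machines.
    return [nodes[(start + i) % len(nodes)] for i in range(count)]


# Hypothetical cluster of four machines.
NODES = ["node-a", "node-b", "node-c", "node-d"]
placement = replica_nodes("logs/2024-05-01.json", NODES)
print(placement)
```

The same key always maps to the same nodes, so a reader can locate a copy without asking where the latest version lives. With mutable data, every copy would also have to agree on the order of updates.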

But object storage is not the best fit for every use case. Data engineers also need access to files, either to reach higher performance levels or because many applications require file-oriented APIs or mutable data storage. File storage can also support conventional databases, which object storage cannot.

Surprising flexibility with HPE Ezmeral Data Fabric

HPE Ezmeral Data Fabric is an integrated data and analytics platform that simplifies data access patterns, data acquisition, processing, and surfacing of data to end users, applications, monitoring systems, or alerting dashboards. It does this through a series of standard APIs built into the solution that allow users, apps, and tools to access both files and objects from a single platform.

HPE Ezmeral Data Fabric combines an S3-native object store, files, event streams, and a database into a single platform that spans multiple physical clusters and multiple locations. The built-in object storage offers key advantages over other object-based solutions. Data fabric’s high-performance object store is optimized for performance and storage efficiency across all object sizes.

HPE Ezmeral Data Fabric’s multi-modal data support provides a high degree of data access flexibility for users, apps, and tools.  This advantage is illustrated in Figure 1.  

Figure 1. Simultaneous access to file and object data from a single platform

Applications written to access data fabric objects via an S3-compatible API can also access data fabric files without having to modify the application. This unusual capability of HPE Ezmeral Data Fabric enables not only data engineers but also data scientists and analysts to use the data they need where they need it, taking advantage of new opportunities as they arise.
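To make this concrete, here is a minimal sketch of application code written against an S3-style API. It assumes the third-party boto3 library for the real client; the endpoint, credentials, bucket, and key names are hypothetical, and this article does not describe how data fabric files are mapped to buckets and keys, so that mapping is left to the product documentation. A small in-memory stand-in client is included so the example runs without a cluster.

```python
import io


def make_s3_client(endpoint_url: str, access_key: str, secret_key: str):
    """Create a client for an S3-compatible endpoint (assumes boto3 is installed)."""
    import boto3  # third-party; imported lazily so the demo below runs without it

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def read_item(s3_client, bucket: str, key: str) -> bytes:
    """Read one item through the S3 GetObject call and return its bytes."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read()


class FakeS3Client:
    """In-memory stand-in that mimics only get_object, for offline demonstration."""

    def __init__(self, items: dict):
        self._items = items

    def get_object(self, Bucket: str, Key: str) -> dict:
        return {"Body": io.BytesIO(self._items[(Bucket, Key)])}


# Hypothetical bucket and key; a real run would use make_s3_client(...) instead.
client = FakeS3Client({("analytics", "reports/q1.csv"): b"region,total\nwest,42\n"})
data = read_item(client, "analytics", "reports/q1.csv")
print(data.decode("utf-8"))
```

The point of the sketch is that `read_item` depends only on the S3-style call, so the application code stays the same regardless of how the data behind a given bucket and key is stored.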

And keep in mind that objects are not always large. Another advantage of data fabric’s object storage is that it can handle many small objects, a situation that can be a performance problem with other systems. Data fabric handles small and large objects with reliability and excellent performance.

Getting value from data gets easier

Broad support for industry-standard APIs and a built-in, certified ecosystem of open-source tools enable data engineers to use the tools they prefer to process and get insights from data no matter where it is located. This support also enables collaboration, data sharing, and data reuse by global teams and across multiple use cases.

For details about the key advantages that HPE Ezmeral Data Fabric’s object storage offers over other object-based solutions, read this technical paper.

Hewlett Packard Enterprise

twitter.com/HPE_Ezmeral
linkedin.com/showcase/hpe-ezmeral
hpe.com/software




About the Author

Ellen_Friedman

Ellen Friedman is a principal technologist at HPE focused on large-scale data analytics and machine learning. Before her current role at HPE, Ellen worked for seven years at MapR Technologies, where she was a committer for the Apache Drill and Apache Mahout open source projects. She is a co-author of multiple books published by O’Reilly Media, including AI & Analytics in Production, Machine Learning Logistics, and the Practical Machine Learning series.