Handle massive IoT data with HPE BlueData and Qumulo
You can create a unified and secured environment for all of your AI workloads.
Today's Big Data and AI workloads, especially in the IoT world, demand solutions that provide real-time streaming (i.e., hardware- and software-in-the-loop), handle complex measurement data, scale out to petabytes, and offer access for multiple analytics teams.
Supporting different AI/ML and analytics workloads, while also meeting the need of data analysts to securely access all the data using their preferred AI tools, is a challenging task for any IT manager or team. IT organizations often find themselves addressing these needs with multiple clusters, or a combination of compute and storage resources. Unfortunately, such environments are complicated to operate, create data silos and replication, don't facilitate hardware and software optimization, and are intrinsically insecure.
That's the kind of stuff that keeps an IT manager awake at night.
I do have good news to share, though. Drawing on HPE's experience with complex AI and analytics environments, we have integrated HPE BlueData EPIC and the Qumulo file system to provide customers like you with a simple, flexible, and comprehensive AI environment.
We built out the following lab scenario based on a real customer's IoT process:
- Qumulo FS provided a software-defined distributed file system designed specifically to support the enterprise. IoT data typically arrives in a specialized measurement format (e.g., MDF4) and is then converted into a columnar format (Parquet/Avro) to optimize and speed up queries, which also allows efficient data compression and encoding schemes (a conversion sketch follows this list).
- HPE BlueData EPIC provided an enterprise-grade governance and security platform to manage multi-tenant environments. It allowed multiple analytics teams to access all the data they needed and to build advanced analytics models on Big Data scale-out environments such as Hadoop and Spark, using their preferred analytics tools.
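As a minimal sketch of the file-format conversion mentioned above, the snippet below converts an MDF4 measurement file to Parquet. It assumes the open-source asammdf and pandas/pyarrow packages; the file names are placeholders, and the customer's actual conversion pipeline may differ.

```python
# Minimal sketch: convert an MDF4 measurement file to Parquet for faster queries.
# Assumes asammdf plus pandas/pyarrow; file names are placeholders only.
from asammdf import MDF

def mdf4_to_parquet(mdf4_path: str, parquet_path: str) -> None:
    """Load an MDF4 file, flatten its channels into a DataFrame, and write Parquet."""
    mdf = MDF(mdf4_path)                                # parse the measurement file
    df = mdf.to_dataframe()                             # one column per recorded channel
    df.to_parquet(parquet_path, compression="snappy")   # columnar output with compression

if __name__ == "__main__":
    # Hypothetical file names for illustration only.
    mdf4_to_parquet("drive_log_0001.mf4", "drive_log_0001.parquet")
```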
HPE BlueData and Qumulo combine to create an enterprise-level AI environment
The goal of the tested solution was to leverage HPE BlueData EPIC software to create a single namespace containing multiple environments that could access data from a Qumulo cluster, and then to analyze the performance of the integration.
- Qumulo's software-defined distributed file system is a scale-across solution capable of seamlessly managing billions of files, large or small, across an enterprise environment. The file system offered real-time visibility, scale, and control of data without performance degradation, while also providing centralized access to files.
- HPE BlueData Software provided a platform for distributed AI, machine learning, and analytics on containers. HPE BlueData EPIC Software provided multitenancy and data isolation to ensure logical separation between each project, group, or department within the organization. By creating a single environment that not only contained the massive datasets but also let multiple teams use their analytical tool of choice on that data, the software avoided cluster sprawl and eliminated redundant copies of test data.
- HPE BlueData EPIC integrates with Qumulo file systems via EPIC DataTap, which implements a high-performance connection to remote data storage systems over NFS and HDFS. This allows Hadoop and AI/ML applications to run unmodified against data stored in remote NFS and HDFS systems with no loss of performance. All data can be managed as a single pool of storage (single namenode), so no data movement is needed; a short access sketch follows this list.
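To illustrate, here is a minimal PySpark sketch of a job reading Parquet data through DataTap. It assumes the dtap:// URI scheme that the BlueData EPIC client registers with Hadoop on the cluster; the DataTap name ("TenantStorage") and the path are placeholder examples, not the lab configuration.

```python
# Minimal PySpark sketch: an unmodified Spark job reads Parquet through DataTap.
# The dtap:// scheme is assumed to be provided by the BlueData EPIC client;
# "TenantStorage" and the path are placeholder names.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("qumulo-via-datatap").getOrCreate()

# Same read API as HDFS or local storage; only the URI scheme changes.
df = spark.read.parquet("dtap://TenantStorage/iot/drive_logs/")
df.createOrReplaceTempView("drive_logs")

# Example query against the converted measurement data.
spark.sql("SELECT COUNT(*) AS samples FROM drive_logs").show()
```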
A Spark analytics engine was used to test performance from the HPE BlueData clusters (Cloudera and Spark) to the Qumulo file system. We first tested Spark clusters connected to Qumulo data via DataTap, and then compared the results with Spark clusters connected directly to the same Qumulo environment via NFS mounts.
The testing results confirmed that there was no performance degradation when using DataTap compared to using an NFS mount. The HPE BlueData and Qumulo environments were vanilla installations, with no performance or configuration tuning required.
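For readers who want to script a comparison like this, the sketch below times the same Spark aggregation against a DataTap path and an NFS-mounted path. The paths, column names, and query are illustrative placeholders, not the actual lab benchmark.

```python
# Illustrative timing harness for a DataTap-versus-NFS comparison.
# Paths and column names ("signal_name", "value") are placeholders.
import time
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("datatap-vs-nfs").getOrCreate()

def time_aggregation(path: str) -> float:
    """Run the same aggregation against one storage path and return elapsed seconds."""
    start = time.time()
    df = spark.read.parquet(path)
    df.groupBy("signal_name").avg("value").collect()   # force full execution
    return time.time() - start

# The same Qumulo dataset exposed two ways: via DataTap and via an NFS mount.
for label, path in [("DataTap", "dtap://TenantStorage/iot/drive_logs/"),
                    ("NFS mount", "file:///mnt/qumulo/iot/drive_logs/")]:
    print(f"{label}: {time_aggregation(path):.1f} s")
```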
Here's a demo that Calvin Zito obtained showing Qumulo in action; note that it does not include HPE BlueData.
Conclusion? A flexible analytics environment
The lab team's testing demonstrated that the integration of HPE BlueData and Qumulo can provide a single, flexible environment in which multiple clusters can be created and use the same tools to work with the same data, with no cluster sprawl and no need to deal with multiple storage systems.
Data scientists and analysts can focus on analyzing the test data with their preferred tools, while DataTap allows them to access all data lakes without having to create multiple copies and without a performance penalty.
To find out more about how to handle massive IoT data with HPE BlueData and Qumulo, please check out our reference source list:
https://qumulo.com/blog/hybrid-storage-solution-for-adas-development-and-simulation/
https://qumulo.com/wp-content/uploads/2019/01/CS-Q141-Hyundai-MOBIS-1.pdf
https://qumulo.com/wp-content/uploads/2019/05/SB-Q168-ADAS-A4.pdf
https://sparkbyexamples.com/spark/spark-read-write-dataframe-parquet-example/
Meet HPE blogger Eric Brown, an HPE Solutions Engineer who has worked with HPE Storage for over 20 years. His areas of expertise are Big Data and Analytics.
Storage Experts
Hewlett Packard Enterprise
twitter.com/HPE_Storage
linkedin.com/showcase/hpestorage/
hpe.com/storage