HPE and Hortonworks collaborate to bring big-memory Spark to the enterprise
Pictured: Sriram Narasimhan, Tuan Bui, Jun Li, Mijung Kim, Alexander Ulanov, Manish Marwah, Hernan Laffitte, and Haris Volos. (Not pictured: Carlos Zubieta, Tere Gonzalez, and Janneth Rivera.)
By Curt Hopkins, Managing Editor, Hewlett Packard Labs
Today, Hortonworks made a major announcement about Spark and HPE. Namely, Labs is helping make Spark better. Much better.
To make The Machine accessible to developers and to demonstrate performance and scale beyond existing barriers, a team drawn from across Labs and the business units, led by Jun Li, Principal Research Scientist, set its sights on Spark for its in-memory focus. Apache Spark is a distributed in-memory analytics platform and the most active Apache project in big data.
“We wanted to test a hypothesis,” said April Slayden Mitchell, Director of Programmability and Analytics Workloads. “Can in-memory analytics perform better with big shared memory? We wanted to put Spark through the rigors to see if at The Machine scale we could surpass limitations of current memory bandwidth intensive workloads.” Possible use cases include genome sequencing, probabilistic graph inferencing, and network flow analysis – all of which require largely random access over irregular data structures whose total size can reach tens of terabytes or more.
Global shared memory
Today, said Li, “Spark uses disk-based storage for the data. Now, we read and write that data through globally shared memory. And that data is instantaneously accessible.” This also means a tremendous reduction in the time and energy required.
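As a rough illustration of the write/read idea Li describes, the sketch below has a producer write a block of numbers into a named shared-memory segment and a consumer read it back by name, with no socket in between. It is only a single-machine analogy built on Python's standard `multiprocessing.shared_memory` module. It is not the Labs implementation, and the function names `write_partition`, `read_partition`, and `release_partition` are hypothetical.

```python
import struct
from multiprocessing import shared_memory


def write_partition(values):
    """Producer side: copy a list of floats into a new shared segment.

    Layout (an assumption made for this sketch): an 8-byte count header
    followed by one 8-byte double per value.
    """
    size = 8 + 8 * len(values)
    segment = shared_memory.SharedMemory(create=True, size=size)
    name = segment.name
    struct.pack_into(f"<q{len(values)}d", segment.buf, 0, len(values), *values)
    segment.close()  # detach this handle; the segment lives until unlink()
    return name


def read_partition(name):
    """Consumer side: attach to the segment by name and read the values."""
    segment = shared_memory.SharedMemory(name=name)
    (count,) = struct.unpack_from("<q", segment.buf, 0)
    values = list(struct.unpack_from(f"<{count}d", segment.buf, 8))
    segment.close()
    return values


def release_partition(name):
    """Free the segment once every reader is done with it."""
    segment = shared_memory.SharedMemory(name=name)
    segment.close()
    segment.unlink()


if __name__ == "__main__":
    handle = write_partition([1.0, 2.5, -3.0])
    print(read_partition(handle))
    release_partition(handle)
```

The only thing the producer hands to the consumer is the segment name. The data itself stays in memory that both sides can address, which is the property the article attributes to globally shared memory.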
Spark currently communicates intermediate processing results via TCP/IP, a high-latency, low-bandwidth approach with a typical end-to-end latency of 0.1 millisecond and only 10 Gb/s of bandwidth in a cluster environment. The new Spark offering instead has a “write/read paradigm for sharing data over globally shared memory,” said Li. That makes it a low-latency, high-bandwidth approach, with a remote memory access latency of only 210 nanoseconds and remote memory access bandwidth of 32 GB/s.
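To put those figures side by side, here is a back-of-envelope calculation. It assumes that "Gb/s" means gigabits per second and "GB/s" means gigabytes per second, as written, and it uses a deliberately simple transfer model (one latency hit plus streaming time) that is an assumption of this sketch, not something measured by the team. The payload sizes are hypothetical.

```python
# Figures quoted in the article, converted to seconds and bytes per second.
TCP_LATENCY_S = 0.1e-3              # 0.1 millisecond end-to-end
TCP_BANDWIDTH_BYTES_S = 10e9 / 8    # 10 Gb/s (gigabits) -> 1.25 GB/s
SHM_LATENCY_S = 210e-9              # 210 nanoseconds remote memory access
SHM_BANDWIDTH_BYTES_S = 32e9        # 32 GB/s (gigabytes)


def transfer_time(size_bytes, latency_s, bandwidth_bytes_s):
    """Simplified model (an assumption): one latency hit plus streaming time."""
    return latency_s + size_bytes / bandwidth_bytes_s


latency_ratio = TCP_LATENCY_S / SHM_LATENCY_S
bandwidth_ratio = SHM_BANDWIDTH_BYTES_S / TCP_BANDWIDTH_BYTES_S

if __name__ == "__main__":
    print(f"latency ratio:   {latency_ratio:.0f}x")
    print(f"bandwidth ratio: {bandwidth_ratio:.1f}x")
    # Hypothetical payload sizes, chosen only for illustration.
    for size in (64, 1_000_000, 1_000_000_000):
        tcp = transfer_time(size, TCP_LATENCY_S, TCP_BANDWIDTH_BYTES_S)
        shm = transfer_time(size, SHM_LATENCY_S, SHM_BANDWIDTH_BYTES_S)
        print(f"{size:>13,} bytes: modeled speedup {tcp / shm:.1f}x")
```

Under this model, small transfers are dominated by latency, where the quoted figures differ by a factor of roughly 476, while large transfers are dominated by bandwidth, where they differ by a factor of 25.6. These are ratios of the quoted link characteristics, not end-to-end Spark speedups.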
“We used global shared memory to turn Spark into a true in-memory data processing platform,” said Li, smiling. “And it’s much, much, much faster.”
As the team validated their findings, they realized they had more than a platform for The Machine: they had a platform, and hardware to run it on, that they could put in front of customers today.
That hardware was HPE’s Superdome X server.
A funny thing happened on the way to The Machine
“We are confirming here that scale-out and scale-up can both be of benefit to our customers,” said Mitchell. “With Spark, we’ve taken a scale-out platform and turned it into a scale-up platform while still maintaining the same user-level application programming interfaces, so that our customers can use familiar tools in new ways to go beyond current scale and performance limitations.”
By applying their approach for Memory-Driven Computing, Mitchell said, “We have demonstrated the value of large shared memory machines as extreme analytics beasts.”
Collaborating with Hortonworks – an industry innovator that creates, distributes, and supports enterprise-ready open data platforms – will allow Labs to contribute this code to the Apache Spark community. Customers will have access to the software, hardware, and support they need to keep up with their growing requirements for scalable analytics solutions.
Spark can already run on HPE Superdome X today, Li noted, “but can later run on The Machine. The move will be an instantaneous change because the software is the same.”
Beyond creating a highly improved Spark, this process proved again that the work toward The Machine, while building toward a massive revolution in computer architecture, is producing radical improvements to existing technologies along the way.
Watch Labs Director Martin Fink talk about the partnership.
© Copyright 2019 Hewlett Packard Enterprise Development LP

