Shifting to Software-Defined

4 Key Features of a Leading Big Data Hadoop Product Offering


Guest Post: Michele Nemschoff is Vice President of Corporate Marketing at MapR Technologies


On February 27, 2014, Forrester Research Inc. published The Forrester Wave™: Big Data Hadoop Solutions, Q1 2014. MapR was among the select companies that Forrester invited to participate in this evaluation. MapR was cited as a Leader and achieved the highest score for Current Offering among all reviewed vendors. The evaluation covered 32 criteria in three different divisions. Our overall score in the “Current Offering” division was 4.25 out of 5. This was the highest score of the nine vendors included.


Our Current Offering:

The MapR product offering stands out from the competition for several reasons, one of which is its unmatched Hadoop feature set. MapR is the only distribution built from the ground up for business-critical production applications.


The MapR M7 Enterprise Database Edition is the most robust product in our Current Offering. Here’s a look at four M7 features that set MapR apart from other Hadoop distributions:


1. Distributed Metadata


The default Hadoop architecture uses a single NameNode to store the metadata. This forces every metadata operation through one bottleneck and limits clusters to 50-200 million files. It also creates a single point of failure (SPOF): if the NameNode fails, the entire cluster becomes unusable.


Other distributions try to sidestep the problem by using a secondary NameNode. A secondary NameNode runs as a slave to the primary NameNode and only replicates data from it periodically. Its copy can therefore lag behind the primary, so those depending on a secondary NameNode cannot trust its data integrity.


The only real solution to the NameNode problem is to remove it. With the MapR Distribution no-NameNode solution, there are no practical limits to the number of files that can be stored on MapR. This foundational change in the Hadoop architecture distributes the metadata across several nodes, as illustrated below.



Photo credit: Architectural Overview of the MapR Apache Hadoop Distribution by M.C. Srivas via SlideShare; Slide 58
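
To make the idea of distributed metadata concrete, here is a minimal sketch in Python. It is an illustration only and not MapR's actual design: the class names `MetadataShard` and `ShardedNamespace`, the node names, and the choice of hashing file paths to pick a node are all assumptions made for this example. It simply shows how a file table can be split so that no single node holds every entry.

```python
import hashlib


class MetadataShard:
    """One node's slice of the file-metadata table (hypothetical name)."""

    def __init__(self, name):
        self.name = name
        self.entries = {}  # path -> metadata dict


class ShardedNamespace:
    """Toy namespace that spreads file metadata over several shards.

    Assumption for illustration: the shard is chosen by hashing the path.
    A real distribution may place metadata quite differently.
    """

    def __init__(self, shard_names):
        self.shards = [MetadataShard(n) for n in shard_names]

    def _shard_for(self, path):
        # Stable hash of the path, reduced to a shard index.
        digest = hashlib.sha256(path.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shards)
        return self.shards[index]

    def create(self, path, size):
        self._shard_for(path).entries[path] = {"size": size}

    def stat(self, path):
        return self._shard_for(path).entries[path]

    def load(self):
        # Number of metadata entries held by each shard.
        return {s.name: len(s.entries) for s in self.shards}


ns = ShardedNamespace(["node-a", "node-b", "node-c"])
for i in range(9000):
    ns.create(f"/data/part-{i:05d}", size=i)

print(ns.load())  # each node holds only part of the table
```

With a single metadata server, all 9,000 entries and every lookup would land on one machine. Here each lookup touches only the shard that owns the path, so capacity and request load grow with the number of nodes.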


Beyond dependability, distributed metadata also delivers a remarkable database performance boost. With only commodity hardware, you can gain 10-20 times the performance of other distributions that use a centralized metadata structure.


This feature is an architectural improvement to Hadoop that MapR introduced early on. The dependability and performance it adds to our offering make it untouchable by competitive offerings.


2. Low Latency


Your Hadoop infrastructure needs to be fast. Equally important, it needs to stay that way. A dirty secret among many Hadoop distributions is the staggering volatility in performance and latency. The MapR M7 disk strategy obviates the compactions and defragmentation that can affect performance. As a result, MapR M7 achieves 5x better performance, with low 95th and 99th percentile latencies. The graph below compares the performance and latency consistency of the MapR M7 Edition with other Hadoop distributions.




Notice how M7’s highest point of latency is much lower than that of the other distributions. The difference in volatility is even more shocking. With M7, you can depend on consistently low latency.
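
For readers less familiar with percentile latencies, the sketch below shows how 95th and 99th percentile figures can be computed and why they reveal volatility that an average hides. The sample values are made up for illustration and are not taken from the graph above; the `percentile` helper uses the nearest-rank method, which is one of several common definitions.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical latency samples in milliseconds (100 requests each).
steady_ms = [5.0] * 90 + [6.0] * 10   # small, consistent variation
spiky_ms = [2.0] * 90 + [40.0] * 10   # usually fast, occasionally very slow

for name, samples in (("steady", steady_ms), ("spiky", spiky_ms)):
    mean = sum(samples) / len(samples)
    p95 = percentile(samples, 95)
    p99 = percentile(samples, 99)
    print(f"{name}: mean={mean:.1f} ms  p95={p95} ms  p99={p99} ms")
```

The two sample sets have similar averages, but the spiky one has a far worse tail. That tail is what users of a latency-sensitive application actually experience during the slow periods.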


3. High Availability


High availability (HA) refers to the capability of a Hadoop system to continue functioning despite multiple system failures. For companies running mission-critical applications, HA is a necessity.


The best way to ensure that your distributed system is highly available is by using an architecture that distributes the metadata. The MapR architecture increases performance and removes the SPOF.


The MapR Distribution for Hadoop provides high availability with self-healing and support for multiple failures. This means that your Hadoop infrastructure will be accessible during system failures, system upgrades and data recoveries.


4. Snapshots


Other distributions use the HDFS snapshot system, which has several downsides compared with snapshots in the MapR Distribution for Hadoop:


True Point-In-Time

HDFS snapshots only capture data that is closed at the time the snapshot is taken. If you are using snapshots as an automated recovery system, you will have no guarantees that the data is complete. With MapR, you can perform point-in-time recovery of all files and tables, whether they are open or not.


Supports All Applications

MapR Snapshots support all Hadoop applications by default.


No Data Duplication

MapR snapshots never duplicate your data; they share the same storage as your live information. This allows clients to capture snapshots of a 1-petabyte cluster in just seconds.
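
The general technique behind snapshots that share storage with live data can be sketched as follows. This is a toy model under stated assumptions, not MapR's implementation: the `Volume` class, its methods, and the file paths are hypothetical, and the sketch copies a small pointer table per snapshot, which a real system may avoid as well.

```python
class Volume:
    """Toy volume in which snapshots reference existing blocks instead of copying them."""

    def __init__(self):
        self.blocks = {}      # block id -> bytes, shared by live data and snapshots
        self.files = {}       # path -> tuple of block ids (live view)
        self.snapshots = {}   # snapshot name -> frozen path -> block ids table
        self._next_id = 0

    def _store(self, data):
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        return block_id

    def write(self, path, chunks):
        # New data always goes to new blocks; existing blocks are never overwritten,
        # so any snapshot that points at them stays valid.
        self.files[path] = tuple(self._store(c) for c in chunks)

    def snapshot(self, name):
        # Copies only the pointer table, never the data blocks.
        self.snapshots[name] = dict(self.files)

    def read(self, path, snapshot=None):
        table = self.files if snapshot is None else self.snapshots[snapshot]
        return b"".join(self.blocks[b] for b in table[path])


vol = Volume()
vol.write("/logs/a", [b"hello ", b"world"])
blocks_before = len(vol.blocks)

vol.snapshot("s1")
blocks_after_snapshot = len(vol.blocks)  # unchanged: no data was duplicated

vol.write("/logs/a", [b"changed"])
print(vol.read("/logs/a"))                 # live view sees the new data
print(vol.read("/logs/a", snapshot="s1"))  # snapshot still sees the old data
```

Because taking the snapshot touches only pointers, its cost does not depend on how much data the blocks hold, which is why this style of snapshot can be fast even on very large volumes.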


As we look at these features that are exclusive to MapR, it seems obvious why our customers are continually excited about our product offering. We feel this was made apparent in our scores in the previously mentioned independent evaluation.


In our opinion, the results of that product evaluation are just another testament to the caliber of our product offering. We continue to push ourselves with our product offerings and look forward to more recognition like this in the future.

Senior Manager, Cloud Online Marketing
About the Author


I manage the HPE Helion social media and website teams promoting the enterprise cloud solutions at HPE for hybrid, public, and private clouds. I was previously at Dell promoting their Cloud solutions and was the open source community manager for OpenStack and at Rackspace and Citrix Systems. While at Citrix Systems, I founded the Citrix Developer Network, developed global alliance and licensing programs, and even once added audio to the DOS ICA client with assembler. Follow me at @SpectorID
