Around the Storage Block

Welcome to Flash 2.0: HP 3PAR Thin Deduplication with Express Indexing


By Ivan Iannaccone, 3PAR StoreServ Product Management


With the introduction of HP 3PAR Thin Deduplication and Thin Clones software, HP 3PAR StoreServ continues to set the gold standard for hardware-accelerated thin technologies that drive up capacity efficiency and extend flash media lifespan. Together with cMLC SSD and Thin Deduplication, HP 3PAR StoreServ truly enables a flash-based solution at HDD-based costs.  Welcome to Flash 2.0!




How the magic happens

HP 3PAR StoreServ has two secret sauces: the ASIC and the highly virtualized Architecture. Thin Deduplication is the perfect blend of hardware and software, with the ASIC generating hashes and assuring data integrity and the HP 3PAR OS handling metadata and volume management between the various virtualization layers.


The implementation of Thin Deduplication is shaped by 3PAR StoreServ support for mixed workloads, which means concurrently serving different I/O streams (sequential/random) of variable I/O sizes. As writes are received, they are segmented into Cluster Memory Pages (CMPs). The CMP is 16KB, the same size that has made HP 3PAR Thin Provisioning so successful. It’s a perfect balance for:

  • Performance – CPU interrupts, Host I/O sweet spot, Host I/O average block size, metadata handling AND
  • Efficiency – granularity of write updates, deduplication and space reclamation
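To make the 16KB page idea concrete, here is a minimal Python sketch of cutting an incoming write into fixed-size pages. It is only an illustration: the function name `segment_write`, the assumption that the write starts on a page boundary, and the zero-padding of a short final page are all hypothetical choices, not a description of how the HP 3PAR OS actually does it.

```python
CMP_SIZE = 16 * 1024  # 16KB page size, as stated in the article


def segment_write(data: bytes, page_size: int = CMP_SIZE) -> list[bytes]:
    """Split a write payload into fixed-size pages.

    Assumption for this sketch: the write is page-aligned and a short
    final page is padded with zero bytes.
    """
    pages = []
    for offset in range(0, len(data), page_size):
        page = data[offset:offset + page_size]
        if len(page) < page_size:
            # Pad the tail so every page has the same length.
            page = page + bytes(page_size - len(page))
        pages.append(page)
    return pages
```

Working on one fixed page size means every later step (hashing, comparing, reclaiming) operates on the same unit, which is the granularity trade-off the two bullets describe.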

As a new I/O write comes in, HP 3PAR Express Indexing performs metadata lookups to compare the signature of the incoming request with the signatures of data already stored in the array. The technology uses the computed hash signature as an index to look up a match using a three-level translation exception table mechanism. If a match is found, the L3 page table entry is set to point to the existing copy of the data page. If no match is found, a new block is allocated on the back-end to host the new page, and its hash is saved for comparison against future writes. To guard against hash collisions, Thin Deduplication leverages the ASICs to perform a bit-by-bit comparison before any new write update is marked as a duplicate.
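The lookup-then-verify flow described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the array's implementation: the class `DedupStore` and its fields are hypothetical, SHA-256 stands in for whatever signature the ASIC computes, and flat dictionaries stand in for the three-level table and the back-end.

```python
import hashlib


class DedupStore:
    """Toy inline-dedup store: hash index plus full compare before sharing."""

    def __init__(self):
        self.blocks = {}      # block_id -> page bytes (stand-in for the back-end)
        self.index = {}       # signature -> block_ids stored under that signature
        self.page_table = {}  # logical page number -> block_id
        self._next_id = 0

    def write(self, lpn: int, page: bytes) -> bool:
        """Store a page; return True if it was deduplicated."""
        sig = hashlib.sha256(page).digest()  # assumed hash, for illustration
        for block_id in self.index.get(sig, []):
            # A matching signature is not trusted on its own: compare the
            # full contents before pointing at the existing copy.
            if self.blocks[block_id] == page:
                self.page_table[lpn] = block_id
                return True
        # No verified match: allocate a new block and remember its signature.
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = page
        self.index.setdefault(sig, []).append(block_id)
        self.page_table[lpn] = block_id
        return False

    def read(self, lpn: int) -> bytes:
        return self.blocks[self.page_table[lpn]]
```

Writing the same page to two logical addresses leaves one stored block with two page table entries pointing at it; a page whose signature matches but whose contents differ would simply get its own block.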


All of the above operations occur inline, after the write has been acknowledged back to the host but before the data is flushed to the back-end, and therefore do not impact host latency.


Thin Deduplication implements an online garbage collection process to reclaim space that is no longer referenced. This runs continuously and is completely automated and transparent. Unlike some competing implementations, Thin Deduplication does not restrict the number of times a given page can be referenced.
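The article does not say how the garbage collector is built, so the following Python sketch is just one simple way to picture "reclaim what is no longer referenced". The function `collect_garbage` and its two dictionary arguments are hypothetical; it counts references from a page table and frees any block with none, with no upper limit on how many pages may share a block.

```python
from collections import Counter


def collect_garbage(blocks: dict, page_table: dict) -> int:
    """Free blocks that no logical page points to; return how many were freed.

    blocks:     block_id -> page bytes
    page_table: logical page number -> block_id
    """
    refs = Counter(page_table.values())  # reference count per block
    dead = [block_id for block_id in blocks if refs[block_id] == 0]
    for block_id in dead:
        del blocks[block_id]
    return len(dead)
```

For example, with three stored blocks where two logical pages share block 0 and one points to block 2, a pass frees only block 1.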




Why HP 3PAR 7450 “All-Flash Array” versus an “Amateur Flash Array”?

There are more than 30 flash companies in the industry today, all with some level of interesting technology and niche implementation. HP 3PAR StoreServ is taking flash mainstream, by combining low cost of flash storage from a $/GB usable/raw perspective with a solution that satisfies all the existing use cases that are important in Primary Storage.

Primary storage is all about protecting the data in any possible way, while offering integrity and availability and assuring the best performance with optimized efficiency in data compaction, rack density, power/cooling and ease of use. This is what HP 3PAR StoreServ Architecture is all about.

A proliferation of flash companies brings “architectures” and new ways to spin features or functionalities:

  • Architectures that claim to be built from the ground up for flash and, interestingly, only support certain types of flash from certain vendors/suppliers and fixed drive sizes. All complemented with a lot of DRAM cache with a side of NVRAM.
  • Architectures where scale-out equals replacing controllers (Replace-Out), where there is documentation but no implementation (Fake-Out) or where you just aggregate the manageability layer (Knock-Out).
  • Architectures that say RAID is bad, but actually still use a RAID-based implementation and concepts. (For those who wonder, 3PAR introduced its RAID alternative in 2009, HP 3PAR RAID MP, multiple parities with multiple mirrors wide striped across multiple enclosures.)
  • Architectures with guaranteed performance and some with 100% performance claims while actually just using 50% of available performance. And even others with the best metadata handling, all in cache or not, shared or distributed, yeah…hard to keep up.

There are some interesting choices and some good technology out there, but only one Flash-Optimized Architecture that meets the criteria for Modern Tier-1 Storage that is ready for the Next Style of IT.


Read up on flash


Edison Competitive Review

Check out the Edison assessment of efficiency technologies with an overview of other solutions from the competition.


ESG White Paper on HP 3PAR 7450

Read Mark Peters’ assessment of the HP 3PAR approach to flash.


BrightTALK Flash Vendor Panel

Watch this BrightTALK vendor panel discussion that happened last week between HP, EMC, Pure Storage and Kaminario. It’s a fair and interesting conversation.



Can you talk more about RAID MP? My understanding (as a 3PAR customer since '06) is that RAID MP was simply an extension of RAID 5 with another parity bit stored on another disk. At the time of introduction 3PAR specifically told me that RAID 6/MP was little more than a "check box" for some customers who had stupid requirements of a platform that "must" support RAID 6 even though there was no need (at the time) on 3PAR (I'd argue the need still isn't there on most 3PAR systems other than ones that are pretty small).


The magic was already there in the chunklet-based RAID going back to the origins of 3PAR.


Though your post above seems to imply there is something special in RAID MP.


A couple articles I wrote years ago

  • 81,000 RAID arrays on one of my 3PAR boxes at the time (and I include the sample script so other customers can get their number if they are curious how many arrays are running on their systems)
  • Do you really need RAID 6? - explaining some of the bits behind why RAID on 3PAR is so different/better. Also shows an example on how RAID 6 failed horribly for one user, and how there is already triple parity RAID out there on some platforms (perhaps quadruple by now that post is four years old)

I am in the process of quoting out a 7450 with these new SSDs, likely a 4-node right off the bat, to replace/augment an aging F-class running a 93% write workload. It will be a nice upgrade, even before the de-dupe stuff. Will be putting that 5 year warranty on the SSDs to the test.


I would like to see HP give customers the ability to cluster nodes in 74xx vertically instead of horizontally though - I mentioned this directly at tech day last year. It would provide the ability to survive a failure of the shelf that contains the two controllers and stay online. Given the 74xx cross system interconnect is now cable based instead of hard wired it should not be difficult. I suppose with the 6 nines guarantee program that the likelihood of a "shelf" that has the controller pair failing completely is probably quite rare... but I would still like to see the option :)




Nice feature! There seems to be a typo in the first image: it says 460TB raw and 1.2PB usable. Shouldn't it be the other way around?


Gabriel - Absolutely correct. Remember that deduplication increases usable capacity, so the usable capacity we quoted is based on a 4:1 dedupe ratio.


Nate - I'm sitting in a Turkish restaurant in Cologne, Germany and will leave it to the team to respond to you. Long way from Vegas.

Hi Nate,

Thanks for your comment and the interest in the article. RAID MP is actually more than that, and as a matter of fact it's the other way around: since 3.1.2, RAID 5 is just RAID MP with a single parity. We continued to use industry standard terms because that is what the majority of customers are used to, and we believe it's important for customers to know and understand RAID overheads up front. RAID MP allows us to integrate further with the background PD scrubbing process to detect any potential issue at the device block layer and trigger RAID rebuilds down to 512-byte granularity if necessary, versus having to do a read and rebuild at the step-size level. This is particularly important when it comes to flash, as it allows us to be proactive in detecting and correcting any CRC/bit errors. As for whether to use single or double parity protection, we still leave this decision to customers; data shows that with flash, MTBFs are greater and rebuilds faster. We are working on a few new things and I will talk about them in a blog later this year.

Regarding the cluster interconnect, we were actually limited by how HDDs worked and the fact that you could only have active/passive connectivity to a device. Flash does not have this or other such limitations, and it's great to have an architecture that can now exploit this new media. And you are correct: the occurrence of the failure is beyond the six nines boundaries from both a model perspective (how the architecture is designed) and an observation perspective (how the systems perform in production).


So... This is dedupe... but it's only on the 7450? Not for conventional disk-based appliances?



Hi Alex,


HP 3PAR StoreServ 7450 is an All Flash model, therefore deduplication on HDD is not applicable.



I find the conversation of thin deduplication and compaction quite amusing. HP has to be the only vendor that can start talking about having deduplication and compaction (compression) since June and not deliver anything. How about you innovate and quit lying to the public about these features. There is NO deduplication or compaction as of Oct 27th and I don't think it is appropriate to mislead customers/prospects.


@AllMarketing - when we announced Thin Deduplication, we were clear that it was coming.  And customers were absolutely aware.  That said, Thin Deduplication has been available to customers for a while, first in beta and now in general release.  In fact I just saw an email from the field reporting that a customer who moved data from HDDs to Thin Dedupe SSDs saw 14.6:1 compaction. 


HP 3PAR Thin Deduplication is here.

I have some questions:

  • Does HP have plans for deduplication on SATA/SAS disks?
  • Can this deduplication be scheduled, or does it only work inline?






Hi Santiago -

Plans for spinning disk are a roadmap discussion and we don't publicly talk roadmaps. The team is looking at it but that's about all I can say there.


Why would you want to schedule deduplication?  I think that would only come up if you were using some type of deduplication that causes a performance bottleneck. With HP 3PAR, deduplication is done inline with the 3PAR ASIC - with almost no performance impact. I know others turn theirs off under heavy load because deduplication would create even more of a performance bottleneck but that isn't the case with HP 3PAR.
