Around the Storage Block

What is HPE 3PAR Adaptive Data Reduction?

HPEStorageGuy

 

As part of the huge announcement we made earlier today, we introduced 3PAR Adaptive Data Reduction. I'll give you a high-level look at what it is, and I have a podcast with Phill Gilbert (3PAR product management team) that goes deeper than I could myself.

Adaptive Data Reduction is an umbrella term

3PAR Adaptive Data Reduction is an umbrella term that covers four data reduction technologies:

  • Zero Detect - removes zeros inline. This is something that 3PAR has had for a long time.
  • Deduplication - avoids storing duplicate data. We announced 3PAR hardware-accelerated dedupe back in late 2014. We've made some updates to how deduplication works and we talk about it in the podcast.
  • Compression - reduces the data footprint. It's selectable on a per-volume basis. This is new for 3PAR and is only supported on the 8000 and 20000 families. We talk about this in the podcast.
  • Data Packing - packs multiple pages to increase data reduction efficiency. Again, we really dive into this in the podcast but I'll briefly describe it below.  

When used together, the Adaptive Data Reduction technologies operate in that order: Zero Detect first, then Deduplication, then Compression, and finally Data Packing.
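If it helps to see that order as pseudocode, here's a minimal Python sketch of one page flowing through the four stages. This is purely my own illustration with hypothetical helper names; the real pipeline runs inside the 3PAR OS and Gen5 ASIC, not in host-side Python:

```python
import hashlib

import lz4.block  # third-party binding: pip install lz4

def reduce_page(page: bytes, seen_fingerprints: set) -> int:
    """Push one 16KB page through the four ADR stages in order and
    return how many bytes would actually reach the SSDs.

    Conceptual sketch only: the stage order matches the blog post,
    but everything else here is my own invention.
    """
    # 1. Zero Detect: an all-zero page is dropped inline.
    if not any(page):
        return 0

    # 2. Deduplication: a page we've already stored becomes a reference.
    fingerprint = hashlib.sha256(page).digest()
    if fingerprint in seen_fingerprints:
        return 0
    seen_fingerprints.add(fingerprint)

    # 3. Compression: LZ4-compress the unique page, keeping the
    #    original if it doesn't shrink (see the compression section).
    compressed = lz4.block.compress(page)
    payload = compressed if len(compressed) < len(page) else page

    # 4. Data Packing: odd-sized compressed pages get packed together
    #    into full 16KB flash-native writes (sketched further down).
    return len(payload)
```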

Looking at what's new: Compression

I'll whet your appetite for diving deeper in the podcast with some basic information on what's new with 3PAR OS 3.3.1. I'll start with compression. We first look to see if data is compressible; if it's not, we don't compress it. We use the LZ4 compression algorithm. It's very fast for compression and really fast for decompression. It also offers excellent savings. From our testing, we're seeing an average compression of 2:1 for structured data (e.g. databases). I'll show a small sketch of that compress-or-skip check after the list below. Here are a few other points:

  • Compression is available at no cost to existing customers
  • Requires the 3PAR Gen5 ASIC (8000/20000 families)
  • Supports pinning a virtual volume to SSD
  • Compression can't be combined with Adaptive Optimization
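To make the compress-only-when-it-helps idea concrete, here's a hedged sketch using the open-source python-lz4 binding. The actual check inside 3PAR OS 3.3.1 is internal and will certainly differ; this just mirrors the behavior described above:

```python
import lz4.block  # third-party binding: pip install lz4

def maybe_compress(page: bytes) -> tuple:
    """Store the LZ4 form only when it is actually smaller.

    Mirrors the idea described above: check whether the data is
    compressible first and skip compression when it isn't. This is
    illustrative only, not the actual 3PAR OS code path.
    """
    compressed = lz4.block.compress(page)
    if len(compressed) < len(page):
        return compressed, True    # compressible: keep the smaller form
    return page, False             # incompressible: store as-is
```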

Check out the podcast to learn more. 

Let's get Data Packing

One of the issues that the "designed from the ground up for flash" vendors have had is inefficiency and post-processing garbage collection. Generally, existing all-flash vendors have adopted one of two approaches to writing compressed data to SSD.

  • Option 1 is variable block size. After pages get deduped and then compressed, they can be of varying sizes. That's true for 3PAR as well as everyone else. The other guys will then write these inconsistent page sizes to SSDs. Writes are generally 4KB, 8KB, or 16KB in size, so if they dedupe and compress a page down to 8.3KB, they have to write a full 16KB to the SSD. Vendors that use variable block size end up with inefficient writes (see the arithmetic sketch after this list).
  • Option 2 is append-only data structures. With this approach, the array combines these variable-length writes into a single stripe, written sequentially to SSD along with any required metadata. When hosts overwrite data, the old blocks are invalidated and new data is written to a new stripe. This creates a "swiss cheese" effect: blocks of invalidated data that need to be cleaned up via garbage collection and housekeeping. The system has to 'hide' capacity for this housekeeping, and the process is I/O intensive as it tries to keep up with the massive amounts of garbage.
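To put a rough number on Option 1's inefficiency (my own back-of-the-envelope example, not vendor data), a page that compresses down to 8.3KB still has to be written at the next flash-native size up:

```python
def padded_write_size_kb(payload_kb: float, native_sizes_kb=(4, 8, 16)) -> int:
    """Return the smallest flash-native write size that fits the payload."""
    for size in sorted(native_sizes_kb):
        if payload_kb <= size:
            return size
    raise ValueError("payload exceeds the largest native write size")

payload = 8.3                          # the 8.3KB page from the example above
write = padded_write_size_kb(payload)  # -> 16
waste = (write - payload) / write      # -> ~48% of the write is padding
print(f"{payload}KB payload -> {write}KB write, {waste:.0%} wasted")
```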

3PAR Data Packing solves the inefficiency and performance compromises of these two approaches. We pack the compressed data into a 16KB write. There are several benefits to the 3PAR approach (a packing sketch follows this list):

  • Instead of appending data or only writing new data, 3PAR can update in place.  I asked Phill about this on the podcast to get the details there.
  • Maintain data locality: we only pack data from the same virtual volume, which increases the likelihood that the pages will be modified together.
  • No-compromise approach that packs pages together with virtually no wasted space, now or in the future.
  • Garbage-free means there is minimal to no garbage collection needed.
  • Storage Class Memory-ready: Data Packing is an architectural design, created to deliver the benefits of Storage Class Memory. 
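For a loose mental model of the packing itself (purely my illustration; the actual algorithm isn't public at this level of detail), think of it as first-fit packing of odd-sized compressed pages from the same virtual volume into 16KB flash-native writes:

```python
PAGE_SIZE = 16 * 1024  # flash-native write size in bytes

def pack_pages(compressed_pages: list) -> list:
    """First-fit packing of odd-sized compressed pages into 16KB writes.

    Illustrative only: the input pages are assumed to come from the
    same virtual volume (preserving the data locality described above),
    and each full container goes to SSD as one flash-native write.
    """
    containers = []  # each container is a list of pages totalling <= 16KB
    for page in compressed_pages:
        for container in containers:
            if sum(len(p) for p in container) + len(page) <= PAGE_SIZE:
                container.append(page)  # fits in an existing write
                break
        else:
            containers.append([page])   # start a new 16KB write
    return containers
```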

I've had a couple of questions on Twitter asking about Data Packing on the 3PAR 7000 family. Remember that we don't support compression on the 7000. When you have dedupe on, we remove entire pages. When we write to the backend SSDs, we write fewer pages, but they're always complete flash-native pages. When you compress data, that's when you end up with odd-sized pages. Those pages are the hard ones to deal with, and that's when we pack them together to get flash-native pages.

Since the 7000 doesn't support compression, Data Packing is not needed and isn't supported.

Here's the YouTube version of the podcast. 

You can also download the audio-only podcast by right-clicking here and saving the file. My podcast is hosted on Talkshoe.com. It's also on iTunes. I had a lull in my podcasts about a year ago, but I've brought it back and there are a lot of recent episodes worth listening to, so be sure to subscribe.

About the Author


I'm Calvin Zito, a 34-year veteran of the IT industry, and I've worked in storage for over 25 years. I am a six-time VMware vExpert. As an early adopter of social media and an active member of communities, I've blogged for 9 years. You can find me on Twitter as @CalvinZito. If you don't follow me on Twitter, do it now! You can also contact me via email.
