HPE Storage Tech Insiders

Stop Caching & Compressing Your Backup Repository!

ndyer39

Hello all,

I thought I'd write a quick blog post based on a recent observation I've made across my customers when performing InfoSight analysis; the subject today is using Nimble Storage for backup repositories.

When creating a backup repository (from the likes of Veeam, Commvault, Symantec, etc.) there are a few things to watch out for in order to ensure performance stays the same for other workloads on the array.

Nowadays, host-side backup tools such as those mentioned above have built-in data reduction technologies such as deduplication and compression, which are performed on the proxy server before the data is sent to the repository.

However, the trend I've seen recently is that the repository is given a standard Application Policy on the Nimble array, or a policy which has caching and compression enabled. The problems that arise from this are:

  1. The backup repository is placed into SSD cache - something which shouldn't be done, as backup data is rarely read back.
  2. The array attempts to compress the backup repository as data is written; however, this data is already compressed at source, meaning CPU cycles are burnt for no gain on the array itself.

Here's an example taken from InfoSight (customer data removed). Four backup repositories were created (one for each country), yet each volume is being cached (one at 95%!).

[Screenshot: InfoSight cache usage across the four backup repository volumes]

This is because the volumes have been allocated a policy called "Veeam", which has compression AND caching enabled. Notice the compression stat of 0.97x - effectively no reduction at all (in fact a slight expansion).

[Screenshot: the "Veeam" Application Policy with compression and caching enabled, showing 0.97x compression]

This in turn started to burn CPU cycles and cache (notice the CPU and cache usage increasing from the end of December onwards, when this was configured).

[Screenshot: array CPU and cache usage climbing from the end of December onwards]

To rectify this issue, create a new policy - I created one called Backup-Repository. I kept the block size at 4KB, but turned OFF compression (as the data is already compressed at source) and caching (as I don't want to serve backup data through flash cache).

[Screenshot: the new Backup-Repository performance policy with compression and caching turned off]
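If you'd rather script this than click through the GUI, here's a minimal sketch in Python against the NimbleOS REST API. The endpoint and field names (performance_policies, block_size, compress, cache) are my assumptions from the v1 API and may differ by NimbleOS version, so verify them against your array's API reference:

# Hedged sketch: create the Backup-Repository performance policy via the
# NimbleOS REST API. Endpoint and field names are assumptions -- check
# your NimbleOS API reference before using.
import requests

ARRAY = "https://array.example.com:5392"   # hypothetical management address

# Authenticate and grab a session token (verify=False only because arrays
# commonly ship with self-signed certificates).
auth = requests.post(f"{ARRAY}/v1/tokens",
                     json={"data": {"username": "admin", "password": "secret"}},
                     verify=False)
headers = {"X-Auth-Token": auth.json()["data"]["session_token"]}

# 4KB block size; compression OFF (data arrives pre-compressed by the
# backup software) and caching OFF (backup data is rarely read back).
policy = {"data": {"name": "Backup-Repository",
                   "block_size": 4096,
                   "compress": False,
                   "cache": False}}
resp = requests.post(f"{ARRAY}/v1/performance_policies",
                     headers=headers, json=policy, verify=False)
resp.raise_for_status()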

I can then change the Application Policy allocation on the fly for any volumes already created on the array, and any new data written to the system will a) NOT be compressed and b) NOT be cached. Exactly what we want!

[Screenshot: reassigning the existing volumes to the Backup-Repository policy]
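And the equivalent sketch for re-pointing an existing volume at the new policy (again, perfpolicy_id and the name-based lookups are assumptions to verify, and Veeam-Repo-UK is a made-up volume name):

# Hedged sketch, continuing the session from the previous example.
import requests

ARRAY = "https://array.example.com:5392"   # hypothetical management address
headers = {"X-Auth-Token": "<session token from /v1/tokens, as above>"}

# Look up the new policy and an existing volume by name.
pol = requests.get(f"{ARRAY}/v1/performance_policies?name=Backup-Repository",
                   headers=headers, verify=False).json()["data"][0]
vol = requests.get(f"{ARRAY}/v1/volumes?name=Veeam-Repo-UK",
                   headers=headers, verify=False).json()["data"][0]

# Re-point the volume at the new policy; new writes are then neither
# compressed nor cached.
requests.put(f"{ARRAY}/v1/volumes/{vol['id']}",
             headers=headers,
             json={"data": {"perfpolicy_id": pol["id"]}},
             verify=False).raise_for_status()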

Hope this is useful to you all. We've also just released a new Best Practice Guide which focuses specifically on Veeam Backup & Replication, which is available to download here: http://www.nimblestorage.com/docs/downloads/nimblestorage-tp-veeam-backup-replication.pdf


Comments
PeterMu

Similarly, would you recommend moving data to uncompressed volumes where the effect of Nimble's compression is almost nothing? For example, I have an ESXi volume with stats as below:

Size 1.5 TB

Used 1.14 TB

Primary Compression 1.0X

Primary Space Saved 0 B

This ESXi volume has just one large 2012 R2 VM on it; the big partition where the data is stored uses Windows deduplication, which I think also does a measure of compression?

rbrowne40

Hi Peter,

If you are not seeing any compression benefits on the volume, you can move the VM to a volume with compression disabled as you suggested - this may save some CPU cycles on the array. I don't believe Windows 2012 deduplication includes any kind of compression, and I'm surprised that you don't get any compression benefits at the moment (we should at least see some compression from the C:\ drive, for example). Are the VM's drives encrypted in any way?

Rols

PeterMu

I have to wonder if maybe I am incurring a loss on the deduplicated partition (E:) and any gain from compression on the OS drive (C:) is wiped out. No, the drives are not encrypted. Or maybe this is a bug?

ndyer39

No, I wouldn't say this is a bug; Windows 2012 R2 supports host-side compression as well as deduplication. The same rules apply: if you're compressing at the host or source, turn it off on the array (actually, it would be better to turn it off at the host level if you could, but sometimes that's not possible).

PeterMu

After moving one VM to a volume with compression disabled, the array indicates this new volume has 30GB more used space - so maybe there is a bug in reporting the compression stats?

rajatsharma0174

Hi Peter,

It is an interesting observation that you ended up with more used space rather than less. Could you please verify the volume block size on both the source and target volumes? Also, how did you move the VM image?

Rajat

PeterMu

Block size is 4096 on both the source and target volumes; the VM was moved via a Storage vMotion.

dtoews41

Hi all,

Just wanted to add that using Nimble's built-in compression on Veeam repositories does have some advantages. In our environment, I was able to reduce our backup job time significantly by moving our backup repository from HP P2000 storage to our Nimble CS300, disabling compression on the Veeam backup job and letting the array handle the compression instead. I'm seeing similar compression rates (though TBH, the Nimble doesn't compress quite as well as the Veeam job did ... but close enough considering the massive difference in the time it takes to complete our backups).

For example, it used to consistently take about 6-6.5 hrs to back up and compress just one of our Exchange DAG members. We used to have to run a second job to do the remaining Exchange servers and stagger the jobs so that we could complete them in a single night. Now I have all of our Exchange servers (6 in total, including the DAG member) plus our BES server in the same job, and with Veeam compression turned off it typically takes between 2-3 hrs for the entire run (often less, depending on the amount of change). In terms of compression, Veeam used to give me about 1.4x compression on the job; with the Nimble I'm seeing about 1.3x. Considering that it takes about 3-4 hrs less time, that's completely acceptable, as I can now run an additional job or two in the same night.

I realize that the CS300 is a faster array than the P2000, but I did a test with another job on the P2000. Just disabling the compression (and keeping the repository on the P2000) still took 1 hr off a similar-sized backup job. So if you want to reduce your backup window without taking (much of) a capacity hit, allowing the Nimble to handle compression duties will definitely help.

rajatsharma0174

Hi Darren,

Great finding! There are a few differences to note here between Veeam and Nimble compression.

1. Veeam has to compress data while transferring it, so compression sits directly in the latency path; within Nimble, we do not have to block on compression, so write latency is not compromised (of course, only up to an extent, to cope with load).

2. The fact that Veeam is getting a higher compression ratio could be:

     a. Either an attribute of the larger block size used for compression, whereas Nimble's compression depends on the block size of the volume.

     b. Or Veeam could be using a stronger compression algorithm or level. Please note that even a slight gain in compression ratio requires exponentially more CPU cycles, hence slowing down the backup - a quick illustration follows the list below.

3. The Exchange data set is not very compressible; 1.3x or 1.4x is what we generally see. It's like squeezing juice from a dry lemon :). The equation might change a bit for other data types, e.g. SQL.
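On point 2b, here's a minimal sketch using Python's zlib as a stand-in for whatever algorithms Veeam and Nimble actually use; the exact numbers vary by machine, but the pattern - small ratio gains for large time costs at higher levels - is the point:

# Illustration only: compression level vs. CPU time, using zlib as a proxy.
import os, time, zlib

# Semi-compressible test data: random bytes mixed with repetitive text.
data = (os.urandom(512) + b"exchange mailbox database page " * 16) * 2048

for level in (1, 6, 9):
    start = time.perf_counter()
    out = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    print(f"level {level}: {len(data) / len(out):.2f}x in {elapsed * 1000:.0f} ms")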

Regards,

Rajat

rajatsharma0174

Perhaps moving just one VM does not tell us much about the compression ratio. Consider this calculation:

Compressed Volume Size = 1.14 TB = 1140 GB

Uncompressed Volume Size = 1140GB + 30GB = 1170 GB

Compression Ratio = 1170/1140 = 1.026 ~= 1.0x
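The same arithmetic as a quick scripted sanity check (decimal units, 1 TB = 1000 GB, to match the figures above):

# Verify the effective compression ratio from the used-space figures.
compressed_gb = 1140      # used space on the compressed volume (GB)
extra_gb = 30             # growth seen after moving to the uncompressed volume
ratio = (compressed_gb + extra_gb) / compressed_gb
print(f"effective compression ratio: {ratio:.3f}x")   # ~1.026x, reported as 1.0x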

Also, even if the data is compressed by the application or filesystem, there can be a small percentage of partially filled blocks (e.g. towards the end of files) which the storage array can still compress. As their percentage is low, this might not show up as compression savings for the volume.
