Around the Storage Block

What’s so great about StoreOnce federated deduplication?

By Ashwin Shetty, Product Marketing, HPE Storage


It has been more than two years since we announced federated deduplication. This technology addresses many of the shortcomings of other deduplication solutions, such as incompatible deduplication algorithms implemented in software and hardware, restore performance that lags behind backup performance, and inefficient methods of scaling deduplication to meet ever-expanding capacity requirements. All of these challenges result in increased risk, additional cost, and significant management overhead for the user. So, let’s take a closer look at HPE StoreOnce deduplication and the benefits it brings.





Federated deduplication

StoreOnce federated deduplication is a technology developed in HP Labs that can be deployed across the entire storage infrastructure. It provides deployment independence, enabling data to move between HPE systems without being rehydrated. Federated deduplication allows data reduction to occur in HPE Data Protector Software, in HPE StoreOnce Backup appliances, in a virtual machine, or as a standalone deduplication API on a server.


Optimized data deduplication

HPE StoreOnce deduplication technology includes several innovations from HP Labs:

  • It aims to eliminate as much data redundancy as possible while maintaining a small index to deliver the fastest performance.
  • It splits the backup data stream into a series of chunks using a variable chunking deduplication algorithm. Along with this algorithm, StoreOnce employs a chunk size of 4K, the smallest in the industry. Smaller chunks produce a higher percentage of “data matches,” which results in higher deduplication ratios. The 4K chunk size also allows for better alignment with mixed workloads. Most vendors in the industry use chunk sizes of 8K, 16K, or 32K.
  • It deploys locality sampling and sparse indexing to significantly lower IO and RAM requirements without compromising on performance.
  • It deploys intelligent data matching techniques that enable optimal performance for multiple data types and backup software deployments with no added complexity.
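
To make the variable chunking idea above more concrete, here is a minimal Python sketch of content-defined chunking feeding a hash-addressed chunk store. It is an illustration under stated assumptions, not HPE's implementation: the gear-style rolling hash, the `GEAR` table, the `MASK`, `MIN_CHUNK` and `MAX_CHUNK` values, and the `ChunkStore` class are all hypothetical choices made for this example.

```python
import hashlib
import random

# All constants below are assumptions chosen for illustration only.
MIN_CHUNK = 1024          # never cut a chunk shorter than this
MAX_CHUNK = 16384         # always cut once a chunk reaches this size
MASK = 0xFFF << 20        # 12 mask bits: a cut roughly every 4096 bytes past MIN_CHUNK
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]  # one random value per byte value


def split_chunks(data: bytes):
    """Yield variable-size chunks whose boundaries depend on the content."""
    start = 0
    h = 0
    for i, b in enumerate(data):
        # Gear-style rolling hash: old bytes shift out of the 64-bit state.
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFFFFFFFFFF
        length = i - start + 1
        if length < MIN_CHUNK:
            continue
        if (h & MASK) == 0 or length >= MAX_CHUNK:
            yield data[start:i + 1]
            start = i + 1
            h = 0
    if start < len(data):
        yield data[start:]  # trailing partial chunk


class ChunkStore:
    """Hypothetical store that keeps each distinct chunk exactly once."""

    def __init__(self):
        self.chunks = {}  # SHA-256 digest -> chunk bytes

    def write(self, data: bytes):
        """Store data and return its recipe (ordered list of chunk digests)."""
        recipe = []
        for chunk in split_chunks(data):
            digest = hashlib.sha256(chunk).digest()
            self.chunks.setdefault(digest, chunk)  # keep only the first copy
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        """Rebuild the original data from a recipe."""
        return b"".join(self.chunks[d] for d in recipe)

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())
```

Because the cut points depend on the bytes themselves rather than on fixed offsets, inserting a few bytes into a stream changes only the chunks near the edit. Later chunks realign and deduplicate against the copy already in the store.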

For restoring data, HPE’s approach relies on large-container technology with superior disk layouts. A high degree of fragmentation is avoided by not replacing small amounts of duplicate data with pointers to faraway places that hold no related data. Data is defragmented after deduplication. This results in faster restores, because reconstituting data does not require many slow random seeks.


StoreOnce Catalyst

StoreOnce Catalyst is a key component in our federated deduplication architecture that enables deduplication anywhere, rather than only at the specific points in the network allowed by a vendor’s technology. It leverages a common algorithm across the enterprise and allows deduplication at the:

  • Production source or “client”
  • Backup or media server
  • Target appliance

Catalyst provides a single technology that can be used in multiple locations on the network without requiring rehydration when data is transferred between the source server, the backup device, and the target appliance.
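
The general idea of deduplicating at the source and moving only new data can be sketched in a few lines of Python. This is a hedged illustration of the concept, not the Catalyst protocol or API: the `Target` class, its `missing`, `commit`, and `restore` methods, the `backup` function, and the fixed 4096-byte chunking are all assumptions made to keep the example short.

```python
import hashlib

CHUNK_SIZE = 4096  # fixed-size chunks, an assumption that keeps the sketch short


def split_fixed(data: bytes):
    """Cut data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]


class Target:
    """Hypothetical target appliance holding chunks keyed by SHA-256 digest."""

    def __init__(self):
        self.chunks = {}   # digest -> chunk bytes
        self.backups = {}  # backup name -> ordered list of digests

    def missing(self, digests):
        """Report which of the offered digests are not stored yet."""
        return {d for d in digests if d not in self.chunks}

    def commit(self, name, recipe, new_chunks):
        """Accept the new chunks and record the recipe for this backup."""
        self.chunks.update(new_chunks)
        self.backups[name] = recipe

    def restore(self, name):
        return b"".join(self.chunks[d] for d in self.backups[name])


def backup(target: Target, name: str, data: bytes) -> int:
    """Send digests first, then only the chunks the target lacks.

    Returns the number of chunk payload bytes actually transferred.
    """
    chunks = split_fixed(data)
    recipe = [hashlib.sha256(c).digest() for c in chunks]
    wanted = target.missing(recipe)
    payload = {d: c for d, c in zip(recipe, chunks) if d in wanted}
    target.commit(name, recipe, payload)
    return sum(len(c) for c in payload.values())
```

In this sketch, a second backup that differs from the first in a single chunk transfers only that chunk. The data stays in deduplicated form from source to target, and it is reassembled only when `restore` is called.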


Data replication

HP StoreOnce deduplication enables network-efficient offsite data replication. All HPE StoreOnce Backup systems use StoreOnce federated deduplication to significantly reduce the amount of data that needs to be replicated, enabling the use of lower-bandwidth, lower-cost links to transmit data offsite. StoreOnce-enabled replication allows cost-effective centralized backup from remote sites or branch offices, and delivers a consolidated disaster recovery solution for the data center.


Multiple StoreOnce appliances and virtual machines can replicate to a central StoreOnce appliance, with a fan-in of up to 384 remote offices to a single HP StoreOnce 6500 target, delivering greater economies of scale for disaster recovery. HPE StoreOnce VSA can be deployed at:

  • Remote offices: Your remote offices can deploy local backup/recovery solutions using your existing IT infrastructure without any dedicated backup appliance. You can set up the backup server and storage appliance to be delivered virtually. The deduplicated data within the VSA can be replicated from the remote offices to a larger physical HP StoreOnce appliance at your data center.
  • Cloud providers: Subscribers who do not have a deduplication solution can now use HPE StoreOnce VSA for high-speed local recovery (with their existing backup software), and then replicate the data via HPE federated deduplication to the service provider’s HPE StoreOnce repository. As a cloud provider, you also have the option of providing an individual StoreOnce VSA per subscriber. Each subscriber will have a completely autonomous deduplication appliance within the cloud without the need to replace backup software.

Meeting big data requirements

By combining federated deduplication with our unique scale-out cluster architecture, we deliver the industry’s only large-scale deduplication appliance with fully automated high-resiliency features such as high availability (dual controllers) and autonomic restart of failed backup jobs. Our solution is designed to simplify and speed up Big Data backup with scalable capacity and performance.


All that—and a deduplication guarantee too

We are so confident about our federated deduplication technology that when companies move from a legacy storage backup system to any HP StoreOnce Backup solution, we are willing to offer a 20:1 deduplication guarantee through our HP StoreOnce Get Protected Guarantee Program. The guarantee promises a 95% reduction in stored backup data, or HP will make up the difference with free disk capacity and support.
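
The link between a 20:1 ratio and a 95% reduction is simple arithmetic: at N:1, only 1/N of the data is stored, so the saving is 1 - 1/N. The short Python sketch below works this through. The function names and the 100 TB figure are illustrative assumptions, not part of the guarantee program.

```python
def reduction_from_ratio(ratio: float) -> float:
    """Fraction of backup data eliminated at a given N:1 deduplication ratio."""
    if ratio < 1:
        raise ValueError("a deduplication ratio cannot be below 1:1")
    return 1.0 - 1.0 / ratio


def stored_size(logical_size: float, ratio: float) -> float:
    """Physical capacity needed to hold logical_size at an N:1 ratio."""
    if ratio < 1:
        raise ValueError("a deduplication ratio cannot be below 1:1")
    return logical_size / ratio


if __name__ == "__main__":
    # 20:1 means one twentieth of the data is kept, a 95% reduction.
    print(f"{reduction_from_ratio(20):.0%}")
    # Hypothetical example: 100 TB of backups would occupy 5 TB at 20:1.
    print(stored_size(100, 20))
```

The same formula shows why gains flatten at high ratios: going from 10:1 to 20:1 moves the reduction only from 90% to 95%.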


About the Author


Our team of Hewlett Packard Enterprise storage experts helps you to dive deep into relevant infrastructure topics.