Around the Storage Block

Setup for SRM 8.3 and HPE Nimble Storage vVols

In part 1 of this blog series, we saw how to set up vSphere 7.0 at two sites, with two HPE Nimble Storage arrays, for disaster recovery with SRM. In this part 2 article, we will set up SRM.

VMware SRM is a business continuity and disaster recovery solution that helps plan, test, and run the recovery of VMs between a protected vCenter site and a recovery vCenter site. The SRM documentation has now been updated for version 8.3 and includes support for vVols. According to the new product blog, SRM 8.3 is fully integrated with vVols using array-based replication. Because this integration uses vVol replication and the VASA 3.0 protocol specification, it supports a protection granularity as small as a single VM. Multiple VMs can also be replicated together in a replication group that is maintained as a consistency group.

VASA 3.0 vVol replication concepts

The vSphere 6.5 release introduced a set of functions to replicate and recover VM data at the granularity of vVols. Instead of using SRAs, the new functions for vVol replication use VASA to orchestrate replication and recovery workflows. In array-based replication, storage arrays and SRAs orchestrate protection and failover. In vVol-based replication, the unit of replication is the virtual volume, coordinated by VASA providers.

A Replication Group (RG) is a group of replicated storage devices defined by a storage administrator (perhaps acting on behalf of an application owner) to provide atomic failover for an application. In other words, an RG is the minimum unit of failover. In vSphere 6.5, vVol RGs are created and managed by a storage administrator using the storage vendor’s tools. For HPE Nimble Storage, RGs are automatically created by the VASA Provider (VP) on the array. These RGs correspond to volume collections on the array and are replicated together, on schedules, to a replication partner.

A Fault Domain (FD) abstracts the notion that a set of resources must fail together. Examples include a rack, an array, a storage processor, or a network switch. For HPE Nimble Storage arrays, a Nimble “Group” is a Fault Domain: a collection of up to 4 Nimble arrays.

Associated with every Replication Group at the recovery site is a set of Point in Time (PiT) replicas. Each replica has an associated timestamp, which corresponds to a PiT that can be used to choose a particular replica for recovery. Such replicas may be created implicitly (automatically) by a periodic trigger (e.g. every 15 minutes) or explicitly (for example, creating a replica before patching the OS). For HPE Nimble Storage arrays, creating a PiT results in a snapshot collection on the array’s volume collection that corresponds to the specific RG.
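To make the timestamp-based choice concrete, here is a minimal sketch of picking a PiT replica for recovery. The `PitReplica` class, the `pick_replica` function, and the replica names are hypothetical and only model the idea; they are not part of VASA, SRM, or the Nimble array API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class PitReplica:
    """One point-in-time replica of a replication group (hypothetical model)."""
    replica_id: str
    timestamp: datetime


def pick_replica(replicas: List[PitReplica], not_after: datetime) -> Optional[PitReplica]:
    """Return the newest replica taken at or before `not_after`, or None."""
    candidates = [r for r in replicas if r.timestamp <= not_after]
    return max(candidates, key=lambda r: r.timestamp, default=None)


# Replicas created by an assumed 15-minute schedule.
replicas = [
    PitReplica("snap-0900", datetime(2020, 6, 1, 9, 0)),
    PitReplica("snap-0915", datetime(2020, 6, 1, 9, 15)),
    PitReplica("snap-0930", datetime(2020, 6, 1, 9, 30)),
]

# Recover to the state just before a problem noticed at 09:20.
chosen = pick_replica(replicas, datetime(2020, 6, 1, 9, 20))
print(chosen.replica_id if chosen else "no replica")  # prints "snap-0915"
```

With a 15-minute schedule, the worst case in this model is recovering to a state up to 15 minutes older than the requested time.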

The vSphere 6.5 release discovers the Fault Domain hierarchy but leaves the management of Fault Domains to the replicators. Replication Groups are likewise discovered by vSphere but not managed by the VMware platform. Instead, administrators or arrays are expected to create, update, or delete these Replication Groups as part of how storage administrators set up replication.

VASA 3.0 is prescriptive about the operational target of actions.  For example, vSphere executes “synchronize replication group” before starting a test failover. In VASA 3.0, “synchronize replication group” is always called on the target fault domain.
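The ordering and targeting rule can be pictured with a small toy model. The `FaultDomain` class and its method names below are hypothetical stand-ins, not the real VASA calls; the sketch only records which side receives which request. Sending the test failover itself to the target is an assumption made for this illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FaultDomain:
    """Toy fault domain that records the requests it receives (hypothetical)."""
    name: str
    calls: List[str] = field(default_factory=list)

    def synchronize_replication_group(self, rg: str) -> None:
        self.calls.append(f"sync:{rg}")

    def start_test_failover(self, rg: str) -> None:
        self.calls.append(f"test_failover:{rg}")


def run_test_failover(source: FaultDomain, target: FaultDomain, rg: str) -> None:
    """Synchronize first, then start the test failover, both on the target."""
    # The sync is addressed to the target fault domain, never the source.
    target.synchronize_replication_group(rg)
    # Assumption for this sketch: the test failover is also started on the target.
    target.start_test_failover(rg)


site_a = FaultDomain("nimble-group-A")
site_b = FaultDomain("nimble-group-B")
run_test_failover(site_a, site_b, "rg-1")
print(site_b.calls)  # prints "['sync:rg-1', 'test_failover:rg-1']"
print(site_a.calls)  # prints "[]"
```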

The VASA 3.0 specification supports a new workflow to test and then promote to production, and also proposes to support continuous data protection (CDP).

The following restrictions apply:

  • Replication of a VM with disks in multiple storage containers is not a supported use case.
  • A single VM cannot span multiple Replication Groups. However, not all of a VM’s disks must be replicated, although its config-vVol must be replicated for successful recovery.
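These restrictions can be expressed as a simple pre-check. The sketch below is illustrative only: the `VVol` class, its fields, and `check_vm` are hypothetical names, and real validation is done by vSphere and the VASA provider.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VVol:
    """One virtual volume of a VM (hypothetical model)."""
    name: str
    kind: str                          # "config" or "data"
    container: str                     # storage container (vVol datastore)
    replication_group: Optional[str]   # None means the vVol is not replicated


def check_vm(vvols: List[VVol]) -> List[str]:
    """Return the restrictions a VM violates; an empty list means it passes."""
    problems = []
    if len({v.container for v in vvols}) > 1:
        problems.append("disks span multiple storage containers")
    groups = {v.replication_group for v in vvols if v.replication_group is not None}
    if len(groups) > 1:
        problems.append("VM spans multiple replication groups")
    if not any(v.kind == "config" and v.replication_group is not None for v in vvols):
        problems.append("config-vVol is not replicated")
    return problems


# A data disk may stay unreplicated as long as the config-vVol is replicated.
vm = [
    VVol("vm1-config", "config", "vvol-ds-A", "rg-1"),
    VVol("vm1-disk0", "data", "vvol-ds-A", "rg-1"),
    VVol("vm1-scratch", "data", "vvol-ds-A", None),
]
print(check_vm(vm))  # prints "[]"
```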

 

Workflow Synopsis

Replication Discovery: vVol disaster recovery discovers the current replication relationships between two fault domains. The list of vVols that are replicated to a specific replication peer is discovered.

Test Failover: To ensure that the recovered workloads will be functional after a failover, administrators periodically run the test-failover workflow.

Disaster Recovery and Planned Migration: For planned migration, on-demand sync can be initiated at the recovery site. This is followed by failover.

Setting up protection after a DR event: After the recovery on a peer site, administrators can set up protection in the reverse direction.

Now that the theory is laid out, let’s get practical and learn how these elements are installed and configured.

SRM install and configure

We recommend using the SRM VA on both the protected and recovery sites. Follow the SRM install and configure documentation on VMware’s website. For those of you familiar with using Nimble Storage with SRM for VMFS datastores and the SRA, please note that vVol VMs DO NOT need the SRA for DR.

After deploying the SRM VA on both sites, the SRM plug-in appears in the vSphere client.

Figure 34: SRM plugin in vCenter

Log in to the SRM AMI (Appliance Management Interface) at https://<ip-or-fqdn-of-SRM-va>:5480 as user admin, with the password specified during the VA deployment. Go to “Time” --> Time Synchronization --> Edit --> change “Mode” to NTP and add your time server here.

Figure 35: Set NTP for SRM

Next, connect the SRM instances on the protected and recovery sites. This is known as site pairing. In the vSphere web client, click Site Recovery --> Open Site Recovery --> New Site Pair and follow the prompts to associate the PSCs from the two sites. Then select the vCenter server and the services you want to pair.

Figure 37: New site pair

Figure 38: Second site PSC

Figure 39: Second site vCenter and SRM

Figure 40: Ready to complete pairing

Figure 41: Site Pair configured

Confirm that the protected and recovery sites are connected, under Site Pairs on the SRM tab.

SRM Administration

Now we come to the administration part of SRM. Refer to the latest SRM Admin Guide as needed. vVols supports replication, test recovery, test cleanup, planned migration, disaster recovery and reprotect. With array-based replication, you can offload replication of VMs to your storage array and use the full replication capabilities of the array. You can group several VMs to replicate them as a single unit.

vVols replication is policy driven. After you configure your vVols storage for replication, information about replication capabilities and replication groups is delivered from the array by the storage provider. This information shows in the VM Storage Policy interface of vCenter Server.

You use the VM storage policy to describe replication requirements for your VMs. The parameters that you specify in the storage policy depend on how your array implements replication. For example, your VM storage policy might include such parameters as the replication schedule, replication frequency, or recovery point objective (RPO). The policy might also indicate the replication target, a secondary site where your VMs are replicated, or specify whether replicas must be deleted.

For HPE Nimble Storage, the VM Storage policy can define all these requirements:

  • Deduplication: yes or no
  • Application policy: decides your volumes’ block size and space domain
  • Performance: limits on IOPS and/or MiB/sec per volume
  • Data encryption: encrypt the volumes at rest
  • Protection: snapshot schedules (minutely, hourly, daily, weekly)

By assigning the replication policy during VM provisioning, you request replication services for your VM. After that, the array takes over the management of all replication schedules and processes.

Configure Mappings

Let’s go ahead and configure mappings in SRM. Mappings allow you to specify how SRM maps virtual machine resources on the protected site to resources on the recovery site. 

Navigate to the SRM instance at site A and go to Configure --> Network Mappings, Folder Mappings, Resource Mappings, and Storage Policy Mappings. Choose the resources from site A and site B that will be used for recovered VMs and for reprotect. After you make these changes in SRM on site A, you can check that SRM on site B reflects the same settings.
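Conceptually, each mapping type is a lookup table from a site A resource to a site B resource, and a reverse mapping is the same table inverted for use after reprotect. The sketch below is only a mental model with made-up resource names. It assumes one-to-one mappings for simplicity, which SRM itself may not require.

```python
from typing import Dict


def reverse_mapping(forward: Dict[str, str]) -> Dict[str, str]:
    """Invert a site A -> site B mapping into a site B -> site A mapping."""
    reverse: Dict[str, str] = {}
    for protected, recovery in forward.items():
        if recovery in reverse:
            # Two protected resources share one recovery resource, so the
            # reverse direction would be ambiguous in this simple model.
            raise ValueError(f"{recovery!r} is mapped from more than one resource")
        reverse[recovery] = protected
    return reverse


# Hypothetical network mapping configured on site A.
network_mapping = {
    "siteA-VM-Network": "siteB-VM-Network",
    "siteA-DB-Network": "siteB-DB-Network",
}

print(reverse_mapping(network_mapping)["siteB-DB-Network"])  # prints "siteA-DB-Network"
```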

For the network mapping, you can set a different “Test Network” for recovery. An isolated network can be auto-created. Creating a test network allows the test to run without potentially disrupting VMs in the production environment.

Figure 42: Storage policy mappings

Figure 43: Recovery storage policies

Figure 44: Reverse mappings

You must configure a placeholder datastore on both sites in the pair to establish bidirectional protection and to perform reprotect. For Nimble Storage vVols replication, do not use a vVols datastore as the placeholder datastore. Instead, use a VMFS datastore.

Figure 45: Placeholder datastores

Create Protection Groups

Now we can create Protection Groups (PGs). A PG is a collection of VMs that SRM protects together. For vVols, we simply select the Fault Domains (FDs), which happen to be our arrays, and then all the Replication Groups (RGs) under those FDs to be part of the PG. One or more PGs form a recovery plan (RP). An RP specifies how SRM recovers the VMs in the PGs that it contains. If you did not previously configure inventory mappings, you can configure them for each VM in a PG.

After you create a vVols PG, SRM creates placeholder VMs on the recovery site and applies inventory mappings to each VM in the group. If SRM cannot map a VM to a folder, network or resource pool on the recovery site, SRM sets the VM to the “Mapping Missing” status and does not create a placeholder VM for it.
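The placeholder rule described above can be sketched as a lookup against the three inventory mappings. Everything in this example is hypothetical (the `VmInventory` class, the `placeholder_status` function, the resource names, and the "OK" status label); only the “Mapping Missing” status name comes from SRM.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class VmInventory:
    """Where a protected VM lives on the protected site (hypothetical model)."""
    name: str
    folder: str
    network: str
    resource_pool: str


def placeholder_status(
    vm: VmInventory,
    folder_map: Dict[str, str],
    network_map: Dict[str, str],
    resource_map: Dict[str, str],
) -> str:
    """Return "OK" if every resource maps to the recovery site, else "Mapping Missing"."""
    mapped = (
        vm.folder in folder_map
        and vm.network in network_map
        and vm.resource_pool in resource_map
    )
    return "OK" if mapped else "Mapping Missing"


folder_map = {"siteA/Prod": "siteB/Prod"}
network_map = {"siteA-VM-Network": "siteB-VM-Network"}
resource_map = {"siteA-Cluster": "siteB-Cluster"}

web = VmInventory("web01", "siteA/Prod", "siteA-VM-Network", "siteA-Cluster")
db = VmInventory("db01", "siteA/Prod", "siteA-DB-Network", "siteA-Cluster")

print(placeholder_status(web, folder_map, network_map, resource_map))  # prints "OK"
print(placeholder_status(db, folder_map, network_map, resource_map))   # prints "Mapping Missing"
```

In this model, `db01` gets no placeholder VM until a mapping for its network is added.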

Do not create PGs that combine VMs that use array-based replication, vVols replication or storage policy based replication.

When using vVols protection groups, SRM checks both the recovery and the protected site and matches the vVols configurations that can be used. That includes paired fault domains, direction of replication, and so on.

There are certain limitations to vVols protection groups:

  • SRM does not support protection of VMs that have non-replicated virtual disks with vVols protection groups.
  • SRM does not support the protection of VMs whose vVols-based disks are replicated by different storage policies or belong to different vVols replication groups.
  • vVols does not support the recovery of template VMs.
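Note that these SRM limits are stricter than the VASA-level rules earlier in this article: here every virtual disk must be replicated. As a rough illustration, the limitations can be written as a pre-flight check. The `Disk` and `Vm` classes and `protection_blockers` are hypothetical names for this sketch and are not an SRM API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Disk:
    """One vVols-based virtual disk (hypothetical model)."""
    name: str
    storage_policy: str
    replication_group: Optional[str]  # None means the disk is not replicated


@dataclass(frozen=True)
class Vm:
    name: str
    is_template: bool
    disks: Tuple[Disk, ...]


def protection_blockers(vm: Vm) -> List[str]:
    """List the reasons a VM cannot be protected by a vVols protection group."""
    blockers = []
    if vm.is_template:
        blockers.append("template VM")
    if any(d.replication_group is None for d in vm.disks):
        blockers.append("non-replicated virtual disk")
    if len({d.storage_policy for d in vm.disks}) > 1:
        blockers.append("disks use different storage policies")
    if len({d.replication_group for d in vm.disks if d.replication_group is not None}) > 1:
        blockers.append("disks belong to different replication groups")
    return blockers


app = Vm(
    "app01",
    False,
    (
        Disk("disk0", "gold-replicated", "rg-1"),
        Disk("disk1", "gold-replicated", "rg-1"),
    ),
)
print(protection_blockers(app))  # prints "[]"
```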

Go to SRM --> Protection Groups --> New and follow the prompts to create a new PG. This set of steps is performed at site A, the protected site.

Figure 46: New Protection Group

Choose Type as "Virtual Volumes (vVol replication)".

Figure 47: Choose vVol replication type for PG

Select the replication groups to be part of this PG by checking their checkboxes. Clicking on the small arrow shows the current status of all VMs in the RG.

Figure 48: Select replication groups to be part of this PG

In the next step, choose an existing recovery plan, or create a new one. A protection group can be a part of many recovery plans.

Figure 49: Create a new recovery plan

Confirm all settings for the new Protection Group to be created.

Figure 50: Summary of new Protection Group to be created

After you click “Finish”, a set of tasks is kicked off on the vCenter at site B, and new placeholder VMs are created on the recovery site B. You can also confirm the state of the new Protection Group and Replication Group.

Figure 51: site B VMs and tasks

Figure 52: Newly created Protection Group

Watch this space next week for the final part of this 3-part series on SRM and vVols!

 

HPE Nimble Storage
About the Author

mamatadesaiNim

VMware and Nimble Storage QA engineer