Quickly Recovering Replication on Hyper-V

Two weeks ago, I had to recover from a sizable power outage. When this happened, my first priority was to make sure that all of my virtual machines were running well. Once I had done this, my next goal was to get Hyper-V Replica back up and running – so that I would be protected against any future problems.

Now, Hyper-V Replica would have eventually sorted itself out – but I did not want to wait for this to happen organically. I wanted things fixed immediately.

Hyper-V Replica had correctly detected that there was a problem, and had scheduled resynchronization for all of my virtual machines. To speed up the process, I shut down all non-critical virtual machines and then used PowerShell to run the following command:

Get-VM -ComputerName Hyper-V-1, Hyper-V-2 | ?{$_.ReplicationMode -eq "Primary" -and $_.ReplicationHealth -eq "Critical"} | Resume-VMReplication -Resynchronize

This caused replica resynchronization to start immediately for all virtual machines that were reporting that replication was in a critical state. At this stage I must give a word of caution. You may be wondering why I shut down non-critical virtual machines before doing this. The reason is that initiating a mass resynchronization like this will generate a huge amount of disk activity, as Hyper-V goes through and rechecks all of the data on disk. I shut down non-critical systems to try to minimize the amount of data churn that occurred during this process. Even with this precautionary step, I could feel the system slow down overall while resynchronization was happening.

But after a relatively short period of time, resynchronization was complete and my computers were (almost) back to normal.
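
If you want to check which virtual machines have not yet returned to a healthy state, a minimal sketch along these lines may help. It reuses the two host names from the command above; the helper name Select-UnhealthyPrimary is made up for this example and is not part of the Hyper-V module.

```powershell
# Hypothetical helper: pass through primary VMs whose replication health is not yet Normal.
function Select-UnhealthyPrimary {
    param([Parameter(ValueFromPipeline = $true)] $VM)
    process {
        # Compare as strings so the filter works on Get-VM output and on plain test objects.
        if ("$($VM.ReplicationMode)" -eq 'Primary' -and "$($VM.ReplicationHealth)" -ne 'Normal') {
            $VM
        }
    }
}

# Only runs where the Hyper-V module is installed; host names are the ones used above.
if (Get-Command Get-VM -ErrorAction SilentlyContinue) {
    Get-VM -ComputerName Hyper-V-1, Hyper-V-2 |
        Select-UnhealthyPrimary |
        Format-Table Name, ReplicationState, ReplicationHealth -AutoSize
}
```

An empty result suggests that every primary virtual machine on those hosts is reporting normal replication health again.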

Cheers,
Ben

Upcoming Preview of 'Disaster Recovery to Azure' Functionality in Hyper-V Recovery Manager

In the coming weeks, we will Preview functionality within Hyper-V Recovery Manager to enable Microsoft Azure as a Disaster Recovery point for virtualized workloads. The new functionality will add support for secure and seamless management of failover and failback operations using Azure IaaS Virtual Machines, thereby enabling our customers to save precious CAPEX and ongoing OPEX incurred in managing a secondary site for Disaster Recovery. Our enhanced DRaaS offering further delivers on our promise of democratizing Disaster Recovery and of making it available to everyone, everywhere. Hyper-V Recovery Manager provides enterprise-scale Disaster Recovery using a single-click failover in the event of a disaster to an alternate enterprise data center or to an IaaS VM in Microsoft Azure. Application and Site Level Disaster Recovery is delivered via automation of the overall DR workflow, smart networking, and frequent testing using DR Drills.

 

We announced the Preview during TechEd 2014. For more details about the upcoming Preview and existing Hyper-V Recovery Manager functionality, check out the DCIM-B322 session recording.

Replication Health Mailer

One of our Engineers, Sangeeth, has come up with a nifty PowerShell script which mails the replication health of a host or a cluster in a nice dashboard format. We thought it would help our customers to get the status of the replicating VMs and their footprint on CPU and memory. You can download the script here.

The sample output from the script looks like this. You can add as many recipients as you wish.

[Image: sample output from the script]

On a cluster, you can run this script on one of the cluster nodes to get information about all cluster VMs. You can also run it against a remote host or a remote cluster by using the “HostorClusterName” parameter. For a cluster, use the “isCluster” parameter to tell the script to get information from the cluster rather than from the local node.

Isn’t it simple and easy to get the replication information about VMs?
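
As a rough illustration of the idea (this is not Sangeeth's script, and the function name ConvertTo-ReplicationReport is made up for this example), the following sketch turns the replication status of the VMs on the local host into a simple HTML table. Mailing the result is left out, since that depends on your SMTP setup.

```powershell
# Hypothetical helper (not the downloadable script): builds an HTML table from VM objects.
function ConvertTo-ReplicationReport {
    param([object[]]$VM)
    $VM |
        # Skip empty input and VMs that do not have replication enabled.
        Where-Object { $_ -and "$($_.ReplicationMode)" -ne 'None' } |
        Select-Object Name, ReplicationState, ReplicationHealth |
        ConvertTo-Html -Fragment |
        Out-String
}

# Only runs where the Hyper-V module is installed; the output file name is an assumption.
if (Get-Command Get-VM -ErrorAction SilentlyContinue) {
    ConvertTo-ReplicationReport -VM (Get-VM) | Set-Content -Path .\replication-health.html
}
```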

Hyper-V Replica trouble-shooting wiki

We are happy to announce the availability of Hyper-V Replica trouble-shooting Wiki here:

http://social.technet.microsoft.com/wiki/contents/articles/21948.hyper-v-replica-troubleshooting-guide.aspx

This guide contains links and resources to trouble-shoot some common Hyper-V Replica failure scenarios. We will be updating the guide over time!

We would like this to be a community effort, so you are free to add content to this guide. To add content, follow this high-level schema for new articles [please feel free to add other sections as appropriate]:

a. Error Messages/Event Viewer details – This section describes the error messages the customer will see in the UI/PS/WMI and the event viewer messages that are logged.

b. Possible Causes – This section lists the scenarios (one or more) which might have led to the failure.

c. Resolution – This section lists the actions the admin has to take in their environment to resolve the failure.

d. Additional resources – List of blogs/KB Articles/Documentation/other articles which contain more information for the customer about the failure.

If you are new to TechNet wiki, the guide on “How to contribute” is here.

Happy WIKI’ing!

Excluding virtual disks in Hyper-V Replica

Since its introduction in Windows Server 2012, Hyper-V Replica has provided a way for users to exclude specific virtual disks from being replicated. This option is rarely exercised but can have significant benefits when used correctly. This blog post covers the disk exclusion scenarios and the impact this has on the various operations done during the lifecycle of VM replication. This blog post has been co-authored by Priyank Gaharwar of the Hyper-V Replica test team.

Why exclude disks?

Excluding disks from replication is done because:

  1. The data churned on the excluded disk is not important or doesn’t need to be replicated, and
  2. Storage and network resources can be saved by not replicating this churn

Point #1 is worth elaborating on a little. What data isn’t “important”? The lens used to judge the importance of replicated data is its usefulness at the time of Failover. Data that is not replicated should also not be needed at the time of failover. Lack of this data would then also not impact the Recovery Point Objective (RPO) in any material way.

There are some specific examples of data churn that can be easily identified and are great candidates for exclusion – for example, page file writes. Depending on the workload and the storage subsystem, the page file can register a significant amount of churn. However, replicating this data from the primary site to the replica site would be resource intensive and yet completely worthless. Thus the replication of a VM with a single virtual disk having both the OS and the page file can be optimized by:

  1. Splitting the single virtual disk into two virtual disks – one with the OS, and one with the page file
  2. Excluding the page file disk from replication

How to exclude disks

Application impact – isolating the churn to a separate disk

The first step in using this feature is to isolate the superfluous churn on a separate virtual disk, similar to what is described above for page files. This is a change to the virtual machine and to the guest. Depending on how your VM is configured and what kind of disk you are adding (IDE, SCSI), you may have to power off your VM before any changes can be made.

At the end, an additional disk should surface in the guest. Appropriate configuration changes should then be made in the application to change the location of the temporary files to point to the newly added disk.

Figure 1: Changing the location of the System Page File to another disk/volume

Excluding disks in the Hyper-V Replica UI

Right-click on a VM and select “Enable Replication…”. This will bring up the wizard that walks you through the various inputs required to enable replication on the VM. The screen titled “Choose Replication VHDs” is where you deselect the virtual disks that you do not want to replicate. By default, all virtual disks will be selected for replication.

Figure 2: Excluding the page file virtual disk from a virtual machine

Excluding disks using PowerShell

The Enable-VMReplication cmdlet provides two optional parameters: -ExcludedVhd and -ExcludedVhdPath. These parameters should be used to exclude the virtual disks at the time of enabling replication.

PS C:\Windows\system32> Enable-VMReplication -VMName SQLSERVER -ReplicaServerName repserv01.contoso.com -AuthenticationType Kerberos -ReplicaServerPort 80 -ExcludedVhdPath 'D:\Primary-Site\Hyper-V\Virtual Hard Disks\SQL-PageFile.vhdx'

After running this command, you will be able to see the excluded disks under VM Settings > Replication > Replication VHDs.

Figure 3: List of disks included for and excluded from replication

Impact of disk exclusion

Enable replication: A placeholder disk (for use during initial replication) is not created on the Replica VM. The excluded disk doesn’t exist on the replica in any form.

Initial replication: The data from the excluded disks is not transferred to the replica site.

Delta replication: The churn on any of the excluded disks is not transferred to the replica site.

Failover: The failover is initiated without the disk that has been excluded. Applications that refer to the disk/volume in the guest will have incorrect configurations. For page files specifically, if the page file disk is not attached to the VM before the VM boots, the page file location is automatically shifted to the OS disk.

Resynchronization: The excluded disk is not part of the resynchronization process.

Ensuring a successful failover

Most applications have configurable settings that make use of file system paths. In order to run correctly, the application expects these paths to be present. The key to a successful failover and an error-free application startup is to ensure that the configured paths are present where they should be. In the case of file system paths associated with the excluded disk, this means updating the Replica VM by adding a disk – along with any subfolders that need to be present for the application to work correctly.

The prerequisites for doing this correctly are:

  • The disk should be added to the Replica VM before the VM is started. This can be done at any time after initial replication completes, but is preferably done immediately after the VM has failed over.
  • The disk should be added to the Replica VM with the exact controller type, controller number, and controller location as the disk has on the primary.
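
As an illustration of the second prerequisite, here is a minimal sketch that reports which excluded disks still have no disk in the matching controller slot on the Replica VM. The helper name Get-MissingDiskSlot and the names 'SQLSERVER' and 'primary01' are placeholders made up for this example; the sketch assumes it is run on the replica host.

```powershell
# Hypothetical helper: returns the excluded disks whose controller slot is still empty on the Replica VM.
function Get-MissingDiskSlot {
    param(
        [object[]]$ExcludedDisks,   # e.g. (Get-VMReplication ...).ExcludedDisks read from the primary
        [object[]]$ReplicaDisks     # e.g. Get-VMHardDiskDrive output for the Replica VM
    )
    # Build a "type/number/location" key for every slot already in use on the replica.
    $used = @($ReplicaDisks | ForEach-Object {
        "$($_.ControllerType)/$($_.ControllerNumber)/$($_.ControllerLocation)"
    })
    # Keep only the excluded disks whose slot key is not in use yet.
    $ExcludedDisks | Where-Object {
        $used -notcontains "$($_.ControllerType)/$($_.ControllerNumber)/$($_.ControllerLocation)"
    }
}

# Only runs where the Hyper-V module is installed; VM and server names are placeholders.
if (Get-Command Get-VMReplication -ErrorAction SilentlyContinue) {
    $excluded = (Get-VMReplication -VMName 'SQLSERVER' -ComputerName 'primary01').ExcludedDisks
    $present  = Get-VMHardDiskDrive -VMName 'SQLSERVER'
    Get-MissingDiskSlot -ExcludedDisks $excluded -ReplicaDisks $present
}
```

Any disk returned by the helper still needs to be attached to the Replica VM at the controller type, number, and location shown.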

There are two ways of making a virtual disk available for use at the time of failover:

  1. Copy the excluded disk manually (once) from the primary site to the replica site
  2. Create a new disk, and format it appropriately (with any folders if required)

When possible, option #2 is preferred over option #1 because of the resources saved from not having to copy the disk. The following PowerShell script can be used to implement option #2, focusing on meeting the prerequisites to ensure that the Replica VM is exactly the same as the primary VM from a virtual disk perspective:

param (
    [string]$VMNAME,
    [string]$PRIMARYSERVER
)

## Get VHD details from primary, replica
$excludedDisks = Get-VMReplication -VMName $VMNAME -ComputerName $PRIMARYSERVER | select ExcludedDisks
$includedDisks = Get-VMReplication -VMName $VMNAME | select ReplicatedDisks

## Nothing to do if the replication details could not be read or no disks were excluded
if( $null -eq $excludedDisks -or -not $excludedDisks.ExcludedDisks ) {
    exit
}

## Get location of first replica VM disk
$replicaPath = $includedDisks.ReplicatedDisks[0].Path | Split-Path -Parent

## Create and attach each excluded disk
foreach( $exDisk in $excludedDisks.ExcludedDisks )
{
    # Get the actual disk object from the primary and display it
    $pDisk = Get-VHD -Path $exDisk.Path -ComputerName $PRIMARYSERVER
    $pDisk

    # Create a new VHD on the Replica, next to the replicated disks
    $diskpath = Join-Path $replicaPath ($pDisk.Path | Split-Path -Leaf)
    $newvhd = New-VHD -Path $diskpath `
                      -SizeBytes $pDisk.Size `
                      -Dynamic `
                      -LogicalSectorSizeBytes $pDisk.LogicalSectorSize `
                      -PhysicalSectorSizeBytes $pDisk.PhysicalSectorSize `
                      -BlockSizeBytes $pDisk.BlockSize `
                      -Verbose
    if( $null -eq $newvhd )
    {
        Write-Host "It is assumed that the VHD [$($pDisk.Path | Split-Path -Leaf)] already exists and has been added to the Replica VM [$VMNAME]"
        continue
    }

    # Mount and format the new VHD
    $newvhd | Mount-VHD -Passthru -Verbose `
            | Initialize-Disk -Passthru -Verbose `
            | New-Partition -AssignDriveLetter -UseMaximumSize -Verbose `
            | Format-Volume -FileSystem NTFS -Confirm:$false -Force -Verbose

    # Unmount the disk
    $newvhd | Dismount-VHD -Passthru -Verbose

    # Attach disk to Replica VM at the same controller position as on the primary
    Add-VMHardDiskDrive -VMName $VMNAME `
                        -ControllerType $exDisk.ControllerType `
                        -ControllerNumber $exDisk.ControllerNumber `
                        -ControllerLocation $exDisk.ControllerLocation `
                        -Path $newvhd.Path `
                        -Verbose
}

The script can also be customized for use with Azure Hyper-V Recovery Manager, but we’ll save that for another post!

Capacity Planner and disk exclusion

The Capacity Planner for Hyper-V Replica allows you to forecast your resource needs. It allows you to be more precise about the replication inputs that impact the resource consumption – such as the disks that will be replicated and the disks that will not be replicated.

Figure 4: Disks excluded for capacity planning

Key Takeaways

  1. Excluding virtual disks from replication can save on storage, IOPS, and network resources used during replication
  2. At the time of failover, ensure that the excluded virtual disk is attached to the Replica VM
  3. In most cases, the excluded virtual disk can be recreated on the Replica side using the PowerShell script provided

See you at TechEd North America!

Here’s a quick intro to the Hyper-V people attending TechEd North America in Houston next week.  I’m also posting booth times for each of us.

The booth’s official name is: Datacenter & Infrastructure Management: Cloud & Datacenter Infrastructure Solutions

It will be at the center of the Expo Hall.

 

*Note* 

These are the times we have to be at a booth.  Chances are we’ll be there beyond these hours.

 

Sam Chandrashekar
Wednesday: 3:00 PM – 6:00 PM
Thursday: 10:45 AM – 12:45 PM

Patrick Lang
Wednesday: 3:00 PM – 6:00 PM
Thursday: 10:45 AM – 12:45 PM

Taylor Brown
Monday: 10:15 AM – 12:15 PM

Sarah Cooley – I’m on loan to the Windows team. Come talk to me next door at the Windows, Phone, & Devices block: Mobility
Monday: 5:45 PM – 8:30 PM
Tuesday: 10:45 AM – 12:30 PM and 2:15 PM – 4:00 PM
Wednesday: 10:45 AM – 1:00 PM and 3:00 PM – 6:00 PM
Thursday: 10:45 AM – 12:45 PM

Ben Armstrong
Monday: 10:15 AM – 12:15 PM
Tuesday: 10:45 AM – 12:30 PM
Wednesday: 12:45 PM – 3:15 PM
Thursday: 10:45 AM – 12:45 PM

Jeff Woolsey – running around

 

Again, feel free to stop by and talk to us. Looking forward to seeing you at TechEd.

 


[From left to right: Taylor, Sam, Ben, Sarah, Jeff, Patrick]

 

Cheers,

Sarah