Category Archives: Hyper-V

How to Run a Windows Failover Cluster Validation Test

Guest clustering describes an increasingly popular deployment configuration for Windows Server Failover Clusters where the entire infrastructure is virtualized. With a traditional cluster, the hosts are physical servers and run virtual machines (VMs) as their highly available workloads. With a guest cluster, the hosts are also VMs which form a virtual cluster, and they run additional virtual machines nested within them as their highly available workloads. Microsoft now recommends dedicating clusters for each class of enterprise workload, such as your Exchange Server, SQL Server, File Server, etc., because each application has different cluster settings and configuration requirements. Setting up additional clusters became expensive for organizations when they had to purchase and maintain more physical hardware. Other businesses wanted guest clustering as a cheaper test, demo or training infrastructure. To address this challenge, Microsoft Hyper-V supports “nested virtualization” which allows you to create virtualized hosts and run VMs from them, creating fully-virtualized clusters. While this solves the hardware problem, it has created new obstacles for backup providers as each type of guest cluster has special considerations.

Hyper-V Guest Cluster Configuration and Storage

Let’s first review the basic configuration and storage requirements for a guest cluster. Fundamentally a guest cluster has the same requirements as a physical cluster, including two or more hosts (nodes), a highly available workload or VM, redundant networks, and shared storage. The entire solution must also pass the built-in cluster validation tests. You should also force every virtualized cluster node to run on a different physical host, so that the failure of a single server does not bring down your entire guest cluster. This can be easily configured using Failover Clustering’s AntiAffinityClassNames or Azure Availability Sets. Some of the guest cluster requirements will also vary based on the nested virtualized application which you are running, so always check for workload-specific requirements during your planning.
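
As a rough illustration, the AntiAffinityClassNames property can be set with the FailoverClusters PowerShell module. This is a minimal sketch: the group and class names are placeholders, and the same goal is achieved with Availability Sets when the hosts run in Azure.

    # Run on a node of the physical (host) cluster.
    # Give every VM group that hosts a guest cluster node the same anti-affinity class name,
    # so the cluster tries to keep those VMs on different physical hosts.
    Import-Module FailoverClusters
    $class = New-Object System.Collections.Specialized.StringCollection
    $class.Add("GuestClusterNodes") | Out-Null
    (Get-ClusterGroup -Name "SQL-GuestNode1").AntiAffinityClassNames = $class
    (Get-ClusterGroup -Name "SQL-GuestNode2").AntiAffinityClassNames = $class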

Shared storage used to be a requirement for all clusters because it allows the workload or VM to access the same data regardless of which node is running that workload. When the workload fails over to a different node, its services get restarted, then it accesses the same shared data which it was previously using. Windows Server 2012 R2 and later support guest clusters with shared storage using a shared VHDX disk, iSCSI or virtual fibre channel. Microsoft added support for replicating local DAS storage using Storage Spaces Direct (S2D) in Windows Server 2016 and continued to improve S2D with the latest 2019 release.

For a guest cluster deployment guide, you can refer to the documentation provided by Microsoft to create a guest cluster using Hyper-V. If you want to do this in Microsoft Azure, then you can also follow enabling nested virtualization within Microsoft Azure.

Backup and Restore the Entire Hyper-V Guest Cluster

The easiest backup solution for guest clustering is to save the entire environment by protecting all the VMs in that set. This has almost-universal support from third-party backup vendors such as Altaro, as it is essentially just protecting traditional virtual machines which have a relationship to each other. If you are using another VM as part of the set as an isolated domain controller, iSCSI target or file share witness, make sure it is backed up too.

A (guest) cluster-wide backup is also the easiest solution for scenarios where you wish to clone or redeploy an entire cluster for test, demo or training purposes by restoring it from a backup. If you are restoring a domain controller, make sure you bring this back online first. If you are deploying copies of a VM, especially one that contains a domain controller, make sure the images have been Sysprepped so that they receive new global identifiers and do not cause conflicts. Also, use DHCP to get new IP addresses for all network interfaces. In this scenario, it is usually much easier to just deploy this cloned infrastructure in a fully isolated environment so that the cloned domain controllers do not cause conflicts.

The downside to cluster-wide backup and restore is that you will lack the granularity to protect and recover a single workload (or item) running within the VM, which is why most admins will select another backup solution for guest clusters. Before you pick one of the alternative options, make sure that both your storage and backup vendor support this guest clustering configuration.

Backup and Restore a Guest Cluster using iSCSI or Virtual Fibre Channel

When guest clusters first became supported for Hyper-V, the most popular storage configurations were to use an iSCSI target or virtual fibre channel. iSCSI was popular because it was entirely Ethernet-based, which means that inexpensive commodity hardware could be used, and Microsoft offered a free iSCSI Target Server. Virtual fibre channel was also prevalent since it was the first type of SAN-based storage supported by Hyper-V guest clusters through its virtualized HBAs. Either solution works fine and most backup vendors support Hyper-V VMs running on these shared storage arrays. This is a perfectly acceptable solution for reliable backups and recovery if you are deploying a stable guest cluster. The main challenge was that in its earlier versions, Cluster Shared Volumes (CSV) disks and live migration had limited support from vendors. This meant that basic backups would work, but there were a lot of scenarios that would cause backups to fail, such as when a VM was live migrating between hosts. Most scenarios are now supported in production, but still make sure that your storage and backup vendors support and recommend this configuration.

Backup and Restore a Guest Cluster using a Shared Virtual Hard Disk (VHDX) & VHD Set

Windows Server 2012 R2 introduced a new type of shared storage disk which was optimized for guest clustering scenarios, known as the shared virtual hard disk (.vhdx file), or Shared VHDX. This allowed multiple VMs to simultaneously access a single data file which represented a shared disk (similar to a drive shared by an iSCSI target). This disk could be used as a witness disk, or more commonly to store shared application data used by the workload running on the guest cluster. This Shared VHDX file could either be stored on a CSV disk or an SMB file share (using a Scale-Out File Server).

This first release of a shared virtual hard disk had some limitations and was generally not recommended for production. The main criticisms were that backups were not reliable, and backup vendors were still catching up to support this new format. Windows Server 2016 addressed these issues by adding support for online resizing, Hyper-V Replica, and application-consistent checkpoints. These enhancements were released as a newer Hyper-V VHD Set (.vhds) file format. The VHD Set included additional file metadata which allowed each node to have a consistent view of that shared drive’s metadata, such as the block size and structure. Prior to this, nodes might have an inconsistent view of the Shared VHDX file structure which could cause backups to fail.
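
As a rough sketch of provisioning on Windows Server 2016 or later, the following creates a VHD Set file and attaches it to two guest cluster nodes. The paths, VM names, and size are placeholders, and the -SupportPersistentReservations switch (used for classic Shared VHDX) is assumed to apply to VHD Sets as well; verify against the documentation for your OS build.

    # Create a VHD Set (.vhds) on a Cluster Shared Volume of the host cluster
    New-VHD -Path "C:\ClusterStorage\Volume1\GuestClusterData.vhds" -SizeBytes 100GB -Dynamic

    # Attach the same VHD Set to each guest cluster node as a shared drive
    Add-VMHardDiskDrive -VMName "GuestNode1" -Path "C:\ClusterStorage\Volume1\GuestClusterData.vhds" -SupportPersistentReservations
    Add-VMHardDiskDrive -VMName "GuestNode2" -Path "C:\ClusterStorage\Volume1\GuestClusterData.vhds" -SupportPersistentReservations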

While the VHD Set format was optimized to support guest clusters, there were inevitably some issues discovered which are documented by Microsoft Support. When using Shared VHDX / VHD Sets for your guest cluster, make sure that all of your storage, virtualization, and clustering components are patched with any related hotfixes specific to your environment, including any from your storage and backup provider. Also, explicitly check that your ISVs support this updated file format and follow Microsoft’s best practices. Today this is the recommended deployment configuration for most new guest clusters.

Backup and Restore a Guest Cluster using Storage Spaces Direct (S2D)

Microsoft introduced another storage management technology in Windows Server 2016, which was improved in Windows Server 2019, known as Storage Spaces Direct (S2D). S2D was designed as a low-cost solution to support clusters without any requirement for shared storage. Instead, local DAS drives are synchronously replicated between cluster nodes to maintain a consistent state. This is certainly the easiest guest clustering solution to configure, however, Microsoft has announced some limitations in the current release (Microsoft’s documentation also includes a helpful video showing how to deploy an S2D cluster in Azure).

First, you are restricted to a 2-node or 3-node cluster only, and in either case you can only sustain the loss or outage of a single node. You also want to ensure that the disks have low latency and high performance, ideally using SSD drives or Azure’s Premium Storage managed disks. One of the major limitations remains around backups, as host-level virtual disk backups are currently not supported. If you deploy an S2D guest cluster, you are restricted to only taking backups from within the guest OS. Until this has been resolved and your backup vendor supports S2D, the safest option with the most flexibility will be to deploy a guest cluster using Shared VHDX / VHD Sets.

Summary

Microsoft is striving to improve guest clustering with each subsequent release. Unfortunately, this makes it challenging for third-party vendors to keep up with their support of the latest technology. It can be especially frustrating to admins when their preferred backup vendor has not yet added support for the latest version of Windows, so share feedback on what you need with your ISVs. It is always a best practice to select a vendor with close ties to Microsoft, as they get early access to code and always aim to support the latest and greatest technology. The leading backup companies like Altaro are staffed by Microsoft MVPs and regularly consult with former Microsoft engineers such as myself, to support the newest technologies as quickly as possible. But always make sure that you do your homework before you deploy any of these guest clusters so you can pick the best configuration which is supported by your backup and storage provider.


Author: Symon Perriman

How to Install Hyper-V PowerShell Module

The best combination of power and simplicity for controlling Hyper-V is its PowerShell module. The module’s installable component is distinct from the Hyper-V role, and the two are not automatically installed together. Even if you have installed the free Hyper-V Server product that ships with the Hyper-V role already enabled, you’ll still need to install the PowerShell module separately. This short guide will explain how to install that module and understand its basic structure. If you need to directly control Windows Server 2012/R2 or Hyper-V Server 2012/R2 using the PowerShell module as it ships in Windows 10/Windows Server 2016 or 2019, instructions are at the very end of this post.

How to Install the Hyper-V PowerShell Module with PowerShell

The quickest way to install the module is through PowerShell. There are several ways to do that, depending on your operating system and your goal.

Using PowerShell to Install the Hyper-V PowerShell Module in Windows 10

All of the PowerShell cmdlets for turning off and on Windows features and roles are overlays for the Deployment Image Servicing and Management (DISM) subsystem. Windows 10 does include a PowerShell module for DISM, but it uses a smaller cmdlet set than what you’ll find on your servers. The server module’s cmdlets are simpler, so I’m going to separate out the more involved cmdlets into the Windows 10 section. The cmdlets that I’m about to show you will work just as well on a server operating system as on Windows 10, although the exact names of the features that you’ll use might be somewhat different. All cmdlets must be run from an elevated PowerShell prompt.

As I mentioned in the preamble to this section, there are a few different ways that you can enable the Hyper-V PowerShell module. There is only a single cmdlet, and you will only need to use it to enable a single feature. However, the module appears in a few different features, so you’ll need to pick the one that is most appropriate to you. You can see all of the available options like this:
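
A minimal way to list the relevant optional features and their current state (filtering the full output rather than relying on wildcard support in the FeatureName parameter):

    # List every Hyper-V related optional feature and whether it is enabled
    Get-WindowsOptionalFeature -Online |
        Where-Object FeatureName -like "*Hyper-V*" |
        Select-Object FeatureName, State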

The reason that you see so many different objects is that it’s showing a flat display of the hierarchical tree that you’d get if you opened the Windows Features window instead. Unfortunately, this cmdlet does not have a nicely formatted display (even if you don’t pare it down with any filters), so it might not be obvious. Compare the output of the cmdlet to the Windows Features screen:

You have three options if you want to install the PowerShell Module on Windows 10. The simplest is to install only the module by using its feature name. Installing either of the two options above it (Hyper-V Management Tools or the entire Hyper-V section) will include the module. I trimmed off the feature name in the images above, so all three possibilities are shown below:
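
A sketch of those three possibilities, using the feature names as they appear on current Windows 10 builds; enabling any one of them installs the module:

    # Option 1: the PowerShell module only
    Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V-Management-PowerShell

    # Option 2: all of the Hyper-V management tools (includes the module)
    Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V-Tools-All

    # Option 3: the entire Hyper-V section, platform and tools
    Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V-All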

Tab completion will work for everything except the specific feature name. But, don’t forget that copy/paste works perfectly well in a PowerShell window (click/drag to highlight, [Enter] to copy, right-click to paste). You can use the output from Get-WindowsOptionalFeature so that you don’t need to type any feature names.

It’s fine to install a higher-level item even if some of its subcomponents are already installed. For example, if you enabled the Hyper-V platform but not the tools, you can enable Microsoft-Hyper-V-All and it will not hurt anything.

The  Enable-WindowsOptionalFeature cmdlet does not have a ComputerName parameter, but it can be used in explicit PowerShell Remoting.
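
As a hedged example of that explicit remoting approach (the computer name is a placeholder):

    # Enable the module on a remote Windows 10 machine via explicit remoting
    Invoke-Command -ComputerName win10-mgmt01 -ScriptBlock {
        Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V-Management-PowerShell
    }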

Using PowerShell to Install the Hyper-V PowerShell Module in Windows Server or Hyper-V Server 2012, 2016 & 2019

The DISM PowerShell tools on the server platforms are a bit cleaner to use than in Windows 10. If you’d like, the cmdlets shown in the Windows 10 section will work just as well on the servers (although the feature names are different). The cmdlets shown in this section will only work on server SKUs. They must be run from an elevated prompt.

Ordinarily, I don’t show cmdlets using positional parameters, but I wanted you to see how easy this cmdlet is to use. The full version of the shown cmdlet is Get-WindowsFeature -Name *hyper-v*. Its output looks much nicer than Get-WindowsOptionalFeature:

There is a difference, though. Under Windows 10, all the items live under the same root. In the server SKUs, Hyper-V is under the Roles branch but all of the tools are under the Features branch. The output indentation, when filtered, is misleading.

The Server SKUs have an Install-WindowsFeature cmdlet. Its behavior is similar to Enable-WindowsOptionalFeature, but it is not quite the same. Enabling the root Hyper-V feature will not automatically select all of the tools (as you might have already found out). These are all of the possible ways to install the Hyper-V PowerShell Module using PowerShell on a Server product:
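
For example, assuming (as the next paragraph implies) that the last two options also install the Hyper-V role itself, the possibilities look something like this:

    # Option 1: the PowerShell module only
    Install-WindowsFeature -Name Hyper-V-PowerShell

    # Option 2: the Hyper-V role plus the PowerShell module
    Install-WindowsFeature -Name Hyper-V, Hyper-V-PowerShell

    # Option 3: the Hyper-V role plus every management tool
    Install-WindowsFeature -Name Hyper-V -IncludeManagementTools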

Tab completion will work for everything except the specific feature name. But, don’t forget that copy/paste works perfectly well in a PowerShell window (click/drag to highlight, [Enter] to copy, right-click to paste). You can use the output from Get-WindowsFeature so that you don’t need to type any feature names.

If the Hyper-V role is already enabled, you can still use either of the last two options safely. If the Hyper-V role is not installed and you are using one of those options, the system will need to be restarted. If you like, you can include the -Restart parameter and DISM will automatically reboot the system as soon as the installation is complete.

The Install-WindowsFeature cmdlet does have a ComputerName parameter, so it can be used with implicit PowerShell Remoting to enable the feature on multiple computers simultaneously. For example, use -ComputerName svhv01, svhv02, svhv03, svhv04 to install the feature(s) on all four of the named hosts simultaneously. If you are running your PowerShell session from a Windows 10 machine that doesn’t have that cmdlet, you can still use explicit PowerShell Remoting.
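
A hedged sketch of that multi-host pattern, reusing the hypothetical host names from the paragraph above; the loop form works even if your version of the cmdlet only accepts a single computer name per call:

    # Install the Hyper-V PowerShell module on several servers
    'svhv01', 'svhv02', 'svhv03', 'svhv04' | ForEach-Object {
        Install-WindowsFeature -Name Hyper-V-PowerShell -ComputerName $_
    }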

How to Install the Hyper-V PowerShell Module Using the GUI

It seems a bit sacrilegious to install a PowerShell module using a GUI, and it certainly takes longer than using PowerShell, but I suppose someone has a reason.

Using the GUI to Install the Hyper-V PowerShell Module on Windows 10

Follow these steps in Windows 10:

  1. Right-click on the Start button and click Programs and Features.
  2. In the Programs and Features dialog, click Turn Windows features on or off
  3. In the Windows Features dialog, check the box for Hyper-V Module for Windows PowerShell (and anything else that you’d like) and click OK.
  4. The dialog will signal completion and the module will be installed.

Using Server Manager to Install the Hyper-V PowerShell Module on Windows Server or Hyper-V Server 2012, 2016 & 2019

Server Manager is the tool to use for graphically adding roles and features on Windows Server and Hyper-V Server systems. Of course, you’re not going to be able to directly open Server Manager on Hyper-V Server systems, but you can add a system running Hyper-V Server to the console of any same-level system running a GUI edition of Windows Server (security restrictions apply). The RSAT package for Windows 10 includes Server Manager and can remotely control servers (security restrictions apply there, as well). While Server Manager can be remotely connected to multiple systems, it can only install features on one host at a time.

To use Server Manager to enable Hyper-V’s PowerShell module, open the Add Roles and Features wizard and proceed through to the Features page. Navigate to Remote Server Administration Tools -> Role Administration Tools -> Hyper-V Management Tools and check Hyper-V Module for Windows PowerShell. Proceed through the wizard to complete the installation.

The module will be immediately available to use once the wizard completes.

A Brief Explanation of the Hyper-V PowerShell Module

Once installed, you can find the module’s files at C:\Windows\System32\WindowsPowerShell\v1.0\Modules\Hyper-V. Its location will ensure that the module is automatically loaded every time PowerShell starts up. That means that you don’t need to use Import-Module; you can start right away with your scripting.

If you browse through and look at the files a bit, you might notice that the PowerShell module files reference a .DLL. This means that this particular PowerShell module is a binary module. Microsoft wrote it in a .Net language and compiled it. Its cmdlets will run faster than they would in a text-based module, but you won’t be able to see how it does its work (at least, not by using any sanctioned methods).

Connecting to Windows/Hyper-V Server 2012, 2016 & 2019 from PowerShell in Windows 10/Server 2016 & 2019

If you are using Windows 10 or Windows/Hyper-V Server 2016 or 2019, there’s an all-new version 2.0.0 of the Hyper-V PowerShell module. That’s a good thing, right? Well, usually. The thing is, the 2012 and 2012 R2 versions aren’t going away any time soon, and we still need to control those. Version 2 of the PowerShell module will throw an error when you attempt to control these down-level systems. The good news is that you can work around this limitation fairly easily. If you browsed the folder tree on one of these newer OS releases, you may have noticed that there is a 1.1 folder as well as a 2.0.0 folder. The earlier binary module is still included!

So, does that mean that you can happily kick off some scripts on those “old” machines? Let’s see:
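
For example, pointing the version 2.0.0 cmdlets at a down-level host (SVHV2, the 2012 R2 host named in the error below) fails:

    # SVHV2 is a Hyper-V Server 2012 R2 host; this fails under module version 2.0.0
    Get-VM -ComputerName SVHV2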

The error is: “Get-VM : The Hyper-V module used in this Windows PowerShell session cannot be used for remote management of the server ‘SVHV2’. Load a compatible version of the Hyper-V module, or use Powershell remoting to connect directly to the remote server. For more information, see http://go.microsoft.com/fwlink/p/?LinkID=532650.”

What to do?

The answer lies in a new feature of PowerShell 5, which fortunately comes with these newer OSs. We will first get a look at what our options are:
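
For example, listing every installed version of the module:

    # Show every version of the Hyper-V module present on this machine
    Get-Module -Name Hyper-V -ListAvailable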

You could run this without ListAvailable to determine which, if any, version is already loaded. You already know that PowerShell auto-loads the module and, if you didn’t already know, I’m now informing you that it will always load the highest available version unless instructed otherwise. So, let’s use the new RequiredVersion parameter to instruct it otherwise:
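
A minimal sketch of swapping the loaded version, assuming the 1.1 folder is present as described above:

    # Unload version 2.0.0 if it is loaded, then load the down-level module
    Remove-Module -Name Hyper-V -ErrorAction SilentlyContinue
    Import-Module -Name Hyper-V -RequiredVersion 1.1

    # Confirm which version is now loaded, then retry the down-level host
    Get-Module -Name Hyper-V
    Get-VM -ComputerName SVHV2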

The result of this operation: the 1.1 module loads, and the cmdlets now work against the down-level host.

Is this good? Well, it’s OK, but not great. Popping a module in and out isn’t the worst thing in the world, but can you imagine scripting that to work against hosts of various levels? While possible, the experience would certainly be unpleasant. If you’re going to interactively control some down-level Hyper-V hosts, this approach would work well enough. For serious scripting work, I’d stick with the WMI/CIM cmdlets and explicit remoting.

If you have any questions about using the Hyper-V PowerShell module including installation, optimization or anything else, let me know in the comments below and I’ll help you out!

This blog was originally published in July 2017 but has been updated with corrections and new content to remain relevant as of March 2020.

Author: Eric Siron

The Complete Guide to Scale-Out File Server for Hyper-V

This article will help you understand how to plan, configure and optimize your SOFS infrastructure, primarily focused on Hyper-V scenarios.

Over the past decade, it seems that an increasing number of components are recommended when building a highly-available Hyper-V infrastructure. I remember my first day as a program manager at Microsoft when I was tasked with building my first Windows Server 2008 Failover Cluster. All I had to do was connect the hardware, configure shared storage, and pass Cluster Validation, which was fairly straightforward.


Figure 1 – A Failover Cluster with Traditional Cluster Disks

Nowadays, the recommended cluster configuration for Hyper-V virtual machines (VMs) adds additional management layers, such as Cluster Shared Volumes (CSV) disks and a clustered file server which hosts the file path used to access them, known as a Scale-Out File Server (SOFS). While the SOFS provides the fairly basic functionality of keeping a file share online, understanding this configuration can be challenging even for experienced Windows Server administrators. To see the complete stack which Microsoft recommends, scroll down to see the figures throughout this article. This may appear daunting, but do not worry, we’ll explain what all of these building blocks are for.

While there are management tools like System Center Virtual Machine Manager (SCVMM) that can automate the entire infrastructure deployment, most organizations need to configure these components independently. There is limited content online explaining how Scale-Out File Server clusters work and best practices for optimizing them. Let’s get into it!

Scale-Out File Server (SOFS) Capabilities & Limitations

A SOFS cluster should only be used for specific scenarios. The following features have been tested and are either supported, supported but not recommended, or not supported with the SOFS.

Supported SOFS scenarios

  • File Server
    • Deduplication – VDI Only
    • DFS Namespace (DFSN) – Folder Target Server Only
    • File System
    • SMB
      • Multichannel
      • Direct
      • Continuous Availability
      • Transparent Failover
  • Other Roles
    • Hyper-V
    • IIS Web Server
    • Remote Desktop (RDS) – User Profile Disks Only
    • SQL Server
    • System Center Virtual Machine Manager (VMM)

Supported, but not recommended SOFS scenarios

  • File Server
    • Folder Redirection
    • Home Directories
    • Offline Files
    • Roaming User Profiles

Unsupported SOFS scenarios

  • File Server
    • BranchCache
    • Deduplication – General Purpose
    • DFS Namespace (DFSN) – Root Server
    • DFS Replication (DFSR)
    • Dynamic Access Control (DAC)
    • File Server Resource Manager (FSRM)
    • File Classification Infrastructure (FCI)
    • Network File System (NFS)
    • Work Folders

Scale-Out File Server (SOFS) Benefits

Fundamentally, a Scale-Out File Server is a Failover Cluster running the File Server role. It keeps the file share path (\\ClusterStorage\Volume1) continually available so that it can always be accessed. This is critical because Hyper-V VMs use this file path to access their virtual hard disks (VHDs) via the SMB3 protocol. If this file path is unavailable, then the VMs cannot access their VHDs and cannot operate.

Additionally, it also provides the following benefits:

  • Deploy Multiple VMs on a Single Disk – SOFS allows multiple VMs running on different nodes to use the same CSV disk to access their VHDs.
  • Active / Active File Connections – All cluster nodes will host the SMB namespace so that a VM can connect or quickly reconnect to any active server and have access to its CSV disk.
  • Automatic Load Balancing of SOFS Clients – Since multiple VMs may be using the same CSV disk, the cluster will automatically distribute the connections. Clients are able to connect to the disk through any cluster node, so they are sent to the server with the fewest file share connections. By distributing the clients across different nodes, the network traffic and its processing overhead are spread out across the hardware, which should maximize performance and reduce bottlenecks.
  • Increased Storage Traffic Bandwidth – Using SOFS, the VMs will be spread across multiple nodes. This also means that the disk traffic will be distributed across multiple connections which maximizes the storage traffic throughput.
  • Anti-Affinity – If you are hosting similar roles on a cluster, such as two active/active file shares for a SOFS, these should be distributed across different hosts. Using the cluster’s anti-affinity property, these two roles will always try to run on different hosts, eliminating a single point of failure.
  • CSV Cache – SOFS files which are frequently accessed will be copied locally on each cluster node in a cache. This is helpful if the same type of VM file is read many times, such as in VDI scenarios.
  • CSV CHKDSK – CSV disks have been optimized to skip the offline phase, which means that they will come online faster after a crash. Faster recovery time is important for high-availability since it minimizes downtime.

Scale-Out File Server (SOFS) Cluster Architecture

This section will explain the design fundamentals of Scale-Out File Servers for Hyper-V. The SOFS can run on the same cluster as the Hyper-V VMs it is supporting, or on an independent cluster. If you are running everything on a single cluster, the SOFS must be deployed as a File Server role directly on the cluster; it cannot run inside a clustered VM since that VM won’t start without access to the File Server. This would cause a problem since neither the VM nor the virtualized File Server could start up, as they have a dependency on each other.

Hyper-V Storage and Failover Clustering

When Hyper-V was first introduced with Windows Server 2008 Failover Clustering, it had several limitations that have since been addressed. The main challenge was that each VM required its own cluster disk, which made the management of cluster storage complicated. Large clusters could require dozens or hundreds of disks, one for each virtual machine. This was sometimes not even possible due to limitations created by hardware vendors which required a unique drive letter for each disk. Technically you could run multiple VMs on the same cluster disk, each with their own virtual hard disks (VHDs). However, this configuration was not recommended, because if one VM crashed and had to fail over to a different node, it would force all the VMs using that disk to shut down and fail over to other nodes. This caused unplanned downtime, so as virtualization became more popular, a cluster-aware file system was created, known as Cluster Shared Volumes (CSV). See Figure 1 (above) for the basic architecture of a cluster using traditional cluster disks.

Cluster Shared Volume (CSV) Disks and Failover Clustering

CSV disks were introduced in Windows Server 2008 R2 as a distributed file system that is optimized for Hyper-V VMs. The disk must be visible to all cluster nodes, use NTFS or ReFS, and can be created from pools of disks using Storage Spaces.

The CSV disk is designed to host VHDs from multiple VMs from different nodes and run them simultaneously. The VMs can distribute themselves across the cluster nodes, balancing the hardware resources which they are consuming. A cluster can host multiple CSV disks and their VMs can freely move around the cluster, without any planned downtime. The CSV disk traffic communicates over standard networks using SMB, so traffic can be routed across different cluster communication paths for additional resiliency, without being restricted to use a SAN.

A Cluster Shared Volumes disk functions similarly to a file share hosting the VHD file since it provides storage and controls access. Virtual machines can access their VHDs like clients would access a file hosted in a file share, using a path like \\ClusterStorage\Volume1. This file path is identical on every cluster node, so as a VM moves between servers it will always be able to access its disk using the same file path. Figure 2 shows a Failover Cluster storing its VHDs on a CSV disk. Note that multiple VHDs for different VMs on different nodes can reside on the same disk, which they access through the SMB share.
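
As a minimal sketch, a clustered disk is converted into a CSV disk with the FailoverClusters module; the disk name below is a placeholder for whatever your disk is called under Available Storage:

    # Run on any cluster node; the disk then appears under C:\ClusterStorage on every node
    Add-ClusterSharedVolume -Name "Cluster Disk 1"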


Figure 2 – A Failover Cluster with a Cluster Shared Volumes (CSV) Disk

Scale-Out File Server (SOFS) and Failover Clustering

The SMB file share used for the CSV disk must be hosted by a Windows Server File Server. However, the file share should also be highly-available so that it does not become a single point of failure. A clustered File Server can be deployed as a SOFS through Failover Cluster Manager as described at the end of this article.

The SOFS will publish the VHD’s file share location (known as the “CSV Namespace”) on every node. This active/active configuration allows clients to be able to access their storage through multiple pathways. This provides additional resiliency and availability because if one node crashes, the VM will temporarily pause its transactions until it can quickly reconnect to the disk via another active node, but it remains online.

Since the SOFS runs on a standard Windows Server Failover Cluster, it must follow the hardware guidance provided by Microsoft. One of the fundamental rules of failover clustering is that all the hardware and software should be identical. This allows a VM or file server to operate the same way on any cluster node, as all the settings, file paths, and registry entries will be the same. Make sure you run the Cluster Validation tests and follow Altaro’s Cluster Validation troubleshooting guidance if you see any warnings or errors.
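
Validation can be run from PowerShell as well as from Failover Cluster Manager; a quick sketch with placeholder node names:

    # Run the full validation test suite against the intended (or existing) cluster nodes
    Test-Cluster -Node SOFS-Node1, SOFS-Node2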

The following figure shows a SOFS deployed in the same cluster. The clustered SMB shares create a highly-available CSV namespace allowing VMs to access their disk through multiple file paths.


Figure 3 – A Failover Cluster using Clustered SMB File Shares for CSV Disk Access

Storage Spaces Direct (S2D) with SOFS

Storage Spaces Direct (S2D) lets organizations deploy small failover clusters with no shared storage. S2D will generally use commodity servers with direct-attached storage (DAS) to create clusters that use mirroring to replicate their data between local disks to keep their states consistent. These S2D clusters can be deployed as Hyper-V hosts, storage hosts or in a converged configuration running both roles. The storage uses Scale-Out File Servers to host the shares for the VHD files.

In Figure 4, a SOFS cluster is shown which uses storage spaces direct, rather than shared storage, to host the CSV volumes and VHD files. Each CSV volume and its respective VHDs are mirrored between each of the local storage arrays.


Figure 4 – A Failover Cluster with Storage Spaces Direct (S2D)

Infrastructure Scale-Out File Server (SOFS)

Windows Server 2019 introduced a new Scale-Out File Server role called the Infrastructure File Server. This functions as the traditional SOFS, but it is specifically designed to only support Hyper-V virtual infrastructure with no other types of roles. There can also be only one Infrastructure SOFS per cluster.

The Infrastructure SOFS can be created manually via PowerShell or automatically when it is deployed by Azure Stack or System Center Virtual Machine Manager (SCVMM). This role will automatically create a CSV namespace share using the syntax \\InfraSOFSName\Volume1. Additionally, it will enable the Continuous Availability (CA) setting for the SMB shares, also known as SMB Transparent Failover.
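
For the manual route, a minimal sketch on Windows Server 2019 (the role name matches the example path above):

    # Create the Infrastructure File Server role on the current cluster
    Add-ClusterScaleOutFileServerRole -Infrastructure -Name "InfraSOFSName"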


Figure 5 – Infrastructure File Server Role on a Windows Server 2019 Failover Cluster

Cluster Sets

Windows Server 2019 Failover Clustering introduced the management concept of cluster sets. A cluster set is a collection of failover clusters which can be managed as a single logical entity. It allows VMs to seamlessly move between clusters, which lets organizations create a highly-available infrastructure with almost limitless capacity. To simplify the management of the cluster sets, a single namespace can be used to access the cluster set. This namespace can run on a SOFS for continual availability, and clients will automatically get redirected to the appropriate location within the cluster set.

The following figure shows two Failover Clusters within a cluster set, both of which are using a SOFS. Additionally, a third independent SOFS is deployed to provide highly-available access to the cluster set itself.


Figure 6 – A Scale-Out File Server with Cluster Sets

Guest Clustering with SOFS

Acquiring dedicated physical hardware is not required for the SOFS as this can be fully virtualized. When a cluster runs inside of VMs instead of physical hardware, this is known as guest clustering. However, you should not run a SOFS within a VM for which it is providing the namespace, as the cluster can get into a situation where it cannot start the VM because it cannot access the VM’s own VHD.

Microsoft Azure with SOFS

Microsoft Azure allows you to deploy virtualized guest clusters in the public cloud. You will need at least 2 storage accounts, each with a matching number and size of disks. It is recommended to use at least DS-series VMs with premium storage. Since this cluster is already running in Azure, it can also use a cloud witness for its quorum.

You can even download an Azure VM template which comes as a pre-configured two-node Windows Server 2016 Storage Spaces Direct (S2D) Scale-Out File Server (SOFS) cluster.

System Center Virtual Machine Manager (VMM) with SOFS

Since the Scale-Out File Server has become an important role in virtualized infrastructures, System Center Virtual Machine Manager (VMM) has tightly integrated it into their fabric management capabilities.

Deployment

VMM makes it fairly easy to deploy SOFS throughout your infrastructure on bare-metal or Hyper-V hosts. You can add existing file servers under management or deploy each SOFS throughout your fabric. For more information, see Microsoft’s VMM documentation.

When VMM is used to create a cluster set, an Infrastructure SOFS is automatically created on the Management Server (if it does not already exist). This file share will host the single shared namespace used by the cluster set.

Configuration

Many of the foundational components of a Scale-Out File Server can be deployed and managed by VMM. This includes the ability to use physical disks to create storage pools that can host SOFS file shares. The SOFS file shares themselves can also be created through VMM. If you are also using Storage Spaces Direct (S2D) then you will need to create a disk witness which will use the SOFS to host the file share. Quality of Service (QoS) can also be adjusted to control network traffic speed to resources or VHDs running on the SOFS shares.

Management Cluster

In large virtualized environments, it is recommended to have a dedicated management cluster for System Center VMM. The virtualization management console, database, and services are highly available so that they can continually monitor the environment. The management cluster can use a unified storage namespace running on a Scale-Out File Server, granting additional resiliency to the storage and its clients.

Library Share

VMM uses a library to store files which may be deployed multiple times, such as VHDs or image files. The library uses an SMB file share as a common namespace to access those resources, which can be made highly-available using a SOFS. The data in the library itself cannot be stored on a SOFS, but rather on a traditional clustered file server.

Update Management

Cluster patch management is one of the most tedious tasks which administrators face as it is repetitive and time-consuming. VMM has automated this process through serially updating one node at a time while keeping the other workloads online. SOFS clusters can be automatically patched using VMM.

Rolling Upgrade

Rolling upgrades refers to the process where infrastructure servers are gradually updated to the latest version of Windows Server. Most of the infrastructure servers managed by VMM can be included in the rolling upgrade cycle which functions like the Update Management feature. Different nodes in the SOFS cluster are sequentially placed into maintenance mode (so the workloads are drained), updated, patched, tested and reconnected to the cluster. Workloads will gradually migrate to the newly installed nodes while the older nodes wait to be updated. Gradually all the SOFS cluster nodes are updated to the latest version of Windows Server.

Internet Information Services (IIS) Web Server with SOFS

Everything in this article so far has referenced SOFS in the context of being used for Hyper-V VMs. SOFS is gradually being adopted by other infrastructure services to provide high-availability to their critical components which use SMB file shares.

The Internet Information Services (IIS) Web Server is used for hosting websites. To distribute the network traffic, multiple IIS servers are usually deployed. If they have any shared configuration information or data, this can be stored in the Scale-Out File Server.

Remote Desktop Services (RDS) with SOFS

The Remote Desktop Services (RDS) role has a popular feature known as user profile disks (UPDs) which allows users to have a dedicated data disk stored on a file server. The file share path can be placed on a SOFS to make access to that share highly-available.

SQL Server with SOFS

Certain SQL Server roles have been able to use SOFS to make their SMB connections highly available. Starting with SQL Server 2012, the SMB file server storage option is offered for SQL Server databases (including Master, MSDB, Model and TempDB) and the database engine. The SQL Server itself can be standalone or deployed as a failover cluster installation (FCI).

Deploying a SOFS Cluster & Next Steps

Now that you understand the planning considerations, you are ready to deploy the SOFS. From Failover Cluster Manager, you will launch the High Availability Wizard and select the File Server role. Next, you will select the File Server Type. Traditional clustered file servers will use the File Server for general use. For SOFS, select Scale-Out File Server for application data.

The interface is shown in the following figure and described as, “Use this option to provide storage for server applications or virtual machines that leave files open for extended periods of time. Scale-Out File Server client connections are distributed across nodes in the cluster for better throughput. This option supports the SMB protocol. It does not support the NFS protocol, Data Deduplication, DFS Replication, or File Server Resource Manager.”


Figure 7 – Installing a Scale-Out File Server (SOFS)
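
If you prefer PowerShell to the wizard, a rough equivalent looks like this; the role name, share name, path, and the account granted access are placeholders for your environment:

    # Create the Scale-Out File Server role on the cluster
    Add-ClusterScaleOutFileServerRole -Name "SOFS01"

    # Publish a continuously available share on a CSV volume, scoped to the SOFS name
    New-SmbShare -Name "VMs" -Path "C:\ClusterStorage\Volume1" -ScopeName "SOFS01" `
        -FullAccess "DOMAIN\Hyper-V-Hosts" -ContinuouslyAvailable $true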

Now you should have a fundamental understanding of the use and deployment options for the SOFS. For additional information about deploying a Scale-Out File Server (SOFS), please visit https://docs.microsoft.com/en-us/windows-server/failover-clustering/sofs-overview. If there’s anything you want to ask about SOFS, let me know in the comments below and I’ll get back to you!

Author: Symon Perriman

NTFS vs. ReFS – How to Decide Which to Use

By now, you’ve likely heard of Microsoft’s relatively recent file system “ReFS”. Introduced with Windows Server 2012, it seeks to exceed NTFS in stability and scalability. Since we typically store the VHDXs for multiple virtual machines in the same volume, it seems as though Hyper-V would pair well with ReFS. Unfortunately, it did not… in the beginning. Microsoft has continued to improve ReFS in the intervening years. It has gained several features that distanced it from NTFS. With its maturation, should you start using it for Hyper-V? You have much to consider before making that determination.

What is ReFS?

The moniker “ReFS” means “resilient file system”. It includes built-in features to aid against data corruption. Microsoft’s docs site provides a detailed explanation of ReFS and its features. A brief recap:

  • Integrity streams: ReFS uses checksums to check for file corruption.
  • Automatic repair: When ReFS detects problems in a file, it will automatically enact corrective action.
  • Performance improvements: In a few particular conditions, ReFS provides performance benefits over NTFS.
  • Very large volume and file support: ReFS’s upper limits exceed NTFS’s without incurring the same performance hits.
  • Mirror-accelerated parity: Mirror-accelerated parity uses a lot of raw storage space, but it’s very fast and very resilient.
  • Integration with Storage Spaces: Many of ReFS’s features only work to their fullest in conjunction with Storage Spaces.

Before you get excited about some of the earlier points, I need to emphasize one thing: except for capacity limits, ReFS requires Storage Spaces in order to do its best work.

ReFS Benefits for Hyper-V

ReFS has features that accelerate some virtual machine activities.

  • Block cloning: By my reading, block cloning is essentially a form of de-duplication. But, it doesn’t operate as a file system filter or scanner. It doesn’t passively wait for arbitrary data writes or periodically scan the file system for duplicates. Something must actively invoke it against a specific file. Microsoft specifically indicates that it can greatly speed checkpoint merges.
  • Sparse VDL (valid data length): All file systems record the amount of space allocated to a file. ReFS uses VDL to indicate how much of that file has data. So, when you instruct Hyper-V to create a new fixed VHDX on ReFS, it can create the entire file in about the same amount of time as creating a dynamically-expanding VHDX. It will similarly benefit expansion operations on dynamically-expanding VHDXs.

Take a little bit of time to go over these features. Think through their total applications.

ReFS vs. NTFS for Hyper-V: Technical Comparison

With the general explanation out of the way, now you can make a better assessment of the differences. First, check the comparison tables on Microsoft’s ReFS overview page. For typical Hyper-V deployments, most of the differences mean very little. For instance, you probably don’t need quotas on your Hyper-V storage locations. Let’s make a table of our own, scoped more appropriately for Hyper-V:

  • ReFS wins: Really large storage locations and really large VHDXs
  • ReFS wins: Environments with excessively high incidences of created, checkpointed, or merged VHDXs
  • ReFS wins: Storage Space and Storage Spaces Direct deployments
  • NTFS wins: Single-volume deployments
  • NTFS wins (potentially): Mixed-purpose deployments

I think most of these things speak for themselves. The last two probably need a bit more explanation.

Single-Volume Deployments Require NTFS

In this context, I intend “single-volume deployment” to mean installations where you have Hyper-V (including its management operating system) and all VMs on the same volume. You cannot format a boot volume with ReFS, nor can you place a page file on ReFS. Such an installation also does not allow for Storage Spaces or Storage Spaces Direct, so it would miss out on most of ReFS’s capabilities anyway.

Mixed-Purpose Deployments Might Require NTFS

Some of us have the luck to deploy nothing but virtual machines on dedicated storage locations. Not everyone has that. If your Hyper-V storage volume also hosts files for other purposes, you might need to continue with NTFS. Go over the last table near the bottom of the overview page. It shows the properties that you can only find in NTFS. For standard file sharing scenarios, you lose quotas. You may have legacy applications that require NTFS’s extended properties, or short names. In these situations, only NTFS will do.

Note: If you have any alternative, do not use the same host to run non-Hyper-V roles alongside Hyper-V. Microsoft does not support mixing. Similarly, separate Hyper-V VMs onto volumes apart from volumes that hold other file types.

Unexpected ReFS Behavior

The official content goes to some lengths to describe the benefits of ReFS’s integrity streams. It uses checksums to detect file corruption. If it finds problems, it engages in corrective action. On a Storage Spaces volume that uses protective schemes, it has an opportunity to fix the problem. It does that with the volume online, providing a seamless experience. But, what happens when ReFS can’t correct the problem? That’s where you need to pay real attention.

On the overview page, the documentation uses exceptionally vague wording: “ReFS removes the corrupt data from the namespace”. The integrity streams page does worse: “If the attempt is unsuccessful, ReFS will return an error.” While researching this article, I was told of a more troubling activity: ReFS deletes files that it deems unfixable. The comment section at the bottom of that page includes a corroborating report. If you follow that comment thread through, you’ll find an entry from a Microsoft program manager that states:

ReFS deletes files in two scenarios:

  1. ReFS detects Metadata corruption AND there is no way to fix it. Meaning ReFS is not on a Storage Spaces redundant volume where it can fix the corrupted copy.
  2. ReFS detects data corruption AND Integrity Stream is enabled AND there is no way to fix it. Meaning if Integrity Stream is not enabled, the file will be accessible whether data is corrupted or not. If ReFS is running on a mirrored volume using Storage Spaces, the corrupted copy will be automatically fixed.

The upshot: If ReFS decides that a VHDX has sustained unrecoverable damage, it will delete it. It will not ask, nor will it give you any opportunity to try to salvage what you can. If ReFS isn’t backed by Storage Spaces’s redundancy, then it has no way to perform a repair. So, from one perspective, that makes ReFS on non-Storage Spaces look like a very high risk approach. But…

Mind Your Backups!

You should not overlook the severity of the previous section. However, you should not let it scare you away, either. I certainly understand that you might prefer a partially readable VHDX to a deleted one. To that end, you could simply disable integrity streams on your VMs’ files. I also have another suggestion.

Do not neglect your backups! If ReFS deletes a file, retrieve it from backup. If a VHDX goes corrupt on NTFS, retrieve it from backup. With ReFS, at least you know that you have a problem. With NTFS, problems can lurk much longer. No matter your configuration, the only thing you can depend on to protect your data is a solid backup solution.

When to Choose NTFS for Hyper-V

You now have enough information to make an informed decision. These conditions indicate a good condition for NTFS:

  • Configurations that do not use Storage Spaces, such as single-disk or manufacturer RAID. This alone does not make an airtight point; please read the “Mind Your Backups!” section above.
  • Single-volume systems (your host only has a C: volume)
  • Mixed-purpose systems (please reconfigure to separate roles)
  • Storage on hosts older than 2016 — ReFS was not as mature on previous versions. This alone is not an airtight point.
  • Your backup application vendor does not support ReFS
  • If you’re uncertain about ReFS

As time goes on, NTFS will lose favorability over ReFS in Hyper-V deployments. But, that does not mean that NTFS has reached its end. ReFS has staggeringly higher limits, but very few systems use more than a fraction of what NTFS can offer. ReFS does have impressive resilience features, but NTFS also has self-healing powers and you have access to RAID technologies to defend against data corruption.

Microsoft will continue to develop ReFS. They may eventually position it as NTFS’s successor. As of today, they have not done so. It doesn’t look like they’ll do it tomorrow, either. Do not feel pressured to move to ReFS ahead of your comfort level.

When to Choose ReFS for Hyper-V

Some situations make ReFS the clear choice for storing Hyper-V data:

  • Storage Spaces (and Storage Spaces Direct) environments
  • Extremely large volumes
  • Extremely large VHDXs

You might make an additional performance-based argument for ReFS in an environment with a very high churn of VHDX files. However, do not overestimate the impact of those performance enhancements. The most striking difference appears when you create fixed VHDXs. For all other operations, you need to upgrade your hardware to achieve meaningful improvement.

However, I do not want to gloss over the benefit of ReFS for very large volumes. If you have a storage volume of a few terabytes and VHDXs of even a few hundred gigabytes, then ReFS will rarely beat NTFS significantly. When you start thinking in terms of hundreds of terabytes, NTFS will likely show bottlenecks. If you need to push higher, then ReFS becomes your only choice.

ReFS really shines when you combine it with Storage Spaces Direct. Its ability to automatically perform a non-disruptive online repair is truly impressive. On the one hand, the odds of disruptive data corruption on modern systems constitute a statistical anomaly. On the other, no one that has suffered through such an event really cares how unlikely it was.

ReFS vs NTFS on Hyper-V Guest File Systems

All of the above deals only with Hyper-V’s storage of virtual machines. What about ReFS in guest operating systems?

To answer that question, we need to go back to ReFS’s strengths. So far, we’ve only thought about it in terms of Hyper-V. Guests have their own conditions and needs. Let’s start by reviewing Microsoft’s ReFS overview. Specifically the following:

“Microsoft has developed NTFS specifically for general-purpose use with a wide range of configurations and workloads, however for customers specially requiring the availability, resiliency, and/or scale that ReFS provides, Microsoft supports ReFS for use under the following configurations and scenarios…”

I added emphasis on the part that I want you to consider. The sentence itself makes you think that they’ll go on to list some usages, but they only list one: “backup target”. The other items on their list only talk about the storage configuration. So, we need to dig back into the sentence and pull out those three descriptors to help us decide: “availability”, “resiliency”, and “scale”. You can toss out the first two right away — you should not focus on storage availability and resiliency inside a VM. That leaves us with “scale”. So, really big volumes and really big files. Remember, that means hundreds of terabytes and up.

For a more accurate decision, read through the feature comparisons. If any application that you want to use inside a guest needs features only found on NTFS, use NTFS. Personally, I still use NTFS inside guests almost exclusively. ReFS needs Storage Spaces to do its best work, and Storage Spaces does its best work at the physical layer.

Combining ReFS with NTFS across Hyper-V Host and Guests

Keep in mind that the file system inside a guest has no bearing on the host’s file system, and vice versa. As far as Hyper-V knows, VHDXs attached to virtual machines are nothing other than a bundle of data blocks. You can use any combination that works.

Author: Eric Siron

What Exactly are Proximity Placement Groups in Azure?

In this blog post, you’ll learn all about Azure Proximity Groups, why they are necessary and how to use them.

What are Azure Proximity Placement Groups?

Microsoft defines Azure Proximity Placement groups as an Azure Virtual Machine logical grouping capability that you can use to decrease the inter-VM network latency associated with your applications (Microsoft announcement blog post). But what does that actually mean?

When you look at VM placement in Azure and the reduction of latency between VMs, you can place VMs in the same region and the same Availability Zone. With that, they are in the same group of physical datacenters. To be honest, with the growing Azure footprint, these datacenters can still be a few kilometers away from each other.

That may impact the latency of your application, and applications that need extremely low inter-VM latency will be affected the most. Examples include banking applications for low-latency trading or financial stock operations.

Proximity Placement Groups place your compute resources as near to each other as possible to achieve the lowest latency possible. The following scenarios are eligible for Proximity Placement Groups:

  • Low latency between stand-alone VMs.
  • Low Latency between VMs in a single availability set or a virtual machine scale set.
  • Low latency between stand-alone VMs, VMs in multiple Availability Sets, or multiple scale sets. You can have multiple compute resources in a single placement group to bring together a multi-tiered application.
  • Low latency between multiple application tiers using different hardware types. For example, running the backend using M-series in an availability set and the front end on a D-series instance, in a scale set, in a single proximity placement group.

All VMs must be in a single VNet, as shown in the drawing below:

Virtual Network Scale Set and Availability Set

I wouldn’t suggest single VMs for production workloads on Azure. Always use a cluster within an Availability Set or a VM Scale Set.

What does that look like in an Azure datacenter environment?

The following drawing shows the placement of a VM without Proximity Groups:

Placement of a VM without Proximity Groups

With Proximity Groups for a single VM, it could look like the following:

Proximity Groups for a single VM

When you use availability sets for your VMs, the distribution can look like the following.

Distribution availability sets for VMs

With that said, let’s learn how to set up a Proximity Placement Group.

How to set up a Proximity Placement Group

Setting up a Proximity Placement Group is pretty easy.

Look for Proximity Placement Groups in the Portal:

Proximity Placement Groups in the Portal

Add a new group:

Create a new Proximity Placement Group

Select the Subscription, Resource Group, Region and the name, then create the group:

Proximity Placement Group Settings

When you now create a VM, you can select the Proximity Placement Group on the Advanced tab:

Proximity Placement Group advanced settings

There is also the option to use PowerShell to deploy Proximity Groups.
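
A minimal sketch with the Az PowerShell module; the resource group, name, and region are placeholders, and you need to be signed in first (Connect-AzAccount):

    # Create a proximity placement group
    $ppg = New-AzProximityPlacementGroup `
        -ResourceGroupName "rg-ppg-demo" `
        -Name "ppg-demo" `
        -Location "westeurope" `
        -ProximityPlacementGroupType Standard

    # Newer Az.Compute releases let the VM creation cmdlets reference the group by its
    # resource Id; check your module version for the exact parameter name.
    $ppg.Id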

Conclusion

The information in this blog post explains Proximity Groups and the ways to use them but if you’re stuck or if there’s something you need further explanation about, let me know in the comments below and I’ll get back to you!

Author: Florian Klaffenbach

What Exactly is Azure Dedicated Host?

In this blog post, we’ll become more familiar with a new Azure service called Azure Dedicated Host. Microsoft announced the service in preview some time ago and will make it generally available in the near future.

Microsoft Azure Dedicated Host allows customers to run their virtual machines on a dedicated host not shared with other customers. While in a regular virtual machine scenario different customers or tenants share the same hosts, with Dedicated Host a customer no longer shares the hardware. The picture below illustrates the setup.

Azure Dedicated Hosts

With Dedicated Host, Microsoft wants to address customer concerns regarding compliance, security, and regulations, which can come up when running on a shared physical server. In the past, the only way to get a dedicated host in Azure was to use very large instances, such as the D64s v3 VM size. These instances were so large that they consumed an entire host, so the placement of other VMs was not possible.

To be honest, with the improvements in machine placement, larger hosts, and with that a much better density, there was no longer a 100% guarantee that such a host was still dedicated. These instances are also extremely expensive, as you can see in the screenshot from the Azure Price Calculator.

Azure price calculator

How to Setup a Dedicated Host in Azure

The setup of a dedicated host is pretty easy. First, you need to create a host group with your preferences for availability, such as Availability Zones and the number of fault domains. You also need to decide on a host region, group name, etc.

How To Setup A Dedicated Host In Azure

After you have created the host group, you can create a host within the group. In the current preview, only the Ds3 and Es3 VM families are available to choose from. Microsoft will add more options soon.

Create dedicated host
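
For reference, a rough PowerShell sketch of the same steps (assuming the Az.Compute cmdlets New-AzHostGroup and New-AzHost; resource names, region, fault domain values, and the SKU are placeholders, and the SKUs on offer depend on what is available in your region):

# Create the host group with two fault domains (placeholder names and region)
New-AzHostGroup -ResourceGroupName 'rg-dedicated' -Name 'hg-demo' -Location 'eastus' -PlatformFaultDomain 2

# Create a dedicated host inside the group using a Ds3-family SKU
New-AzHost -ResourceGroupName 'rg-dedicated' -HostGroupName 'hg-demo' -Name 'host01' -Location 'eastus' -Sku 'DSv3-Type1' -PlatformFaultDomain 1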

More Details About Pricing

As you can see in the screenshot, Microsoft added the option to use Azure Hybrid Use Benefits for Dedicated Host. That means you can use your on-prem Windows Server and SQL Server licenses with Software Assurance to reduce your costs in Azure.

Azure Hybrid Use Benefits pricing

Azure Dedicated Host also gives you more insight into the host, such as:

  • The underlying hardware infrastructure (host type)
  • Processor brand, capabilities, and more
  • Number of cores
  • Type and size of the Azure Virtual Machines you want to deploy

An Azure customer can control all host-level platform maintenance initiated by Azure, such as OS updates. Azure Dedicated Host gives you the option of a maintenance window of up to 35 days in which these updates are applied to your host system. During this self-maintenance window, customers can apply maintenance to their hosts at their own convenience.

Looking a bit deeper into the service, Azure becomes more like a traditional hosting provider that gives customers a very dynamic platform.

The following screenshot shows the current pricing for a Dedicated Host.

Azure Dedicated Host pricing details

The following virtual machine types can run on a dedicated host.

Virtual Machines on a Dedicated Host

Currently, there is a soft limit of 3,000 vCPUs for dedicated hosts per region. That limit can be raised by submitting a support ticket.

When Would I Use A Dedicated Host?

In most cases, you would choose a dedicated host for compliance reasons: you may not want to share a host with other customers. Another reason could be that you want a guaranteed CPU architecture and type. If you place your VMs on the same host, it is guaranteed that they will all run on the same architecture.

Further Reading

Microsoft has already published a lot of documentation and blog posts about the topic, so you can deepen your knowledge of Dedicated Host.

Resource #1: Announcement Blog and FAQ 

Resource #2: Product Page 

Resource #3: Introduction Video – Azure Friday “An introduction to Azure Dedicated Hosts | Azure Friday”

Author: Florian Klaffenbach

Microsoft Azure Peering Services Explained

In this blog post, you’ll discover everything you need to know about Microsoft Azure Peering Services, a networking service introduced during Ignite 2019.

Microsoft explains the service within their documentation as follows:

Azure Peering Service is a networking service that enhances customer connectivity to Microsoft cloud services such as Office 365, Dynamics 365, software as a service (SaaS) services, Azure, or any Microsoft services accessible via the public internet. Microsoft has partnered with internet service providers (ISPs), internet exchange partners (IXPs), and software-defined cloud interconnect (SDCI) providers worldwide to provide reliable and high-performing public connectivity with optimal routing from the customer to the Microsoft network.

To be honest, Microsoft explained the service well, but what’s behind the explanation is much more complex. To understand Azure Peering Services and its benefits, you need to understand how peering, routing, and connectivity for internet providers work.

What Are Peering And Transit?

In the internet and network provider world, peering is an interconnection of separate and independent internet networks to exchange traffic between users within their respective networks. Peering, or partnering, is normally a settlement-free agreement between two providers: both providers only pay for their cross-connect in the datacenter and their colocation space, and neither party pays for the traffic itself. There are, however, special agreements, e.g. between smaller and larger providers.

Normally you have the following agreements:

  • between equal providers or peering partners – traffic upload and download between these two networks is free for both parties
  • a larger provider and a smaller provider – the smaller provider needs to pay a fee for the transit traffic to the larger network provider
  • providers who transit another network to reach a 3rd party network (upstream service) – the provider using the upstream needs to pay a fee for the transit traffic to the upstream provider

An agreement by two or more networks to peer is instantiated by a physical interconnection of the networks, an exchange of routing information through the Border Gateway Protocol (BGP) and, in some special cases, a formalized contractual document. These documents are called peering policies and Letters of Authorization (LOA).

Fun Fact – As a peering partner for Microsoft, you can easily configure the peering through the Azure Portal as a free service.

As you can see in the screenshot, Microsoft is very restrictive with its routing and peering policies. That prevents unwanted traffic and protects Microsoft customers when peering for Azure ExpressRoute (AS12076).

Routing and peering policies Azure express route.

Now let’s talk a bit about the different types of peering.

Public Peering

Public peering is configured over the shared platform of an Internet Exchange Point (IXP). Internet Exchanges charge a port and/or membership fee for using their platform for interconnection.

If you are a small cloud or network provider with limited infrastructure, peering via an Internet Exchange is a good place to start. For a big player in the market, it is also a good choice because you reach smaller networks over a short path. The picture below shows an example of those prices, taken from the Berlin Commercial Internet Exchange pricing page.

Berlin Commercial Internet Exchange Pricing

Hurricane Electric offers a tool that can give you a peering map and more information about how a provider is publicly peered with other providers, but you will not see private peerings there. The picture below shows some examples for Microsoft AS 8075.

Microsoft AS 8075 peering

Private Peering

Private peering is a direct physical link between two networks, commonly one or more 10GbE or 100GbE links. The connection is made from only one network to another, and each side pays a set fee to the owner of the infrastructure or colocation that is used; those costs are usually the cross-connects within the datacenter. That makes private peering a good choice when you need to send large volumes of traffic to one specific network, and it is a much cheaper option than public peering when you look at the price per transferred gigabyte between the two networks. When peering privately with providers, you may need to follow some peering policies, though.

A good provider also has a looking glass where you can get more insights into peerings, but we will look at this later on.

Transit and Upstream

When someone is using transit, the provider itself has no access to the destination network. Therefore it needs to leverage other networks or network providers to reach the destination network and destination service. The providers who offer the transit are known as transit providers, with the largest networks considered Tier 1 networks. As a network provider for cloud customers like Microsoft, you don’t want any transit routing. First, you normally have high costs for transitive routing through other networks, and worse, you add additional latency and an uncontrollable space between your customers and the cloud services. So, the first rule when handling cloud customers is to avoid transit routing and to peer with cloud providers yourself, either through private or public network interconnects at interconnect locations.

That is one reason why Microsoft is working with Internet Exchanges and network and internet providers to enable services like Microsoft Azure Peering Service. It should give customers more control over how they reach Microsoft services, including Azure, Microsoft 365, Xbox, etc. To understand the impact, you also need to know how service providers route traffic, which is how we will follow up in the next part of the post.

How Do Internet Service Providers Route Your Traffic?

When you look at routing, there are mostly only two options within a carrier network. The first one is cold potato, or centralized, routing. With cold potato routing, a provider keeps the traffic as long as possible within its own network before handing it to another third party. The other option is hot potato, or decentralized, routing. Here the provider hands the traffic to the third party as fast as possible, mostly in the same metro.

The picture below illustrates the difference between hot and cold potato routing.

cold and hot potato routing differences

As you can see in the drawing, cold potato routing takes a longer path through the provider network and therefore a longer path to your target, e.g. Microsoft.

Those routing configurations have a large impact on your cloud performance because every kilometer of distance adds latency: roughly 1 ms of latency is added for every 200 kilometers. As a result, you will see an impact such as reduced voice quality during Teams meetings or synchronization issues for backups to Azure.

Microsoft has a big agenda to address that issue for its customers and the rest of the globe. You can read more about the plans in articles from Yousef Khalidi, Corporate Vice President, Microsoft Networking.

Now let’s start with Peering Services and how it can change the game.

What Is Azure Peering Services and How Does It Solve the Issue?

When you look at how the service is designed, you can see that it leverages all of Microsoft’s provider peering with AS 8075. Together with the Microsoft Azure Peering Services partners, Microsoft can change the default routing and transit behavior towards their services when you use a partner provider.

Following the picture below, you can set up routing so that traffic from your network to Azure (or other networks) now uses the Microsoft global backbone instead of a transit provider without any SLA.

What is Azure Peering Services

With the service enabled, performance to Microsoft services will increase and latency will be reduced, depending on the provider. As you would expect, services like Office 365 or Azure AD will profit from this Azure service, but there is more. When you, for example, build your backbone on the Microsoft global transit architecture with Azure Virtual WAN and leverage internet connections from these providers and Internet Exchange partners, you will directly boost your network performance and get a pseudo-private network. The reason is that you now leverage private or public peering with route restrictions, so your backbone traffic bypasses the regular internet and flows through the Microsoft global backbone from A to B.

Let me try to explain it with a drawing.

Microsoft global backbone network

In addition to better performance, you will also get an additional layer of monitoring. While the regular internet is a black box regarding dataflow, performance, etc., with Microsoft Azure Peering Services you get full operational monitoring of your wide area network through the Microsoft backbone.

You can find this information in the Azure Peering Services Telemetry Data.

The screenshot below shows the launch partner of Azure Peering Services.

Launch partner of Azure Peering Services

When choosing a network provider for your access to Microsoft, you should follow this guideline:

  • Choose a provider well peered with Microsoft
  • Choose a provider with hot potato routing to Microsoft
  • Don’t let the price alone decide the provider; a good network has its costs
  • Choose Dedicated Internet Access before regular Internet Connection any time possible
  • If possible, use local providers instead of global ones
  • A good provider always has a looking glass or can provide you with default routes between a city location and other peering partners. If not, it is not a good provider to choose

So, let’s learn about the setup of the service.

How to configure Azure Peering Services?

First, you need to understand that, as with Azure ExpressRoute, there are two sides to contact and configure.

You need to follow the steps below to establish a Peering Services connection.

Step 1: The customer provisions connectivity from a connectivity partner (no interaction with Microsoft). With that, you get an internet provider who is well connected to Microsoft and meets the technical requirements for performant and reliable connectivity to Microsoft. Again, you should check the partner list.
Step 2: The customer registers locations in the Azure portal. A location is defined by the ISP/IXP name, the physical location of the customer site (state level), and the IP prefix given to the location by the service provider or the enterprise. As a service from Microsoft, you then get telemetry data such as internet route monitoring and traffic prioritization from Microsoft to the user’s closest edge location.

The registration of the locations happens within the Azure Portal.

Currently, you need to register for the public beta first. That happens with some simple PowerShell commands.

Using Azure PowerShell 
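
For example, the preview registration with Azure PowerShell is roughly the following (a sketch; the Microsoft.Peering namespace and the AllowPeeringService feature name are taken from the preview documentation and should be verified against the current docs):

# Register the Peering resource provider and the preview feature in your subscription
Register-AzResourceProvider -ProviderNamespace Microsoft.Peering
Register-AzProviderFeature -FeatureName AllowPeeringService -ProviderNamespace Microsoft.Peering

# Check the registration state; it can take a few minutes to change to 'Registered'
Get-AzProviderFeature -FeatureName AllowPeeringService -ProviderNamespace Microsoft.Peering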

Using Azure CLI

Afterward, you can configure the service using the Azure Portal, Azure PowerShell, or Azure CLI.

You can find the responsive guide here.

Once the service becomes generally available (GA), customers will also receive SLAs on the Peering and Telemetry service. Currently, there is no SLA and no support if you use the service in production.

Peering and Telemetry service

Closing Thoughts

From reading this article, you should now have a better understanding of Microsoft Azure Peering Services and its use, of peering between providers, and of routing and traffic behavior within the internet. When digging deeper into Microsoft Peering Services, you should now be able to develop some architectures and ideas on how to use the service.

If you have any providers which are not aware of the service or of direct peering with Microsoft AS 8075, point them to http://peering.azurewebsites.net/ or let them drop an email to [email protected]

When using the BGP tools from Hurricane Electric, you can get information about some of the providers peering with Microsoft. One thing you need to know: most of Microsoft’s roughly 3,500 network partners peer privately with Microsoft, and the Hurricane Electric tools only observe the public peering partners.

Author: Florian Klaffenbach

How to Use ASR and Hyper-V Replica with Failover Clusters

In the third and final post of this blog series, we will evaluate Microsoft’s replication solutions for multi-site clusters and how to integrate basic backup/DR with them. This includes Hyper-V Replica, Azure Site Recovery, and DFS Replication. In the first part of the series, you learned about setting up failover clusters to work with DR solutions and in the last post, you learned about disk replication considerations from third-party storage vendors. The challenge with the solutions that we previously discussed is that they typically require third-party hardware or software. Let’s look at the basic technologies provided by Microsoft to reduce these upfront fixed costs.

Note: The features talked about in this article are native Microsoft features with a baseline level of functionality. Should you require more than what is described here, you should look at a third-party backup/replication product such as Altaro VM Backup.

Multi-Site Disaster Recovery with Windows Server DFS Replication (DFSR)

DFS Replication (DFSR) is a Windows Server role service that has been around for many releases. Although DFSR is built into Windows Server and is easy to configure, it is not supported for multi-site clustering. This is because the replication of files only happens when a file is closed, so it works great for file servers hosting documents. However, it is not designed to work with application workloads where the file is kept open, such as SQL databases or Hyper-V VMs. Since these file types will only close during a planned failover or unplanned crash, it is hard to keep the data consistent at both sites. This means that if your first site crashes, the data will not be available at the second site, so DFSR should not be considered as a possible solution.

Multi-Site Disaster Recovery with Hyper-V Replica

The most popular Microsoft DR solution is Hyper-V Replica, which is a built-in Hyper-V feature available to Windows Server customers at no additional cost. It copies the virtual hard disk (VHD) file of a running virtual machine from one host to a second host in a different location. This is an excellent low-cost solution to replicate your data between your primary and secondary sites, and it even allows you to do extended (“chained”) replication to a third location. However, it is limited in that it only replicates Hyper-V virtual machines (VMs), so it cannot be used for any other application unless it is virtualized and running inside a VM. The way it works is that any changes to the VHD file are tracked by a log file, which is copied to an offline VM/VHD in the secondary site. Replication is asynchronous, with copies sent every 30 seconds, 5 minutes, or 15 minutes. While this means that there is no distance limitation between the sites, there could be some data loss if any in-memory data has not been written to the disk or if there is a crash between replication cycles.

Two Clusters Replicate Data between Sites with Hyper-V Replica

Figure 1 – Two Clusters Replicate Data between Sites with Hyper-V Replica

Hyper-V Replica allows for replication between standalone Hyper-V hosts or between separate clusters, or any combination.  This means that instead of stretching a single cluster across two sites, you will set up two independent clusters. This also allows for a more affordable solution by letting businesses set up a cluster in their primary site and a single host in their secondary site that will be used only for mission-critical applications. If the Hyper-V Replica is deployed on a failover cluster, a new clustered workload type is created, known as the Hyper-V Replica Broker. This basically makes the replication service highly-available, so that if a node crashes, the replication engine will failover to a different node and continue to copy logs to the secondary site, providing greater resiliency.
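
To make this concrete, enabling replication for a single VM with the built-in Hyper-V cmdlets looks roughly like the sketch below (server names, the storage path, VM name, and port are placeholders; when both sites are clustered, the Replica Broker’s client access point name is used as the replica server):

# On the replica side: allow incoming replication over Kerberos/HTTP (placeholder storage path)
Set-VMReplicationServer -ReplicationEnabled $true -AllowedAuthenticationType Kerberos -ReplicationAllowedFromAnyServer $true -DefaultStorageLocation 'D:\Replica'

# On the primary side: enable replication for one VM with a 5-minute (300-second) frequency
Enable-VMReplication -VMName 'SQL01' -ReplicaServerName 'replica-broker.contoso.com' -ReplicaServerPort 80 -AuthenticationType Kerberos -ReplicationFrequencySec 300

# Send the initial copy over the network
Start-VMInitialReplication -VMName 'SQL01'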

Another powerful feature of Hyper-V Replica is its built-in testing, allowing you to simulate both planned and unplanned failovers to the secondary site. While this solution will meet the needs of most virtualized datacenters, it is also important to remember that there are no integrity checks on the data being copied between the VMs. This means that if a VM becomes corrupted or is infected with a virus, that same fault will be sent to its replica. For this reason, backups of the virtual machine are still a critical part of standard operating procedure. Additionally, this Altaro blog notes that Hyper-V Replica has other limitations compared to backups when it comes to retention, file space management, keeping separate copies, using multiple storage locations, and replication frequency, and it may have a higher total cost of ownership. If you are using a multi-site DR solution with two clusters, make sure that you are taking and storing backups in both sites so that you can recover your data at either location. Also make sure that your backup provider supports clusters, CSV disks, and Hyper-V Replica; however, this is now standard in the industry.

Multi-Site Disaster Recovery with Azure Site Recovery (ASR)

All of the aforementioned solutions require you to have a second datacenter, which simply is not possible for some businesses. While you could rent rack space from a colocation facility, the economics just may not make sense. Fortunately, the Microsoft Azure public cloud can now be used as your disaster recovery site using Azure Site Recovery (ASR). This technology works with Hyper-V Replica, but instead of copying your VMs to a secondary site, you are pushing them to a nearby Microsoft datacenter. This technology still has the same limitations as Hyper-V Replica, including the replication frequency, and furthermore you do not have access to the physical infrastructure of your DR site in Azure. The replicated VM can run on the native Azure infrastructure, or you can even build a virtualized guest cluster and replicate to this highly-available infrastructure.

While ASR is a significantly cheaper solution than maintaining your own hardware in the secondary site, it is not free. You have to pay for the service, the storage of your virtual hard disks (VHDs) in the cloud, and if you turn on any of those VMs, you will pay for standard Azure VM operating costs.

If you are using ASR, you should follow the same backup best practices as mentioned in the earlier Hyper-V Replica section. The main difference is that you should use an Azure-native backup solution to protect your replicated VHDs in Azure, in case you fail over to the Azure VMs for any extended period of time.

Conclusion

From reviewing this blog series, you should be equipped to make the right decisions when planning your disaster recovery solution using multi-site clustering. Start by understanding your site restrictions, and from there you can plan your hardware needs and storage replication solution. There are a variety of options, with tradeoffs ranging from higher-priced solutions with more features to cost-effective solutions using Microsoft Azure that offer limited control. Even after you have deployed this resilient infrastructure, keep in mind that there are still three main reasons why disaster recovery plans fail:

  • The detection of the outage failed, so the failover to the secondary datacenter never happens.
  • One component in the DR failover process does not work, which is usually due to poor or infrequent testing.
  • There was no automation, or the process depended on humans, who create a bottleneck and are unreliable during a disaster.

This means that whichever solution you choose, make sure that it is well tested with quick failure detection and try to eliminate all dependencies on humans! Good luck with your deployment and please post any questions that you have in the comments section of this blog.


Author: Symon Perriman

Free Script – Convert Legacy Teamed Hyper-V vSwitch to SET

<#

.SYNOPSIS

Converts LBFO+Virtual Switch combinations to switch-embedded teams.

.DESCRIPTION

Converts LBFO+Virtual Switch combinations to switch-embedded teams.

Performs the following steps:

1. Saves information about virtual switches and management OS vNICs (includes IPs, QoS settings, jumbo frame info, etc.)

2. If system belongs to a cluster, sets to maintenance mode

3. Disconnects attached virtual machine vNICs

4. Deletes the virtual switch

5. Deletes the LBFO team

6. Creates switch-embedded team

7. Recreates management OS vNICs

8. Reconnects previously-attached virtual machine vNICs

9. If system belongs to a cluster, ends maintenance mode

If you do not specify any overriding parameters, the new switch uses the same settings as the original LBFO+team.

.PARAMETER Id

The unique identifier(s) for the virtual switch(es) to convert.

.PARAMETER Name

The name(s) of the virtual switch(es) to convert.

.PARAMETER VMSwitch

The virtual switch(es) to convert.

.PARAMETER NewName

Name(s) to assign to the converted virtual switch(es). If blank, keeps the original name.

.PARAMETER UseDefaults

If specified, uses defaults for all values on the converted switch(es). If not specified, uses the same parameters as the original LBFO+switch or any manually-specified parameters.

.PARAMETER LoadBalancingAlgorithm

Sets the load balancing algorithm for the converted switch(es). If not specified, uses the same setting as the original LBFO+switch or the default if UseDefaults is set.

.PARAMETER MinimumBandwidthMode

Sets the desired QoS mode for the converted switch(es). If not specified, uses the same setting as the original LBFO+switch or the default if UseDefaults is set.

None: No network QoS

Absolute: minimum bandwidth values specify bits per second

Weight: minimum bandwidth values range from 1 to 100 and represent percentages

Default: use system default

WARNING: Changing the QoS mode may cause guest vNICS to fail to re-attach and may inhibit Live Migration. Use carefully if you have special QoS settings on guest virtual NICs.

.PARAMETER Notes

A note to associate with the converted switch(es). If not specified, uses the same setting as the original LBFO+switch or the default if UseDefaults is set.

.PARAMETER Force

If specified, bypasses confirmation.

.NOTES

Author: Eric Siron

Version 1.0, December 22, 2019

Released under MIT license

.EXAMPLE

ConvertTo-SwitchEmbeddedTeam

Converts all existing LBFO+switch combinations to switch embedded teams. Copies settings from original switches and management OS virtual NICs to new switch and vNICs.

.EXAMPLE

ConvertTo-SwitchEmbeddedTeam -Name vSwitch

Converts the LBFO+switch combination of the virtual switch named “vSwitch” to a switch embedded teams. Copies settings from original switch and management OS virtual NICs to new switch and vNICs.

.EXAMPLE

ConvertTo-SwitchEmbeddedTeam -Force

Converts all existing LBFO+team combinations without prompting.

.EXAMPLE

ConvertTo-SwitchEmbeddedTeam -NewName NewSET

If the system has one LBFO+switch, converts it to a switch-embedded team with the name “NewSET”.

If the system has multiple LBFO+switch combinations, fails due to mismatch (see next example).

.EXAMPLE

ConvertTo-SwitchEmbeddedTeam -NewName NewSET1, NewSET2

If the system has two LBFO+switches, converts them to switch-embedded team with the name “NewSET1” and “NEWSET2”, IN THE ORDER THAT GET-VMSWITCH RETRIEVES THEM.

.EXAMPLE

ConvertTo-SwitchEmbeddedTeam OldSwitch1, OldSwitch2 -NewName NewSET1, NewSET2

Converts the LBFO+switches named “OldSwitch1” and “OldSwitch2” to SETs named “NewSET1” and “NewSET2”, respectively.

.EXAMPLE

ConvertTo-SwitchEmbeddedTeam -UseDefaults

Converts all existing LBFO+switch combinations to switch embedded teams. Discards non-default settings for the switch and Hyper-V-related management OS vNICs. Keeps IP addresses and advanced settings (ex. jumbo frames).

.EXAMPLE

ConvertTo-SwitchEmbeddedTeam -MinimumBandwidthMode Weight

Converts all existing LBFO+switch combinations to switch embedded teams. Forces the new SET to use “Weight” for its minimum bandwidth mode.

WARNING: Changing the QoS mode may cause guest vNICS to fail to re-attach and may inhibit Live Migration. Use carefully if you have special QoS settings on guest virtual NICs.

.LINK

https://ejsiron.github.io/Posher-V/ConvertTo-SwitchEmbeddedTeam

#>

#Requires -RunAsAdministrator

#Requires -Module Hyper-V

#Requires -Version 5

[CmdletBinding(DefaultParameterSetName = ‘ByName’, ConfirmImpact = ‘High’)]

param(

[Parameter(Position = 1, ParameterSetName = ‘ByName’)][String[]]$Name = @(),

[Parameter(Position = 1, ParameterSetName = ‘ByID’, Mandatory = $true)][System.Guid[]]$Id,

[Parameter(Position = 1, ParameterSetName = ‘BySwitchObject’, Mandatory = $true)][Microsoft.HyperV.PowerShell.VMSwitch[]]$VMSwitch,

[Parameter(Position = 2)][String[]]$NewName = @(),

[Parameter()][Switch]$UseDefaults,

[Parameter()][Microsoft.HyperV.PowerShell.VMSwitchLoadBalancingAlgorithm]$LoadBalancingAlgorithm,

[Parameter()][Microsoft.HyperV.PowerShell.VMSwitchBandwidthMode]$MinimumBandwidthMode,

[Parameter()][String]$Notes = '',

[Parameter()][Switch]$Force

)

BEGIN

{

Set-StrictMode -Version Latest

$ErrorActionPreference = [System.Management.Automation.ActionPreference]::Stop

$IsClustered = $false

if(Get-CimInstance -Namespace root -ClassName __NAMESPACE -Filter ‘Name=”MSCluster”‘)

{

$IsClustered = [bool](Get-CimInstance -Namespace root/MSCluster -ClassName mscluster_cluster -ErrorAction SilentlyContinue)

$ClusterNode = Get-CimInstance -Namespace root/MSCluster -ClassName MSCluster_Node -Filter (‘Name=”{0}”‘ -f $env:COMPUTERNAME)

}

function Get-CimAdapterConfigFromVirtualAdapter

{

param(

[Parameter()][psobject]$VNIC

)

$VnicCim = Get-CimInstance -Namespace root/virtualization/v2 -ClassName Msvm_InternalEthernetPort -Filter (‘Name=”{0}”‘ -f $VNIC.AdapterId)

$VnicLanEndpoint1 = Get-CimAssociatedInstance -InputObject $VnicCim -ResultClassName Msvm_LANEndpoint

$NetAdapter = Get-CimInstance -ClassName Win32_NetworkAdapter -Filter (‘GUID=”{0}”‘ -f $VnicLANEndpoint1.Name.Substring(($VnicLANEndpoint1.Name.IndexOf(‘{‘))))

Get-CimAssociatedInstance -InputObject $NetAdapter -ResultClassName Win32_NetworkAdapterConfiguration

}

function Get-AdvancedSettingsFromAdapterConfig

{

param(

[Parameter()][psobject]$AdapterConfig

)

$MSFTAdapter = Get-CimInstance -Namespace root/StandardCimv2 -ClassName MSFT_NetAdapter -Filter (‘InterfaceIndex={0}’ -f $AdapterConfig.InterfaceIndex)

Get-CimAssociatedInstance -InputObject $MSFTAdapter -ResultClassName MSFT_NetAdapterAdvancedPropertySettingData

}

class NetAdapterDataPack

{

[System.String]$Name

[System.String]$MacAddress

[System.Int64]$MinimumBandwidthAbsolute = 0

[System.Int64]$MinimumBandwidthWeight = 0

[System.Int64]$MaximumBandwidth = 0

[System.Int32]$VlanId = 0

[Microsoft.Management.Infrastructure.CimInstance]$NetAdapterConfiguration

[Microsoft.Management.Infrastructure.CimInstance[]]$AdvancedProperties

[Microsoft.Management.Infrastructure.CimInstance[]]$IPAddresses

[Microsoft.Management.Infrastructure.CimInstance[]]$Gateways

NetAdapterDataPack([psobject]$VNIC)

{

$this.Name = $VNIC.Name

$this.MacAddress = $VNIC.MacAddress

if ($VNIC.BandwidthSetting -ne $null)

{

$this.MinimumBandwidthAbsolute = $VNIC.BandwidthSetting.MinimumBandwidthAbsolute

$this.MinimumBandwidthWeight = $VNIC.BandwidthSetting.MinimumBandwidthWeight

$this.MaximumBandwidth = $VNIC.BandwidthSetting.MaximumBandwidth

}

$this.VlanId = [System.Int32](Get-VMNetworkAdapterVlan -VMNetworkAdapter $VNIC).AccessVlanId

$this.NetAdapterConfiguration = Get-CimAdapterConfigFromVirtualAdapter -VNIC $VNIC

$this.AdvancedProperties = @(Get-AdvancedSettingsFromAdapterConfig -AdapterConfig $this.NetAdapterConfiguration  | Where-Object -FilterScript { (-not [String]::IsNullOrEmpty($_.DefaultRegistryValue)) -and (-not [String]::IsNullOrEmpty([string]($_.RegistryValue))) -and (-not [String]::IsNullOrEmpty($_.DisplayName)) -and ($_.RegistryValue[0] -ne $_.DefaultRegistryValue) })

# alternative to the below: use Get-NetIPAddress and Get-NetRoute, but they treat empty results as errors

$this.IPAddresses = @(Get-CimInstance -Namespace root/StandardCimv2 -ClassName MSFT_NetIPAddress -Filter (‘InterfaceIndex={0} AND PrefixOrigin=1’ -f $this.NetAdapterConfiguration.InterfaceIndex))

$this.Gateways = @(Get-CimInstance -Namespace root/StandardCimv2 -ClassName MSFT_NetRoute -Filter (‘InterfaceIndex={0} AND Protocol=3’ -f $this.NetAdapterConfiguration.InterfaceIndex)) # documentation says Protocol=2 for NetMgmt, testing shows otherwise

}

}

class SwitchDataPack

{

[System.String]$Name

[Microsoft.HyperV.PowerShell.VMSwitchBandwidthMode]$BandwidthReservationMode

[System.UInt64]$DefaultFlow

[System.String]$TeamName

[System.String[]]$TeamMembers

[System.UInt32]$LoadBalancingAlgorithm

[NetAdapterDataPack[]]$HostVNICs

SwitchDataPack(

[psobject]$VSwitch,

[Microsoft.Management.Infrastructure.CimInstance]$Team,

[System.Object[]]$VNICs

)

{

$this.Name = $VSwitch.Name

$this.BandwidthReservationMode = $VSwitch.BandwidthReservationMode

switch ($this.BandwidthReservationMode)

{

([Microsoft.HyperV.PowerShell.VMSwitchBandwidthMode]::Absolute) { $this.DefaultFlow = $VSwitch.DefaultFlowMinimumBandwidthAbsolute }

([Microsoft.HyperV.PowerShell.VMSwitchBandwidthMode]::Weight) { $this.DefaultFlow = $VSwitch.DefaultFlowMinimumBandwidthWeight }

default { $this.DefaultFlow = 0 }

}

$this.TeamName = $Team.Name

$this.TeamMembers = ((Get-CimAssociatedInstance -InputObject $Team -ResultClassName MSFT_NetLbfoTeamMember).Name)

$this.LoadBalancingAlgorithm = $Team.LoadBalancingAlgorithm

$this.HostVNICs = $VNICs

}

}

function Set-CimAdapterProperty

{

param(

[Parameter()][System.Object]$InputObject,

[Parameter()][System.String]$MethodName,

[Parameter()][System.Object]$Arguments,

[Parameter()][System.String]$Activity,

[Parameter()][System.String]$Url

)

Write-Verbose -Message $Activity

$CimResult = Invoke-CimMethod -InputObject $InputObject -MethodName $MethodName -Arguments $Arguments -ErrorAction Continue

if ($CimResult -and $CimResult.ReturnValue -gt 0 )

{

Write-Warning -Message (‘CIM error from operation: {0}. Consult {1} for error code {2}’ -f $Activity, $Url, $CimResult.ReturnValue) -WarningAction Continue

}

}

}

PROCESS

{

$VMSwitches = New-Object System.Collections.ArrayList

$SwitchRebuildData = New-Object System.Collections.ArrayList

switch ($PSCmdlet.ParameterSetName)

{

‘ByID’

{

$VMSwitches.AddRange($Id.ForEach( { Get-VMSwitch -Id $_ -ErrorAction SilentlyContinue }))

}

‘BySwitchObject’

{

$VMSwitches.AddRange($VMSwitch.ForEach( { $_ }))

}

default # ByName

{

$NameList = New-Object System.Collections.ArrayList

$NameList.AddRange($Name.ForEach( { $_.Trim() }))

if ($NameList.Count -eq 0 -or $NameList.Contains('') -or $NameList.Contains('*'))

{

$VMSwitches.AddRange(@(Get-VMSwitch -ErrorAction SilentlyContinue))

}

else

{

$VMSwitches.AddRange($NameList.ForEach( { Get-VMSwitch -Name $_ -ErrorAction SilentlyContinue }))

}

}

}

if ($VMSwitches.Count)

{

$VMSwitches = @(Select-Object -InputObject $VMSwitches -Unique)

}

else

{

throw(‘No virtual switches match the provided criteria’)

}

Write-Progress -Activity ‘Pre-flight’ -Status ‘Verifying operating system version’ -PercentComplete 5 -Id 1

Write-Verbose -Message ‘Verifying operating system version’

$OSVersion = [System.Version]::Parse((Get-CimInstance -ClassName Win32_OperatingSystem).Version)

if ($OSVersion.Major -lt 10)

{

throw(‘Switch-embedded teams not supported on host operating system versions before 2016’)

}

Write-Progress -Activity ‘Pre-flight’ -Status ‘Loading virtual VMswitches’ -PercentComplete 15 -Id 1

if ($NewName.Count -gt 0 -and $NewName.Count -ne $VMSwitches.Count)

{

$SwitchNameMismatchMessage = ‘Switch count ({0}) does not match NewName count ({1}).’ -f $VMSwitches.Count, $NewName.Count

if ($NewName.Count -lt $VMSwitches.Count)

{

$SwitchNameMismatchMessage += ‘ If you wish to rename some VMswitches but not others, specify an empty string for the VMswitches to leave.’

}

throw($SwitchNameMismatchMessage)

}

Write-Progress -Activity ‘Pre-flight’ -Status ‘Validating virtual switch configurations’ -PercentComplete 25 -Id 1

Write-Verbose -Message ‘Validating virtual switches’

foreach ($VSwitch in $VMSwitches)

{

try

{

Write-Progress -Activity (‘Validating virtual switch “{0}”‘ -f $VSwitch.Name) -Status ‘Switch is external’ -PercentComplete 25 -ParentId 1

Write-Verbose -Message (‘Verifying that switch “{0}” is external’ -f $VSwitch.Name)

if ($VSwitch.SwitchType -ne [Microsoft.HyperV.PowerShell.VMSwitchType]::External)

{

Write-Warning -Message (‘Switch “{0}” is not external, skipping’ -f $VSwitch.Name)

continue

}

Write-Progress -Activity (‘Validating virtual switch “{0}”‘ -f $VSwitch.Name) -Status ‘Switch is not a SET’ -PercentComplete 50 -ParentId 1

Write-Verbose -Message (‘Verifying that switch “{0}” is not already a SET’ -f $VSwitch.Name)

if ($VSwitch.EmbeddedTeamingEnabled)

{

Write-Warning -Message (‘Switch “{0}” already uses SET, skipping’ -f $VSwitch.Name)

continue

}

Write-Progress -Activity (‘Validating virtual switch “{0}”‘ -f $VSwitch.Name) -Status ‘Switch uses LBFO’ -PercentComplete 75 -ParentId 1

Write-Verbose -Message (‘Verifying that switch “{0}” uses an LBFO team’ -f $VSwitch.Name)

$TeamAdapter = Get-CimInstance -Namespace root/StandardCimv2 -ClassName MSFT_NetLbfoTeamNic -Filter (‘InterfaceDescription=”{0}”‘ -f $VSwitch.NetAdapterInterfaceDescription)

if ($TeamAdapter -eq $null)

{

Write-Warning -Message (‘Switch “{0}” does not use a team, skipping’ -f $VSwitch.Name)

continue

}

if ($TeamAdapter.VlanID)

{

Write-Warning -Message (‘Switch “{0}” is bound to a team NIC with a VLAN assignment, skipping’ -f $VSwitch.Name)

continue

}

}

catch

{

Write-Warning -Message (‘Switch “{0}” failed validation, skipping. Error: {1}’ -f $VSwitch.Name, $_.Exception.Message)

continue

}

finally

{

Write-Progress -Activity (‘Validating virtual switch “{0}”‘ -f $VSwitch.Name) -Completed -ParentId 1

}

Write-Progress -Activity (‘Loading information from virtual switch “{0}”‘ -f $VSwitch.Name) -Status ‘Team NIC’ -PercentComplete 25 -ParentId 1

Write-Verbose -Message ‘Loading team’

$Team = Get-CimAssociatedInstance -InputObject $TeamAdapter -ResultClassName MSFT_NetLbfoTeam

Write-Progress -Activity (‘Loading information from virtual switch “{0}”‘ -f $VSwitch.Name) -Status ‘Host virtual adapters’ -PercentComplete 50 -ParentId 1

Write-Verbose -Message ‘Loading management adapters connected to this switch’

$HostVNICs = Get-VMNetworkAdapter -ManagementOS -SwitchName $VSwitch.Name

Write-Verbose -Message ‘Compiling virtual switch and management OS virtual NIC information’

Write-Progress -Activity (‘Loading information from virtual switch “{0}”‘ -f $VSwitch.Name) -Status ‘Storing vSwitch data’ -PercentComplete 75 -ParentId 1

$OutNull = $SwitchRebuildData.Add([SwitchDataPack]::new($VSwitch, $Team, ($HostVNICs.ForEach({ [NetAdapterDataPack]::new($_) }))))

Write-Progress -Activity (‘Loading information from virtual switch “{0}”‘ -f $VSwitch.Name) -Completed

}

Write-Progress -Activity ‘Pre-flight’ -Status ‘Cleaning up’ -PercentComplete 99 -ParentId 1

Write-Verbose -Message ‘Clearing loop variables’

$VSwitch = $Team = $TeamAdapter = $HostVNICs = $null

Write-Progress -Activity ‘Pre-flight’ -Completed

if($SwitchRebuildData.Count -eq 0)

{

Write-Warning -Message ‘No eligible virtual switches found.’

exit 1

}

$SwitchMark = 0

$SwitchCounter = 1

$SwitchStep = 1 / $SwitchRebuildData.Count * 100

$ClusterNodeRunning = $IsClustered

foreach ($OldSwitchData in $SwitchRebuildData)

{

$SwitchName = $OldSwitchData.Name

if($NewName.Count -gt 0)

{

$SwitchName = $NewName[($SwitchCounter - 1)]

}

Write-Progress -Activity ‘Rebuilding switches’ -Status (‘Processing virtual switch {0} ({1}/{2})’ -f $SwitchName, $SwitchCounter, $SwitchRebuildData.Count) -PercentComplete $SwitchMark -Id 1

$SwitchCounter++

$SwitchMark += $SwitchStep

$ShouldProcessTargetText = ‘Virtual switch {0}’ -f $OldSwitchData.Name

$ShouldProcessOperation = ‘Disconnect all virtual adapters, remove team and switch, build switch-embedded team, replace management OS vNICs, reconnect virtual adapters’

if ($Force -or $PSCmdlet.ShouldProcess($ShouldProcessTargetText , $ShouldProcessOperation))

{

if($ClusterNodeRunning)

{

Write-Verbose -Message ‘Draining cluster node’

Write-Progress -Activity ‘Draining cluster node’ -Status ‘Draining’

$OutNull = Invoke-CimMethod -InputObject $ClusterNode -MethodName 'Pause' -Arguments @{DrainType = 2; TargetNode = ''}

while($ClusterNodeRunning)

{

Start-Sleep -Seconds 1

$ClusterNode = Get-CimInstance -InputObject $ClusterNode

switch($ClusterNode.NodeDrainStatus)

{

0 { Write-Error -Message ‘Failed to initiate cluster node drain’ }

2 { $ClusterNodeRunning = $false }

3 { Write-Error -Message ‘Failed to drain cluster roles’ }

# 1 is all that’s left, will cause loop to continue

}

}

}

Write-Progress -Activity ‘Draining cluster node’ -Completed

$SwitchProgressParams = @{Activity = (‘Processing switch {0}’ -f $OldSwitchData.Name); ParentId = 1; Id=2 }

Write-Verbose -Message ‘Disconnecting virtual machine adapters’

Write-Progress @SwitchProgressParams -Status ‘Disconnecting virtual machine adapters’ -PercentComplete 10

Write-Verbose -Message ‘Loading VM adapters connected to this switch’

$GuestVNICs = Get-VMNetworkAdapter -VMName * | Where-Object -Property SwitchName -EQ $OldSwitchData.Name

if($GuestVNICs)

{

Disconnect-VMNetworkAdapter -VMNetworkAdapter $GuestVNICs

}

Start-Sleep -Milliseconds 250 # seems to prefer a bit of rest time between removal commands

if($OldSwitchData.HostVNICs)

{

Write-Verbose -Message ‘Removing management vNICs’

Write-Progress @SwitchProgressParams -Status ‘Removing management vNICs’ -PercentComplete 20

Get-VMNetworkAdapter -ManagementOS -SwitchName $OldSwitchData.Name | Remove-VMNetworkAdapter

}

Start-Sleep -Milliseconds 250 # seems to prefer a bit of rest time between removal commands

Write-Verbose -Message ‘Removing virtual switch’

Write-Progress @SwitchProgressParams -Status ‘Removing virtual switch’ -PercentComplete 30

Remove-VMSwitch -Name $OldSwitchData.Name -Force

Start-Sleep -Milliseconds 250 # seems to prefer a bit of rest time between removal commands

Write-Verbose -Message ‘Removing team’

Write-Progress @SwitchProgressParams -Status ‘Removing team’ -PercentComplete 40

Remove-NetLbfoTeam -Name $OldSwitchData.TeamName -Confirm:$false

Start-Sleep -Milliseconds 250 # seems to prefer a bit of rest time between removal commands

Write-Verbose -Message ‘Creating SET’

Write-Progress @SwitchProgressParams -Status ‘Creating SET’ -PercentComplete 50

$SetLoadBalancingAlgorithm = $null

if (-not $UseDefaults)

{

if ($OldSwitchData.LoadBalancingAlgorithm -eq 5)

{

$SetLoadBalancingAlgorithm = [Microsoft.HyperV.PowerShell.VMSwitchLoadBalancingAlgorithm]::Dynamic # 5 is dynamic; https://docs.microsoft.com/en-us/previous-versions/windows/desktop/ndisimplatcimprov/msft-netlbfoteam

}

else # SET does not have LBFO’s hash options for load-balancing; assume that the original switch used a non-Dynamic mode for a reason

{

$SetLoadBalancingAlgorithm = [Microsoft.HyperV.PowerShell.VMSwitchLoadBalancingAlgorithm]::HyperVPort

}

}

if ($LoadBalancingAlgorithm)

{

$SetLoadBalancingAlgorithm = $LoadBalancingAlgorithm

}

$NewMinimumBandwidthMode = $null

if(-not $UseDefaults)

{

$NewMinimumBandwidthMode = $OldSwitchData.BandwidthReservationMode

}

if ($MinimumBandwidthMode)

{

$NewMinimumBandwidthMode = $MinimumBandwidthMode

}

$NewSwitchParams = @{NetAdapterName=$OldSwitchData.TeamMembers}

if($NewMinimumBandwidthMode)

{

$NewSwitchParams.Add(‘MinimumBandwidthMode’, $NewMinimumBandwidthMode)

}

try

{

$NewSwitch = New-VMSwitch @NewSwitchParams -Name $SwitchName -AllowManagementOS $false -EnableEmbeddedTeaming $true -Notes $Notes

}

catch

{

Write-Error -Message (‘Unable to create virtual switch {0}: {1}’ -f $SwitchName, $_.Exception.Message) -ErrorAction Continue

continue

}

if($SetLoadBalancingAlgorithm)

{

Write-Verbose -Message (‘Setting load balancing mode to {0}’ -f $SetLoadBalancingAlgorithm)

Write-Progress @SwitchProgressParams -Status ‘Setting SET load balancing algorithm’ -PercentComplete 60

Set-VMSwitchTeam -Name $NewSwitch.Name -LoadBalancingAlgorithm $SetLoadBalancingAlgorithm

}

$VNICCounter = 0

foreach($VNIC in $OldSwitchData.HostVNICs)

{

$VNICCounter++

Write-Progress @SwitchProgressParams -Status (‘Configuring management OS vNIC {0}/{1}’ -f $VNICCounter, $OldSwitchData.HostVNICs.Count) -PercentComplete 70

$VNICProgressParams = @{Activity = (‘Processing VNIC {0}’ -f $VNIC.Name); ParentId = 2; Id=3 }

Write-Verbose -Message (‘Adding virtual adapter “{0}” to switch “{1}”‘ -f $VNIC.Name, $NewSwitch.Name)

Write-Progress @VNICProgressParams -Status ‘Adding vNIC’ -PercentComplete 10

$NewNic = Add-VMNetworkAdapter -SwitchName $NewSwitch.Name -ManagementOS -Name $VNIC.Name -StaticMacAddress $VNIC.MacAddress -Passthru

$SetNicParams = @{ }

if ((-not $UseDefaults) -and $VNIC.MinimumBandwidthAbsolute -and $NewSwitch.BandwidthReservationMode -eq [Microsoft.HyperV.PowerShell.VMSwitchBandwidthMode]::Absolute)

{

$SetNicParams.Add(‘MinimumBandwidthAbsolute’, $VNIC.MinimumBandwidthAbsolute)

}

elseif ((-not $UseDefaults) -and $VNIC.MinimumBandwidthWeight -and $NewSwitch.BandwidthReservationMode -eq [Microsoft.HyperV.PowerShell.VMSwitchBandwidthMode]::Weight)

{

$SetNicParams.Add(‘MinimumBandwidthWeight’, $VNIC.MinimumBandwidthWeight)

}

if ($VNIC.MaximumBandwidth)

{

$SetNicParams.Add(‘MaximumBandwidth’, $VNIC.MaximumBandwidth)

}

Write-Verbose -Message (‘Setting properties on virtual adapter “{0}” on switch “{1}”‘ -f $VNIC.Name, $NewSwitch.Name)

Write-Progress @VNICProgressParams -Status ‘Setting vNIC parameters’ -PercentComplete 20

Set-VMNetworkAdapter -VMNetworkAdapter $NewNic @SetNicParams -ErrorAction Continue

if($VNIC.VlanId)

{

Write-Progress @VNICProgressParams -Status ‘Setting VLAN ID’ -PercentComplete 30

Write-Verbose -Message (‘Setting VLAN ID on virtual adapter “{0}” on switch “{1}”‘ -f $VNIC.Name, $NewSwitch.Name)

Set-VMNetworkAdapterVlan -VMNetworkAdapter $NewNic -Access -VlanId $VNIC.VlanId

}

$NewNicConfig = Get-CimAdapterConfigFromVirtualAdapter -VNIC $NewNic

if ($VNIC.IPAddresses.Count -gt 0)

{

Write-Progress @VNICProgressParams -Status ‘Setting IP and subnet masks’ -PercentComplete 40

foreach($IPAddressData in $VNIC.IPAddresses)

{

Write-Verbose -Message (‘Setting IP address {0}’ -f $IPAddressData.IPAddress)

$OutNull = New-NetIPAddress -InterfaceIndex $NewNicConfig.InterfaceIndex -IPAddress $IPAddressData.IPAddress -PrefixLength $IPAddressData.PrefixLength -SkipAsSource $IPAddressData.SkipAsSource -ErrorAction Continue

}

Write-Progress @VNICProgressParams -Status ‘Setting DNS registration behavior’ -PercentComplete 41

Set-CimAdapterProperty -InputObject $NewNicConfig -MethodName ‘SetDynamicDNSRegistration’ `

-Arguments @{ FullDNSRegistrationEnabled = $VNIC.NetAdapterConfiguration.FullDNSRegistrationEnabled; DomainDNSRegistrationEnabled = $VNIC.NetAdapterConfiguration.DomainDNSRegistrationEnabled } `

-Activity (‘Setting DNS registration behavior (dynamic registration: {0}, with domain name: {1}) on {2}’ -f $VNIC.NetAdapterConfiguration.FullDNSRegistrationEnabled, $VNIC.NetAdapterConfiguration.DomainDNSRegistrationEnabled, $NewNic.Name) `

-Url ‘https://docs.microsoft.com/en-us/windows/win32/cimwin32prov/setdynamicdnsregistration-method-in-class-win32-networkadapterconfiguration’

foreach($GatewayData in $VNIC.Gateways)

{

Write-Verbose -Message (‘Setting gateway address {0}’ -f $GatewayData.NextHop)

$OutNull = New-NetRoute -InterfaceIndex $NewNicConfig.InterfaceIndex -DestinationPrefix $GatewayData.DestinationPrefix -NextHop $GatewayData.NextHop -RouteMetric $GatewayData.RouteMetric

}

Write-Progress @VNICProgressParams -Status ‘Setting gateways’ -PercentComplete 42

if ($VNIC.NetAdapterConfiguration.DefaultIPGateway)

{

Set-CimAdapterProperty -InputObject $NewNicConfig -MethodName ‘SetGateways’ `

-Arguments @{ DefaultIPGateway = $VNIC.NetAdapterConfiguration.DefaultIPGateway } `

-Activity (‘Setting gateways {0} on {1}’  -f $VNIC.NetAdapterConfiguration.DefaultIPGateway, $NewNic.Name) `

-Url ‘https://docs.microsoft.com/en-us/windows/win32/cimwin32prov/setgateways-method-in-class-win32-networkadapterconfiguration’

}

if($VNIC.NetAdapterConfiguration.DNSDomain)

{

Write-Progress @VNICProgressParams -Status ‘Setting DNS domain’ -PercentComplete 43

Set-CimAdapterProperty -InputObject $NewNicConfig -MethodName ‘SetDNSDomain’ `

-Arguments @{ DNSDomain = $VNIC.NetAdapterConfiguration.DNSDomain } `

-Activity (‘Setting DNS domain {0} on {1}’ -f $VNIC.NetAdapterConfiguration.DNSDomain, $NewNic.Name) `

-Url ‘https://docs.microsoft.com/en-us/windows/win32/cimwin32prov/setdnsdomain-method-in-class-win32-networkadapterconfiguration’

}

if ($VNIC.NetAdapterConfiguration.DNSServerSearchOrder)

{

Write-Progress @VNICProgressParams -Status ‘Setting DNS servers’ -PercentComplete 44

Set-CimAdapterProperty -InputObject $NewNicConfig -MethodName ‘SetDNSServerSearchOrder’ `

-Arguments @{ DNSServerSearchOrder = $VNIC.NetAdapterConfiguration.DNSServerSearchOrder } `

-Activity (‘setting DNS servers {0} on {1}’ -f [String]::Join(‘, ‘, $VNIC.NetAdapterConfiguration.DNSServerSearchOrder), $NewNic.Name) `

-Url ‘https://docs.microsoft.com/en-us/windows/win32/cimwin32prov/setdnsserversearchorder-method-in-class-win32-networkadapterconfiguration’

}

if($VNIC.NetAdapterConfiguration.WINSPrimaryServer)

{

Write-Progress @VNICProgressParams -Status ‘Setting WINS servers’ -PercentComplete 45

Set-CimAdapterProperty -InputObject $NewNicConfig -MethodName ‘SetWINSServer’ `

-Arguments @{ WINSPrimaryServer = $VNIC.NetAdapterConfiguration.WINSPrimaryServer; WINSSecondaryServer = $VNIC.NetAdapterConfiguration.WINSSecondaryServer } `

-Activity (‘Setting WINS servers {0} on {1}’ -f ([String]::Join(‘, ‘, $VNIC.NetAdapterConfiguration.WINSPrimaryServer, $VNIC.NetAdapterConfiguration.WINSSecondaryServer)), $NewNic.Name) `

-Url ‘https://docs.microsoft.com/en-us/windows/win32/cimwin32prov/setwinsserver-method-in-class-win32-networkadapterconfiguration’

}

}

if($VNIC.NetAdapterConfiguration.TcpipNetbiosOptions) # defaults to 0

{

Write-Progress @VNICProgressParams -Status ‘Setting NetBIOS over TCP/IP behavior’ -PercentComplete 50

Set-CimAdapterProperty -InputObject $NewNicConfig -MethodName ‘SetTcpipNetbios’ `

-Arguments @{ TcpipNetbiosOptions = $VNIC.NetAdapterConfiguration.TcpipNetbiosOptions } `

-Activity (‘Setting NetBIOS over TCP/IP behavior on {0} to {1}’ -f $NewNic.Name, $VNIC.NetAdapterConfiguration.TcpipNetbiosOptions) `

-Url ‘https://docs.microsoft.com/en-us/windows/win32/cimwin32prov/settcpipnetbios-method-in-class-win32-networkadapterconfiguration’

}

Write-Progress @VNICProgressParams -Status ‘Applying advanced properties’ -PercentComplete 60

$NewNicAdvancedProperties = Get-AdvancedSettingsFromAdapterConfig -AdapterConfig $NewNicConfig

$PropertiesCounter = 0

$PropertyProgressParams = @{Activity = ‘Processing VNIC advanced properties’; ParentId = 3; Id=4 }

foreach($SourceAdvancedProperty in $VNIC.AdvancedProperties)

{

foreach($NewNicAdvancedProperty in $NewNicAdvancedProperties)

{

if($SourceAdvancedProperty.ElementName -eq $NewNicAdvancedProperty.ElementName)

{

$PropertiesCounter++

Write-Progress @PropertyProgressParams -PercentComplete ($PropertiesCounter / $VNIC.AdvancedProperties.Count * 100) -Status (‘Applying property {0}’ -f $SourceAdvancedProperty.DisplayName)

Write-Verbose (‘Setting advanced property {0} to {1} on {2}’ -f $SourceAdvancedProperty.DisplayName, $SourceAdvancedProperty.DisplayValue, $VNIC.Name)

$NewNicAdvancedProperty.RegistryValue = $SourceAdvancedProperty.RegistryValue

Set-CimInstance -InputObject $NewNicAdvancedProperty -ErrorAction Continue

}

}

}

}

Write-Progress @VNICProgressParams -Completed

Write-Progress @SwitchProgressParams -Status ‘Reconnecting guest vNICs’ -PercentComplete 80

if($GuestVNICs)

{

foreach ($GuestVNIC in $GuestVNICs)

{

try

{

Connect-VMNetworkAdapter -VMNetworkAdapter $GuestVNIC -VMSwitch $NewSwitch

}

catch

{

Write-Error -Message (‘Failed to connect virtual adapter “{0}” with MAC address “{1}” to virtual switch “{2}”: {3}’ -f $GuestVNIC.Name, $GuestVNIC.MacAddress, $NewSwitch.Name, $_.Exception.Message) -ErrorAction Continue

}

}

}

Write-Progress @SwitchProgressParams -Completed

}

}

if($IsClustered)

{

Write-Verbose -Message ‘Resuming cluster node’

$OutNull = Invoke-CimMethod -InputObject $ClusterNode -MethodName ‘Resume’ -Arguments @{FailbackType=1}

}

}

Author: Eric Siron

How to Use Failover Clusters with 3rd Party Replication

In this second post, we will review the different types of replication options and give you guidance on what you need to ask your storage vendor if you are considering a third-party storage replication solution.

If you want to set up a resilient disaster recovery (DR) solution for Windows Server and Hyper-V, you’ll need to understand how to configure a multi-site cluster as this also provides you with local high-availability. In the first post in this series, you learned about the best practices for planning the location, node count, quorum configuration and hardware setup. The next critical decision you have to make is how to maintain identical copies of your data at both sites, so that the same information is available to your applications, VMs, and users.

Multi-Site Cluster Storage Planning

All Windows Server Failover Clusters require some type of shared storage to allow an application to run on any host and access the same data. Multi-site clusters behave the same way, but they require multiple independent storage arrays at each site, with the data replicated between them. The data for the clustered application or virtual machine (VM) on each site should use its own local storage array, or it could have significant latency if each disk IO operation had to go to the other location.

If you are running Hyper-V VMs on your multi-site cluster, you may wish to use Cluster Shared Volumes (CSV) disks. This type of clustered storage configuration is optimized for Hyper-V and allows multiple virtual hard disks (VHDs) to reside on the same disk while allowing the VMs to run on different nodes. The challenge when using CSV in a multi-site cluster is that the VMs must make sure that they are always writing to their disk in their site, and not the replicated copy. Most storage providers offer CSV-aware solutions, and you must make sure that they explicitly support multi-site clustering scenarios. Often the vendors will force writes at the primary site by making the CSV disk at the second site read-only, to ensure that the correct disks are always being used.
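
As a quick reference, turning an available clustered disk into a CSV is a one-liner with the FailoverClusters PowerShell module (a sketch; the disk resource name is a placeholder):

# Convert an available clustered disk into a Cluster Shared Volume
Add-ClusterSharedVolume -Name 'Cluster Disk 1'

# Verify the CSV and which node currently owns it
Get-ClusterSharedVolume | Format-Table Name, State, OwnerNode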

Understanding Synchronous and Asynchronous Replication

As you progress in planning your multi-site cluster you will have to select how your data is copied between sites, either synchronously or asynchronously. With asynchronous replication, the application will write to the clustered disk at the primary site, then at regular intervals, the changes will be copied to the disk at the secondary site. This usually happens every few minutes or hours, but if a site fails between replication cycles, then any data from the primary site which has not yet been copied to the secondary site will be lost. This is the recommended configuration for applications that can sustain some amount of data loss, and this generally does not impose any restrictions on the distance between sites. The following image shows the asynchronous replication cycle.

Asynchronous Replication in a Multi-Site Cluster

Asynchronous Replication in a Multi-Site Cluster

With synchronous replication, whenever a disk write command occurs on the primary site, it is copied to the secondary site, and an acknowledgment is returned to both the primary and secondary storage arrays before that write is committed. Synchronous replication ensures consistency between both sites and avoids data loss in the event of a crash. The challenge of writing to two sets of disks in different locations is that the physical distance between the sites must be short, or it can affect the performance of the application. Even with a high-bandwidth and low-latency connection, synchronous replication is usually recommended only for critical applications that cannot sustain any data loss, and this should be weighed against the location of your secondary site. The following image shows the synchronous replication cycle.

Synchronous Replication in a Multi-Site Cluster

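Before committing to synchronous replication, it is worth measuring whether the inter-site link can actually sustain it. Again using Storage Replica only as an illustration (your storage vendor will have its own equivalent assessment tooling), here is a sketch with placeholder names:

# Measure whether the inter-site link can sustain synchronous replication;
# this produces an HTML report with observed latency, bandwidth and IOPS.
Test-SRTopology `
    -SourceComputerName "NodeA1" -SourceVolumeName "D:" -SourceLogVolumeName "L:" `
    -DestinationComputerName "NodeB1" -DestinationVolumeName "D:" -DestinationLogVolumeName "L:" `
    -DurationInMinutes 30 -ResultPath "C:\Temp"

# If the link qualifies, the same partnership can be created synchronously
New-SRPartnership `
    -SourceComputerName "NodeA1" -SourceRGName "RG-SiteA" `
    -SourceVolumeName "D:" -SourceLogVolumeName "L:" `
    -DestinationComputerName "NodeB1" -DestinationRGName "RG-SiteB" `
    -DestinationVolumeName "D:" -DestinationLogVolumeName "L:" `
    -ReplicationMode Synchronous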

As you continue to evaluate different storage vendors, you may also want to assess the granularity of their replication solution. Most of the traditional storage vendors will replicate data at the block-level, which means that they track specific segments of data on the disk which have changed since the last replication. This is usually fast and works well with larger files (like virtual hard disks or databases), as only blocks that have changed need to be copied to the secondary site. Some examples of integrated block-level solutions include HP’s Cluster Extension, Dell/EMC’s Cluster Enabler (SRDF/CE for DMX, RecoverPoint for CLARiiON), Hitachi’s Storage Cluster (HSC), NetApp’s MetroCluster, and IBM’s Storage System.

There are also some storage vendors that provide file-based replication solutions which can run on top of commodity storage hardware. These providers keep track of the individual files that have changed and only copy those. They are often less efficient than the block-level replication solutions because larger chunks of data (full files) must be copied; however, the total cost of ownership can be much lower. A few of the top file-level vendors that support multi-site clusters include Symantec’s Storage Foundation High Availability, Sanbolic’s Melio, SIOS’s DataKeeper Cluster Edition, and Vision Solutions’ Double-Take Availability.

The final class of replication providers abstracts the underlying storage arrays at each site behind a software or appliance layer that manages disk access and redirects I/O to the correct location. The more popular solutions include EMC’s VPLEX, FalconStor’s Continuous Data Protector, and DataCore’s SANsymphony. Almost all of the block-level, file-level, and appliance-level providers are compatible with CSV disks, but it is best to check that they support the latest version of Windows Server if you are planning a fresh deployment.

By now you should have a good understanding of how you plan to configure your multi-site cluster and your replication requirements, so you can move on to planning your backup and recovery process. Even though the application’s data is being copied to the secondary site, replication is not a substitute for a backup: if a VM’s virtual hard disk (VHD) becomes corrupted at one site, that corruption will likely be replicated to the secondary site as well. You should still regularly back up any production workloads running at either site. This means deploying your cluster-aware backup software and agents in both locations and ensuring that they take backups on a regular schedule. The backups should also be stored independently at both sites so that they can be recovered from either location if one datacenter becomes unavailable, and testing recovery from both sites is strongly recommended. Altaro’s Hyper-V Backup is a great solution for multi-site clusters and is CSV-aware, ensuring that your disaster recovery solution is resilient to all types of disasters.
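A simple recovery drill is to periodically fail a clustered VM over to a node in the secondary site, confirm it comes online with current data, and then move it back. A minimal PowerShell sketch, assuming a hypothetical VM role named SQL-VM1 and nodes NodeA1 and NodeB1:

# Move a clustered VM to a node in the secondary site as a controlled DR drill.
# "SQL-VM1", "NodeA1" and "NodeB1" are placeholders for your own role and node names.
Move-ClusterVirtualMachineRole -Name "SQL-VM1" -Node "NodeB1" -MigrationType Quick

# Confirm the role came online at the other site
Get-ClusterGroup -Name "SQL-VM1" | Format-Table Name, OwnerNode, State

# When the drill is complete, move the role back to its preferred site
Move-ClusterVirtualMachineRole -Name "SQL-VM1" -Node "NodeA1" -MigrationType Live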

If you are looking for a more affordable multi-site cluster replication solution, only have a single datacenter, or your storage provider does not support these scenarios, Microsoft offers a few solutions. This includes Hyper-V Replica and Azure Site Recovery, and we’ll explore these disaster recovery options and how they integrate with Windows Server Failover Clustering in the third part of this blog series.

Let us know if you have any questions in the comments form below!


Author: Symon Perriman