
Hyper-V Quick Tip: Safely Shutdown a Guest with Unresponsive Mouse

Q: A virtual machine running Windows under Hyper-V does not respond to mouse commands in VMConnect. How do I safely shut it down?

A: Use a combination of VMConnect’s key actions and native Windows key sequences to shut down.

Ordinarily, you would use one of Hyper-V's various "Shut Down" commands to instruct a virtual machine to gracefully shut down the guest operating system. Alternatively, you can use the guest operating system's native techniques for shutting down. In Windows guests running the full desktop experience, the mouse provides the easiest way. However, if the guest's integration services stop working, the mouse becomes inoperable in VMConnect. The keyboard continues to work, of course.

Shutting Down a Windows Guest of Hyper-V Using the Keyboard

Your basic goal is to reach a place where you can issue the shutdown command.

Tip: Avoid using the mouse on the VMConnect window at all. It will bring up the prompt about the mouse each time unless you disable it. Clicking on VMConnect’s title bar will automatically set focus so that it will send most keypresses into the guest operating system. You cannot send system key combinations or anything involving the physical Windows key (otherwise, these directions would be a lot shorter).

  1. First, you need to log in. Windows 10/Windows Server 2016 and later no longer require any particular key sequence; pressing any key while VMConnect has focus should bring up the login prompt. Windows 8.1/Windows Server 2012 R2 and earlier all require a CTRL+ALT+DEL sequence before you can log in. For those, click Action on VMConnect's menu bar, then click Ctrl+Alt+Delete. If your VMConnect session is running locally, you can press CTRL+ALT+END on your physical keyboard instead. However, that won't work within a remote desktop session.

    You can also press the related button on VMConnect’s button bar immediately below the text menu. It’s the button with three small boxes. In the screenshot above, look directly to the left of the highlighted text.
  2. Log in with valid credentials. Your virtual machine’s network likely does not work either, so you may need to use local credentials.
  3. Use the same sequences from step 1 to send a CTRL+ALT+DEL sequence to the guest.
  4. In the overlay that appears, use the physical down or up arrow key until Task Manager is selected, then press Enter. The screen will look different on versions prior to 10/2016 but will function the same.
  5. Task Manager should appear as the top-most window. If it does, proceed to step 6.
    If it does not, then you might be out of luck. If you can see enough of Task Manager to identify the window that obscures it, or if you’re just plain lucky, you can close the offending program. If you want, you can just proceed to step 6 and try to run these steps blindly.
    1. Press the TAB key. That will cause Task Manager to switch focus to its processes list.
    2. Press the up or down physical arrow keys to cycle through the running processes.
    3. Press Del to close a process.
  6. Press ALT+F to bring up Task Manager’s file menu. Press Enter or N for Run new task (wording is different on earlier versions of Windows).
  7. In the Create new task dialog, type shutdown /s /t 0. If your display does not distinguish, that's a zero at the end, not the letter O. Shutting down from within the console typically does not require administrative access, but if you'd like, you can press Tab to set focus to the Create this task with administrative privileges box and then press the Spacebar to check it. Press Enter to run the command (or Tab to the OK button and press Enter).

Once you’ve reached step 7, you have other options. You can enter cmd to bring up a command prompt or powershell for a PowerShell prompt. If you want to tinker with different options for the shutdown command, you can do that as well. If you would like to get into Device Manager to see if you can sort out whatever ails the integration services, run devmgmt.msc (use the administrative privileges checkbox for best results).
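
For reference, a few common variations of the shutdown command (all standard Windows switches):

shutdown /s /t 0
shutdown /s /t 0 /f
shutdown /r /t 0
shutdown /a

The first shuts down immediately; adding /f forces running applications to close; /r restarts instead of shutting down; and /a aborts a shutdown that is still counting down its timer.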

Be aware that this generally will not fix anything. Whatever prevented the integration services from running will likely continue. However, your guest won’t suffer any data loss. So, you could connect its VHDX file(s) to a healthy virtual machine for troubleshooting. Or, if the problem is environmental, you can safely relocate the virtual machine to another host.

More Hyper-V Quick Tips

How Many Cluster Networks Should I Use?

How to Choose a Live Migration Performance Solution

How to Enable Nested Virtualization

Have you run into this issue yourself? Were you able to navigate around it? What was your solution? Let us know in the comments section below!

Hyper-V Quick Tip: How to Enable Nested Virtualization

Q: How Do I Enable Nested Virtualization for Hyper-V Virtual Machines?

A: Pass $true for Set-VMProcessor’s “ExposeVirtualizationExtensions” parameter

In its absolute simplest form:
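
For example, with a hypothetical virtual machine named svtest:

Set-VMProcessor svtest -ExposeVirtualizationExtensions $true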

Set-VMProcessor has several other parameters which you can view in its online help.

As shown above, the first parameter is positional, meaning that it guesses that I supplied a virtual machine’s name because it’s in the first slot and I didn’t tell it otherwise. For interactive work, that’s fine. In scripting, try to always fully-qualify every parameter so that you and other maintainers don’t need to guess:
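
The same hypothetical command with every parameter fully qualified:

Set-VMProcessor -VMName svtest -ExposeVirtualizationExtensions $true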

The Set-VMProcessor cmdlet also accepts pipeline input. Therefore, you can do things like:
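
For instance, to enable nested virtualization on a group of VMs at once (the name pattern nested* is made up for illustration):

Get-VM -Name nested* | Set-VMProcessor -ExposeVirtualizationExtensions $true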

Requirements for Nested Virtualization

In order for nested virtualization to work, you must meet all of the following:

  • The Hyper-V host must be running at least the Windows 10 Anniversary Update (version 1607), Windows Server 2016, Hyper-V Server 2016, or Windows Server Semi-Annual Channel
  • The Hyper-V host must be using Intel CPUs. AMD is not yet supported
  • A virtual machine must be off to have its processor extensions changed

No configuration changes are necessary for the host.

Microsoft only guarantees that you can run Hyper-V nested within Hyper-V. Other hypervisors may work, but you will not receive support either way. You may have mixed results trying to run different versions of Hyper-V. I am unaware of any support statement on this, but I’ve had problems running mismatched levels of major versions.

Memory Changes for Nested Virtual Machines

Be aware that a virtual machine with virtualization extensions exposed will always use its configured value for Startup memory. You cannot use Dynamic Memory, nor can you change the virtual machine’s fixed memory while it is running.

Remember, as always, I’m here to help, so send me any questions you have on this topic using the question form below and I’ll get back to you as soon as I can.

More Hyper-V Quick Tips from Eric:

Hyper-V Quick Tip: How to Choose a Live Migration Performance Solution

Hyper-V Quick Tip: How Many Cluster Networks Should I Use?

How to Create Automated Hyper-V Performance Reports

Wouldn’t it be nice to periodically get an automatic performance review of your Hyper-V VMs? Well, this blog post shows you how to do exactly that.

Hyper-V Performance Counters & Past Material

Over the last few weeks, I’ve been working with Hyper-V performance counters and PowerShell, developing new reporting tools. I thought I’d write about Hyper-V Performance counters here until I realized I already have.
https://www.altaro.com/hyper-v/performance-counters-hyper-v-and-powershell-part-1/
https://www.altaro.com/hyper-v/hyper-v-performance-counters-and-powershell-part-2/
Even though I wrote these articles several years ago, nothing has really changed. If you aren’t familiar with Hyper-V performance counters I encourage you to take a few minutes and read these. Otherwise, some of the material in this article might not make sense.

Get-CimInstance

Normally, using Get-Counter is a better approach, especially if you want to watch performance over a given timespan. But sometimes you just want a quick point-in-time snapshot. Or you may have network challenges. As far as I can tell, Get-Counter uses legacy networking protocols, i.e. RPC and DCOM, which are not very firewall friendly. You could use PowerShell Remoting and Invoke-Command to run Get-Counter on the remote server. Or you can use Get-CimInstance, which is what I want to cover in this article.

When you run Get-Counter, you are actually querying performance counter classes in WMI. This means you can get the same information using Get-CimInstance, or Get-WmiObject. But because we want to leverage WSMan and PowerShell Remoting, we’ll stick with the former.

Building a Hyper-V Performance Report

First, we need to identify the counter classes. I’ll focus on the classes that have “cooked” or formatted data.

I’m setting a variable for the computername so that you can easily re-use the code. I’m demonstrating this on a Windows 10 desktop running Hyper-V but you can just as easily point $Computer to a Hyper-V host.
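
A sketch of that discovery step follows. The class-name filter is my assumption about how the Hyper-V counter classes are named on a typical host; adjust the pattern if yours differ.

$Computer = $env:COMPUTERNAME
Get-CimClass -ComputerName $Computer -ClassName Win32_PerfFormattedData_* |
    Where-Object CimClassName -Match 'HyperV' |
    Select-Object -ExpandProperty CimClassName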

It is pretty easy to leverage the PowerShell pipeline and create a report for all Hyper-V performance counters.
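
One possible version of that pipeline, dumping every instance of every matching class to a text file (the output path and the class filter are my own choices for illustration):

$Classes = Get-CimClass -ComputerName $Computer -ClassName Win32_PerfFormattedData_* |
    Where-Object CimClassName -Match 'HyperV'
$Classes | ForEach-Object {
    "==== $($_.CimClassName) ===="
    Get-CimInstance -ComputerName $Computer -ClassName $_.CimClassName | Format-List -Property * | Out-String
} | Out-File -FilePath .\HyperV-PerfReport.txt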

The text file will list each performance counter class followed by all instances of that class. If you run this code, you’ll see there are a number of properties that won’t have any values. It might help to filter those out. Here’s a snippet of code that is a variation on the text file. This code creates an HTML report, skipping properties that likely will have no value.

This code creates an HTML report using fragments. I also am dynamically deciding to create a table or a list based on the number of properties.
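
Here is a hedged sketch of that approach, continuing with the $Computer and $Classes variables from above. The property filter and the ten-property cutoff for choosing a table versus a list are my guesses, not the article's exact values.

$Frags = foreach ($Class in $Classes) {
    $Instances = Get-CimInstance -ComputerName $Computer -ClassName $Class.CimClassName
    if (-not $Instances) { continue }
    # keep only properties that actually contain data
    $Props = $Instances[0].CimInstanceProperties |
        Where-Object { $null -ne $_.Value -and $_.Name -notmatch '^(Caption|Description)$' } |
        Select-Object -ExpandProperty Name
    # wide objects read better as a list; narrow ones as a table
    $As = if ($Props.Count -gt 10) { 'List' } else { 'Table' }
    $Instances | Select-Object -Property $Props |
        ConvertTo-Html -Fragment -As $As -PreContent "<h2>$($Class.CimClassName)</h2>"
}
ConvertTo-Html -Body $Frags -Title 'Hyper-V Performance' | Out-File -FilePath .\HyperV-PerfReport.html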

HTML Performance Counter Report

Thus far I’ve been creating reports for all performance counters and all instances. But you might only be interested in a single virtual machine. This is a situation where you can take advantage of WMI filtering.

In looking at the output from all classes, I can see that the Name property on these classes can include the virtual machine name as part of the value. So I will go through every class and filter only for instances that contain the name of VM I want to monitor.

This example also adds a footer to the report showing when it was created.
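
A sketch of that per-VM variation; the VM name is a placeholder, and the WQL filter simply looks for instance names that contain it:

$VMName = 'SQLVM'
$Frags = foreach ($Class in $Classes) {
    $Instances = Get-CimInstance -ComputerName $Computer -ClassName $Class.CimClassName -Filter "Name LIKE '%$VMName%'"
    if ($Instances) {
        $Instances | ConvertTo-Html -Fragment -PreContent "<h2>$($Class.CimClassName)</h2>"
    }
}
$Footer = "<p><i>Report created $(Get-Date)</i></p>"
ConvertTo-Html -Body $Frags -PostContent $Footer -Title "Hyper-V Performance: $VMName" |
    Out-File -FilePath ".\$VMName-Perf.html"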

HTML Performance Report for a Single VM

It doesn’t take much more effort to create a report for each virtual machine. I turned my code into the beginning of a usable PowerShell function, designed to take pipeline input.
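
The beginnings of such a function might look like the sketch below, built on the same assumptions as the snippets above (the function name is mine):

Function New-HVPerfReport {
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory, ValueFromPipeline, ValueFromPipelineByPropertyName)]
        [Alias('Name')]
        [string]$VMName,
        [string]$ComputerName = $env:COMPUTERNAME,
        [string]$Path = '.'
    )
    Process {
        $Frags = foreach ($Class in (Get-CimClass -ComputerName $ComputerName -ClassName Win32_PerfFormattedData_* |
            Where-Object CimClassName -Match 'HyperV')) {
            $Instances = Get-CimInstance -ComputerName $ComputerName -ClassName $Class.CimClassName -Filter "Name LIKE '%$VMName%'"
            if ($Instances) {
                $Instances | ConvertTo-Html -Fragment -PreContent "<h2>$($Class.CimClassName)</h2>"
            }
        }
        $File = Join-Path -Path $Path -ChildPath "$VMName-Perf.html"
        ConvertTo-Html -Body $Frags -Title "Hyper-V Performance: $VMName" -PostContent "<p><i>Created $(Get-Date)</i></p>" |
            Out-File -FilePath $File
        # emit the report file object so the caller can see what was produced
        Get-Item -Path $File
    }
}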

With this function, I can query as many virtual machines and create a performance report for each.
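
For example (the VM names, host name, and report path are placeholders):

Get-VM | New-HVPerfReport -Path C:\Reports
'SQLVM','WEBVM' | New-HVPerfReport -ComputerName HV01 -Path C:\Reports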

You can take this idea a step further and run this as a PowerShell scheduled job, perhaps saving the report files to an internal team web server.
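
One way to schedule it, assuming the function is saved in a script file and the paths below are adjusted to your environment:

Register-ScheduledJob -Name HyperVPerfReports -Trigger (New-JobTrigger -Daily -At '6:00 AM') -ScriptBlock {
    . C:\Scripts\New-HVPerfReport.ps1
    Get-VM | New-HVPerfReport -Path '\\teamweb\reports\hyperv'
}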

I have at least one other intriguing PowerShell technique for working with Hyper-V performance counters, but I think I’ve given you enough to work with for today so I’ll save it until next time.

Wrap-Up

Did you find this useful? Have you done something similar and used a different method? Let us know in the comments section below!

Thanks for Reading!

Manage all your Hyper-V snapshots with PowerShell


It's much easier to manage Hyper-V snapshots using PowerShell than a GUI because PowerShell offers greater flexibility. Once you're familiar with the basic commands, you'll be equipped to oversee and change the state of the VMs in your virtual environment.

PowerShell not only reduces the time it takes to perform a task using a GUI tool, but it also reduces the time it takes to perform repeated tasks. For example, if you want to see the memory configured on all Hyper-V VMs, a quick PowerShell command or script is easier to execute than checking VMs one by one. Similarly, you can perform operations related to Hyper-V snapshots using PowerShell.

A snapshot — or checkpoint, depending on which version of Windows Server you have — is a point-in-time picture of a VM that you can use to restore that VM to the state it was in when the snapshot was taken. For example, if you face issues when updating Windows VMs and they don’t restart properly, you can restore VMs to the state they were in before you installed the updates.

Similarly, developers can use checkpoints to quickly perform application tests.

Before Windows Server 2012 R2, Microsoft didn’t support snapshots for production use. But starting with Windows Server 2012 R2, snapshots have been renamed checkpoints and are well-supported in a production environment.

PowerShell commands for Hyper-V snapshots and checkpoints

Microsoft offers a few PowerShell commands to work with Hyper-V checkpoints and snapshots, such as Checkpoint-VM, Get-VMSnapshot, Remove-VMSnapshot, and Restore-VMSnapshot.

If you want to retrieve all the Hyper-V snapshots associated with a particular VM, all you need to do is execute the Get-VMSnapshot -VMName PowerShell command. For example, the PowerShell command below lists all the snapshots associated with SQLVM:

Get-VMSnapshot -VMName SQLVM

There are two types of Hyper-V checkpoints available: standard and production checkpoints. If you just need all the production checkpoints for a VM, execute the PowerShell command below:

Get-VMSnapshot -VMName SQLVM -SnapshotType Production

To list only the standard checkpoints, execute the following PowerShell command:

Get-VMSnapshot -VMName SQLVM -SnapshotType Standard

When it comes to creating Hyper-V checkpoints for VMs, use the Checkpoint-VM PowerShell command. For example, to take a checkpoint for a particular VM, execute the command below:

Checkpoint-VM -Name TestVM -SnapshotName TestVMSnapshot1

The above command creates a checkpoint for TestVM on the local Hyper-V server, but you can use the following command to create a checkpoint for a VM located on a remote Hyper-V server:

Get-VM SQLVM -ComputerName HyperVServer | Checkpoint-VM

There are situations where you might want to create Hyper-V checkpoints of VMs in bulk. For example, before installing an update on production VMs or upgrading line-of-business applications in a VM, you might want to create checkpoints to ensure you can successfully restore VMs to ensure business continuity. But if you have several VMs, the checkpoint process might take a considerable amount of time.

You can design a small PowerShell script to take Hyper-V checkpoints for VMs specified in a text file, as shown in the PowerShell script below:

$ProdVMs = "C:\Temp\ProdVMs.TXT"
Foreach ($ThisVM in Get-Content $ProdVMs)
{
    $ChkName = $ThisVM + "_BeforeUpdates"
    Checkpoint-VM -Name $ThisVM -SnapshotName $ChkName
}
Write-Host "Script finished creating Checkpoints for Virtual Machines."

The above PowerShell script gets VM names from the C:\Temp\ProdVMs.TXT file one by one and then runs the Checkpoint-VM PowerShell command to create the checkpoints.

To remove Hyper-V snapshots from VMs, use the Remove-VMSnapshot PowerShell command. For example, to remove a snapshot called TestSnapshot from a VM, execute the following PowerShell command:

Get-VM SQLVM | Remove-VMSnapshot -Name TestSnapshot

To remove Hyper-V checkpoints from VMs in bulk, use a variation of the same PowerShell script you used to create the checkpoints. Let's assume all the VMs are working as expected after installing the updates and you would like to remove the checkpoints. Simply execute the PowerShell script below:

$ProdVMs = "C:\Temp\ProdVMs.TXT"
Foreach ($ThisVM in Get-Content $ProdVMs)
{
    $ChkName = $ThisVM + "_BeforeUpdates"
    Get-VM $ThisVM | Remove-VMSnapshot -Name $ChkName
}
Write-Host "Script finished removing Checkpoints for Virtual Machines."

To restore Hyper-V snapshots for VMs, use the Restore-VMSnapshot PowerShell cmdlet. For example, to restore or apply a snapshot to a particular VM, use the following PowerShell command:

Restore-VMSnapshot -Name "TestSnapshot1" -VMName SQLVM -Confirm:$False

Let’s assume your production VMs aren’t starting up after installing the updates and you would like to restore the VMs to their previous states. Use the PowerShell script below and perform the restore operation:

$ProdVMs = "C:\Temp\ProdVMs.TXT"
Foreach ($ThisVM in Get-Content $ProdVMs)
{
    $ChkName = $ThisVM + "_BeforeUpdates"
    Restore-VMSnapshot -Name $ChkName -VMName $ThisVM -Confirm:$False
}
Write-Host "Script finished restoring Checkpoints for Virtual Machines."

Note that, by default, when restoring a checkpoint for a VM, the command asks for confirmation. To avoid the confirmation prompt, add the -Confirm:$False parameter to the command, as shown above.

Hyper-V Quick Tip: How to Choose a Live Migration Performance Solution

Q: Which Hyper-V Live Migration Performance Option should I choose?

A: If you have RDMA-capable physical adapters, choose SMB. Otherwise, choose Compression.

You’ve probably seen this page of your Hyper-V hosts’ Settings dialog box at some point:

(Screenshot: the Live Migration performance options page in a host's Hyper-V Settings)

The field descriptions go into some detail, but they don’t tell the entire story.
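
If you prefer to skip the GUI, Set-VMHost exposes the same choice; run it on each host in the cluster:

# most environments
Set-VMHost -VirtualMachineMigrationPerformanceOption Compression
# hosts with RDMA-capable adapters
Set-VMHost -VirtualMachineMigrationPerformanceOption SMB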

Live Migration Transport Option: TCP/IP

In this case, “TCP/IP” basically means: “don’t use compression or SMB”. Prior to 2012, this was the only mode available. A host opens up a channel to the target system on TCP port 6600 and shoots the data over as quickly as possible.

Live Migration Transport Option: Compression

Introduced in 2012, the Compression method mostly explains itself. The hosts still use port 6600, but the sender compresses the data prior to transmission. This technique has several things going for it:

  • The vast bulk of a Live Migration involves moving the virtual machine’s memory contents. Memory contents tend to compress quite readily, resulting in a substantially reduced payload size over the TCP/IP method
  • For most computing systems, the CPU cycles involved in compression are faster and cheaper than the computations needed to break down, transmit, and re-assemble multi-channel TCP/IP traffic
  • It works in any environment; no special hardware is required
  • Each virtual machine that you move simultaneously (limited by host settings) will get its own unique TCP channel. That gives the Dynamic and Hash load balancing algorithms an opportunity to use different physical pathways for simultaneous migrations

Live Migration Transport Option: SMB

Also new with 2012, the SMB transport method leverages the new capabilities of version 3 (and higher) of the SMB protocol. Two things matter for Live Migration:

  • SMB Direct: Leveraging RDMA-capable hardware, packets transmitted by SMB Direct move so quickly you’d almost think they arrived before they left. If you haven’t had a chance to see RDMA in action, you’re missing out. Unfortunately, you can’t get RDMA on the cheap.
  • SMB Multichannel: When multiple logical paths are available (as in, a single host with different IP addresses, preferably on different networks), SMB can break up traffic into multiple streams and utilize all available routes

Why Should I Favor Compression over SMB?

The SMB method sounds really good, right? Even if you can’t use SMB Direct, you get something from SMB multichannel, right? Well… no… not much. Processing of TCP/IP packets and Ethernet frames has always been intensive at scale. Ordinarily, our server computers don’t move much data so we don’t see it. However, a Live Migration pushes lots of data. Even keeping a single Ethernet stream intact and in order can cause a burden on your networking hardware. Breaking it up into multiple pieces and re-assembling everything in the correct order across multiple channels can pose a nightmare scenario. However, SMB Direct can offload enough of the basic network processing to nearly trivialize the effort. Without that aid, Compression will be faster for most people.

Should I Ever Prefer the Plain TCP/IP Method?

I have not personally encountered a scenario in which I would prefer TCP/IP over the other choices. However, it does cause the least amount of host load. If your hosts have very high normal CPU usage and you want Live Migrations to occur as discreetly as possible, choose TCP/IP. You may add in a QoS layer to tone it down further.

Your Comments

Don’t agree with my assessment or encountered situations which don’t line up with my advice? I’m happy to hear your thoughts on Live Migrations Performance Options and which are the correct solutions to choose in various circumstances. Write to me in the comments below.

If you like this Hyper-V Quick tip, check out another in the Hyper-V Quick Tip Series.

How to Architect and Implement Networks for a Hyper-V Cluster

We recently published a quick tip article recommending the number of networks you should use in a cluster of Hyper-V hosts. I want to expand on that content to make it clear why we’ve changed practice from pre-2012 versions and how we arrive at this guidance. Use the previous post for quick guidance; read this one to learn the supporting concepts. These ideas apply to all versions from 2012 onward.

Why Did We Abandon Practices from 2008 R2?

If you dig on TechNet a bit, you can find an article outlining how to architect networks for a 2008 R2 Hyper-V cluster. While it was perfect for its time, we have new technologies that make its advice obsolete. I have two reasons for bringing it up:

  • Some people still follow those guidelines on new builds — worse, they recommend it to others
  • Even though we no longer follow that implementation practice, we still need to solve the same fundamental problems

We changed practices because we gained new tools to address our cluster networking problems.

What Do Cluster Networks Need to Accomplish for Hyper-V?

Our root problem has never changed: we need to ensure that we always have enough available bandwidth to prevent choking out any of our services or inter-node traffic. In 2008 R2, we could only do that by using multiple physical network adapters and designating traffic types to individual pathways. Note: It was possible to use third-party teaming software to overcome some of that challenge, but that was never supported and introduced other problems.

Starting from our basic problem, we next need to determine how to delineate those various traffic types. That original article did some of that work. We can immediately identify what appears to be four types of traffic:

  • Management (communications with hosts outside the cluster, ex: inbound RDP connections)
  • Standard inter-node cluster communications (ex: heartbeat, cluster resource status updates)
  • Cluster Shared Volume traffic
  • Live Migration

However, it turns out that some clumsy wording caused confusion. Cluster communication traffic and Cluster Shared Volume traffic are exactly the same thing. That reduces our needs to three types of cluster traffic.

What About Virtual Machine Traffic?

You might have noticed that I didn’t say anything about virtual machine traffic above. Same would be true if you were working up a different kind of cluster, such as SQL. I certainly understand the importance of that traffic; in my mind, service traffic prioritizes above all cluster traffic. Understand one thing: service traffic for external clients is not clustered. So, your cluster of Hyper-V nodes might provide high availability services for virtual machine vmabc, but all of vmabc‘s network traffic will only use its owning node’s physical network resources. So, you will not architect any cluster networks to process virtual machine traffic.

As for preventing cluster traffic from squelching virtual machine traffic, we’ll revisit that in an upcoming section.

Fundamental Terminology and Concepts

These discussions often go awry over a misunderstanding of basic concepts.

  • Cluster Name Object: A Microsoft Failover Cluster has its own identity separate from its member nodes known as a Cluster Name Object (CNO). The CNO uses a computer name, appears in Active Directory, has an IP, and registers in DNS. Some clusters, such as SQL, may use multiple CNOs. A CNO must have an IP address on a cluster network.
  • Cluster Network: A Microsoft Failover Cluster scans its nodes and automatically creates “cluster networks” based on the discovered physical and IP topology. Each cluster network constitutes a discrete communications pathway between cluster nodes.
  • Management network: A cluster network that allows inbound traffic meant for the member host nodes and typically used as their default outbound network to communicate with any system outside the cluster (e.g. RDP connections, backup, Windows Update). The management network hosts the cluster’s primary cluster name object. Typically, you would not expose any externally-accessible services via the management network.
  • Access Point (or Cluster Access Point): The IP address that belongs to a CNO.
  • Roles: The name used by Failover Cluster Management for the entities it protects (e.g. a virtual machine, a SQL instance). I generally refer to them as services.
  • Partitioned: A status that the cluster will give to any network on which one or more nodes does not have a presence or cannot be reached.
  • SMB: ALL communications native to failover clustering use Microsoft’s Server Message Block (SMB) protocol. With the introduction of version 3 in Windows Server 2012, that now includes innate multi-channel capabilities (and more!)

Are Microsoft Failover Clusters Active/Active or Active/Passive?

Microsoft Failover Clusters are active/passive. Every node can run services at the same time as the other nodes, but no single service can be hosted by multiple nodes. In this usage, “service” does not mean those items that you see in the Services Control Panel applet. It refers to what the cluster calls “roles” (see above). Only one node will ever host any given role or CNO at any given time.

How Does Microsoft Failover Clustering Identify a Network?

The cluster decides what constitutes a network; your build guides it, but you do not have any direct input. Any time the cluster’s network topology changes, the cluster service re-evaluates.

First, the cluster scans a node for logical network adapters that have IP addresses. That might be a physical network adapter, a team’s logical adapter, or a Hyper-V virtual network adapter assigned to the management operating system. It does not see any virtual NICs assigned to virtual machines.

For each discovered adapter and IP combination on that node, it builds a list of networks from the subnet masks. For instance, if it finds an adapter with an IP of 192.168.10.20 and a subnet mask of 255.255.255.0, then it creates a 192.168.10.0/24 network.

The cluster then continues through all of the other nodes, following the same process.

Be aware that every node does not need to have a presence in a given network in order for failover clustering to identify it; however, the cluster will mark such networks as partitioned.
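
You can review what the cluster discovered with the FailoverClusters PowerShell module; for example:

Get-ClusterNetwork | Format-Table -Property Name, Address, AddressMask, State, Role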

What Happens if a Single Adapter has Multiple IPs?

If you assign multiple IPs to the same adapter, one of two things will happen. Which of the two depends on whether or not the secondary IP shares a subnet with the primary.

When an Adapter Hosts Multiple IPs in Different Networks

The cluster identifies networks by adapter first. Therefore, if an adapter has multiple IPs, the cluster will lump them all into the same network. If another adapter on a different host has an IP in one of the networks but not all of the networks, then the cluster will simply use whichever IPs can communicate.

As an example, see the following network:

The second node has two IPs on the same adapter and the cluster has added it to the existing network. You can use this to re-IP a network with minimal disruption.

A natural question: what happens if you spread IPs for the same subnet across different existing networks? I tested it a bit and the cluster allowed it and did not bring the networks down. However, it always had the functional IP pathway to use, so that doesn’t tell us much. Had I removed the functional pathways, then it would have collapsed the remaining IPs into an all-new network and it would have worked just fine. I recommend keeping an eye on your IP scheme and not allowing things like that in the first place.

When an Adapter Hosts Multiple IPs in the Same Network

The cluster will pick a single IP in the same subnet to represent the host in that network.

What if Different Adapters on the Same Host have an IP in the Same Subnet?

The same outcome occurs as if the IPs were on the same adapter: the cluster picks one to represent the cluster and ignores the rest.

The Management Network

All clusters (Hyper-V, SQL, SOFS, etc.) require a network that we commonly dub Management. That network contains the CNO that represents the cluster as a singular system. The management network has little importance for Hyper-V, but external tools connect to the cluster using that network. By necessity, the cluster nodes use IPs on that network for their own communications.

The management network will also carry cluster-specific traffic. More on that later.

Note: Hyper-V Replica uses the management network.

Cluster Communications Networks (Including Cluster Shared Volume Traffic)

A cluster communications network will carry:

  • Cluster heartbeat information. Each node must hear from every other node within a specific amount of time (1 second by default). If it does not hear from enough nodes to maintain quorum, then it will begin failover procedures. Failover is more complicated than that, but that's beyond the scope of this article.
  • Cluster configuration changes. If any configuration item changes, whether to the cluster’s own configuration or the configuration or status of a protected service, the node that processes the change will immediately transmit to all of the other nodes so that they can update their own local information store.
  • Cluster Shared Volume traffic. When all is well, this network will only carry metadata information. Basically, when anything changes on a CSV that updates its volume information table, that update needs to be duplicated to all of the other nodes. If the change occurs on the owning node, less data needs to be transmitted, but it will never be perfectly quiet. So, this network can be quite chatty, but will typically use very little bandwidth. However, if one or more nodes lose direct connectivity to the storage that hosts a CSV, all of its I/O will route across a cluster network. Network saturation will then depend on the amount of I/O the disconnected node(s) need(s).

Live Migration Networks

That heading is a bit of a misnomer. The cluster does not have its own concept of a Live Migration network per se. Instead, you let the cluster know which networks you will permit to carry Live Migration traffic. You can independently choose whether or not those networks can carry other traffic.

Other Identified Networks

The cluster may identify networks that we don’t want to participate in any kind of cluster communications at all. iSCSI serves as the most common example. We’ll learn how to deal with those.

Architectural Goals

Now we know our traffic types. Next, we need to architect our cluster networks to handle them appropriately. Let’s begin by understanding why you shouldn’t take the easy route of using a singular network. A minimally functional Hyper-V cluster only requires that “management” network. Stopping there leaves you vulnerable to three problems:

  • The cluster will be unable to select another IP network for different communication types. As an example, Live Migration could choke out the normal cluster heartbeat, causing nodes to consider themselves isolated and shut down
  • The cluster and its hosts will be unable to perform efficient traffic balancing, even when you utilize teams
  • IP-based problems in that network (even external to the cluster) could cause a complete cluster failure

Therefore, you want to create at least one other network. In the pre-2012 model we could designate specific adapters to carry specific traffic types. In the 2012 and later model, we simply create at least one more additional network to allow cluster communications but not client access. Some benefits:

  • Clusters at version 2012 or newer will automatically employ SMB multichannel. Inter-node traffic (including Cluster Shared Volume data) will balance itself without further configuration work.
  • The cluster can bypass trouble on one IP network by choosing another; you can help by disabling a network in Failover Cluster Manager
  • Better load balancing across alternative physical pathways

The Second Supporting Network… and Beyond

Creating networks beyond the initial two can add further value:

  • If desired, you can specify networks for Live Migration traffic, and even exclude those from normal cluster communications. Note: For modern deployments, doing so typically yields little value
  • If you host your cluster networks on a team, matching the number of cluster networks to physical adapters allows the teaming and multichannel mechanisms the greatest opportunity to fully balance transmissions. Note: You cannot guarantee a perfectly smooth balance

Architecting Hyper-V Cluster Networks

Now we know what we need and have a nebulous idea of how that might be accomplished. Let’s get into some real implementation. Start off by reviewing your implementation choices. You have three options for hosting a cluster network:

  • One physical adapter or team of adapters per cluster network
  • Convergence of one or more cluster networks onto one or more physical teams or adapters
  • Convergence of one or more cluster networks onto one or more physical teams claimed by a Hyper-V virtual switch

A few pointers to help you decide:

  • For modern deployments, avoid using one adapter or team for a cluster network. It makes poor use of available network resources by forcing an unnecessary segregation of traffic.
  • I personally do not recommend bare teams for Hyper-V cluster communications. You would need to exclude such networks from participating in a Hyper-V switch, which would also force an unnecessary segregation of traffic.
  • The most even and simple distribution involves a singular team with a Hyper-V switch that hosts all cluster network adapters and virtual machine adapters. Start there and break away only as necessary.
  • A single 10 gigabit adapter swamps multiple gigabit adapters. If your hosts have both, don’t even bother with the gigabit.

To simplify your architecture, decide early:

  • How many networks you will use. They do not need to have different functions. For example, the old management/cluster/Live Migration/storage breakdown no longer makes sense. One management and three cluster networks for a four-member team does make sense.
  • The IP structure for each network. For networks that will only carry cluster (including intra-cluster Live Migration) communication, the chosen subnet(s) do not need to exist in your current infrastructure. As long as each adapter in a cluster network can reach all of the others at layer 2 (Ethernet), you can invent any IP network that you want.

I recommend that you start off expecting to use a completely converged design that uses all physical network adapters in a single team. Create Hyper-V network adapters for each unique cluster network. Stop there, and make no changes unless you detect a problem.
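
A minimal sketch of that converged starting point, assuming LBFO teaming and four physical adapters (the adapter, team, switch, and vNIC names are all placeholders):

New-NetLbfoTeam -Name ConvergedTeam -TeamMembers 'NIC1','NIC2','NIC3','NIC4' -LoadBalancingAlgorithm Dynamic
New-VMSwitch -Name ConvergedSwitch -NetAdapterName ConvergedTeam -AllowManagementOS $false
Add-VMNetworkAdapter -ManagementOS -Name Management -SwitchName ConvergedSwitch
Add-VMNetworkAdapter -ManagementOS -Name Cluster1 -SwitchName ConvergedSwitch
Add-VMNetworkAdapter -ManagementOS -Name Cluster2 -SwitchName ConvergedSwitch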

Comparing the Old Way to the New Way (Gigabit)

Let’s start with a build that would have been common in 2010 and walk through our options up to something more modern. I will only use gigabit designs in this section; skip ahead for 10 gigabit.

In the beginning, we couldn’t use teaming. So, we used a lot of gigabit adapters:

There would be some variations of this. For instance, I would have added another adapter so that I could use MPIO with two iSCSI networks. Some people used Fiber Channel and would not have iSCSI at all.

Important Note: The “VMs” that you see there means that I have a virtual switch on that adapter and the virtual machines use it. It does not mean that I have created a VM cluster network. There is no such thing as a VM cluster network. The virtual machines are unaware of the cluster and they will not talk to it (if they do, they’ll use the Management access point like every other non-cluster system).

Then, 2012 introduced teaming. We could then do all sorts of fun things with convergence. My very least favorite:

This build takes teams to an excess. Worse, the management, cluster, and Live Migration teams will be idle almost all the time, meaning that 60% of this host's networking capacity will be generally unavailable.

Let's look at something a bit more common. I don't like this one, but I'm not revolted by it either:

A lot of people like that design because, so they say, it protects the management adapter from problems that affect the other roles. I cannot figure out how they perform that calculus. Teaming addresses any probable failure scenarios. For anything else, I would want the entire host to fail out of the cluster. In this build, a failure that brought the team down but not the management adapter would cause its hosted VMs to become inaccessible because the node would remain in the cluster. That’s because the management adapter would still carry cluster heartbeat information.

My preferred design follows:

Now we are architected against almost all types of failure. In a “real-world” build, I would still have at least two iSCSI NICs using MPIO.

What is the Optimal Gigabit Adapter Count?

Because we had one adapter per role in 2008 R2, we often continue using the same adapter count in our 2012+ builds. I don’t feel that’s necessary for most builds. I am inclined to use two or three adapters in data teams and two adapters for iSCSI. For anything past that, you’ll need to have collected some metrics to justify the additional bandwidth needs.

10 Gigabit Cluster Network Design

10 gigabit changes all of the equations. In reasonable load conditions, a single 10 gigabit adapter moves data more than 10 times faster than a single gigabit adapter. When using 10 GbE, you need to change your approaches accordingly. First, if you have both 10GbE and gigabit, just ignore the gigabit. It is not worth your time. If you really want to use it, then I would consider using it for iSCSI connections to non-SSD systems. Most installations relying on iSCSI-connected spinning disks cannot sustain even 2 Gbps, so gigabit adapters would suffice.

Logical Adapter Counts for Converged Cluster Networking

I didn't include the Hyper-V virtual switch in any of the above diagrams, mostly because it would have made them more confusing. However, I would use a team hosting a Hyper-V virtual switch to carry all of the necessary logical adapters. For a non-Hyper-V cluster, I would create a logical team adapter for each role. Remember that on a logical team, you can only have a single logical adapter per VLAN; the Hyper-V virtual switch has no such restriction. Also remember that you should not use multiple logical team adapters on any team that hosts a Hyper-V virtual switch, because some of that behavior is undefined and your build might not be supported.

I would always use these logical/virtual adapter counts:

  • One management adapter
  • A minimum of one cluster communications adapter up to n-1, where n is the number of physical adapters in the team. You can subtract one because the management adapter acts as a cluster adapter as well

In a gigabit environment, I would add at least one logical adapter for Live Migration. That’s optional because, by default, all cluster-enabled networks will also carry Live Migration traffic.

In a 10 GbE environment, I would not add designated Live Migration networks. It’s just logical overhead at that point.

In a 10 GbE environment, I would probably not set aside physical adapters for storage traffic. At those speeds, the differences in offloading technologies don’t mean that much.

Architecting IP Addresses

Congratulations! You’ve done the hard work! Now you just need to come up with an IP scheme. Remember that the cluster builds networks based on the IPs that it discovers.

Every network needs one IP address for each node. Any network that contains an access point will need an additional IP for the CNO. For Hyper-V clusters, you only need a management access point. The other networks don’t need a CNO.

Only one network really matters: management. Your physical nodes must use that to communicate with the “real” network beyond. Choose a set of IPs available on your “real” network.

For all the rest, the member IPs only need to be able to reach each other over layer 2 connections. If you have an environment with no VLANs, then just make sure that you pick IPs in networks that don’t otherwise exist. For instance, you could use 192.168.77.0/24 for something, as long as that’s not a “real” range on your network. Any cluster network without a CNO does not need to have a gateway address, so it doesn’t matter that those networks won’t be routable. It’s preferred, in fact.

Implementing Hyper-V Cluster Networks

Once you have your architecture in place, you only have a little work to do. Remember that the cluster will automatically build networks based on the subnets that it discovers. You only need to assign names and set them according to the type of traffic that you want them to carry. You can choose:

  • Allow cluster communication (intra-node heartbeat, configuration updates, and Cluster Shared Volume traffic)
  • Allow client connectivity to cluster resources as well as cluster communications (you cannot choose client connectivity without cluster communication)
  • Prevent participation in cluster communications (often used for iSCSI and sometimes connections to external SMB storage)

As much as I like PowerShell for most things, Failover Cluster Manager makes this all very easy. Access the Networks tree of your cluster:

I’ve already renamed mine in accordance with their intended roles. A new build will have “Cluster Network”, “Cluster Network 1”, etc. Double-click on one to see which IP range(s) it assigned to that network:

Work your way through each network, setting its name and what traffic type you will allow. Your choices:

  • Allow cluster network communication on this network AND Allow clients to connect through this network: use these two options together for the management network. If you’re building a non-Hyper-V cluster that needs access points on non-management networks, use these options for those as well. Important: The adapters in these networks SHOULD register in DNS.
  • Allow cluster network communication on this network ONLY (do not check Allow clients to connect through this network): use for any network that you wish to carry cluster communications (remember that includes CSV traffic). Optionally use for networks that will carry Live Migration traffic (I recommend that). Do not use for iSCSI networks. Important: The adapters in these networks SHOULD NOT register in DNS.
  • Do not allow cluster network communication on this network: Use for storage networks, especially iSCSI. I also use this setting for adapters that will use SMB to connect to a storage server running SMB version 3.02 in order to run my virtual machines. You might want to use it for Live Migration networks if you wish to segregate Live Migration from cluster traffic (I do not do or recommend that).
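
If you'd rather script these settings, the cluster network Role property controls the same behavior (the network names below are examples; 3 = cluster and client, 1 = cluster only, 0 = no cluster communication):

(Get-ClusterNetwork -Name 'Management').Role = 3
(Get-ClusterNetwork -Name 'Cluster1').Role = 1
(Get-ClusterNetwork -Name 'iSCSI').Role = 0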

Once done, you can configure Live Migration traffic. Right-click on the Networks node and click Live Migration Settings:

Check a network’s box to enable it to carry Live Migration traffic. Use the Up and Down buttons to prioritize.

What About Traffic Prioritization?

In 2008 R2, we had some fairly arcane settings for cluster network metrics. You could use those to adjust which networks the cluster would choose as alternatives when a primary network was inaccessible. We don’t use those anymore because SMB multichannel just figures things out. However, be aware that the cluster will deliberately choose Cluster Only networks over Cluster and Client networks for inter-node communications.

What About Hyper-V QoS?

When 2012 first debuted, it brought Hyper-V networking QoS along with it. That was some really hot new tech, and lots of us dove right in and lost a lot of sleep over finding the "best" configuration. And then, most of us realized that our clusters were doing a fantastic job balancing things out all on their own. So, I would recommend that you avoid tinkering with Hyper-V QoS unless you have tried going without and had problems. If you do need to change it, first determine which traffic needs to be throttled or boosted; do not simply start flipping switches, because the rest of us already tried that and didn't get results. When you're ready to make changes, start with this TechNet article.

Your thoughts?

Does your preferred network management system differ from mine? Have you decided to give my arrangement a try? How did you get on? Let me know in the comments below; I really enjoy hearing from you guys!