NLB Traffic Stops on all nodes when a single node is rebooted

Question 1

We had a two-node NLB cluster running IIS websites on virtual machines. Both nodes were online and the balancer functioned exactly as expected: if traffic was balanced 50/50 and you stopped or drainstopped a node, all traffic routed seamlessly to the other node.

But when I rebooted a node, even if I stopped it prior to the reboot, the OTHER node, which should have been receiving production traffic during the reboot, stopped accepting requests.

To my knowledge this was NOT how NLB is supposed to work. If I power down a node, the other nodes in the NLB cluster shouldn’t care, and should continue to accept traffic according to their port rules while the offline node reboots.

None of my port rules employed affinity, so I knew that wasn’t the issue.

So after agonizing over it a bit, I stumbled on the answer (see my posted answer).

Question 2

After some research, I’ve discovered that the issue is related to VMware and the fact that the NLB cluster is in Unicast mode.

Apparently VMware has to be configured properly to avoid issues at the switch level with the virtual MAC addresses that Unicast mode NLB clusters create. VMware recommends either configuring the NICs to accommodate Unicast mode or, better yet, simply running the NLB cluster in Multicast mode to avoid the issue entirely.
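
To picture the switch-level problem, here is a toy Python model of a learning switch. It is only a sketch of my understanding, not anything taken from VMware or NLB themselves. The assumption is that in Unicast mode both nodes share one cluster MAC and send with a masked source MAC, so the switch never learns the cluster MAC and floods cluster traffic to every node. If something announces the cluster MAC from one node's port (which, as I understand it, the hypervisor can do when a VM powers on, via its "Notify Switches" behaviour), the switch pins that MAC to a single port and the other node stops seeing traffic. The MAC addresses and port names below are made up for illustration.

```python
class LearningSwitch:
    """Toy layer-2 switch: learns source MACs, floods unknown destinations."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Learn src_mac on in_port and return the ports the frame is sent to."""
        self.mac_table[src_mac] = in_port
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        # Unknown destination: flood to every port except the one it came in on.
        return [p for p in self.ports if p != in_port]


# Hypothetical addresses, for illustration only.
CLUSTER_MAC = "02-bf-0a-00-00-64"   # shared by both nodes in Unicast mode
NODE1_MASKED = "02-01-0a-00-00-64"  # masked source MAC used by node1
NODE2_MASKED = "02-02-0a-00-00-64"  # masked source MAC used by node2
CLIENT_MAC = "00-11-22-33-44-55"
BROADCAST = "ff-ff-ff-ff-ff-ff"

switch = LearningSwitch(ports=["client", "node1", "node2"])

# Normal operation: nodes only ever send with their masked source MACs,
# so the switch never learns which port the cluster MAC lives on.
switch.receive("node1", NODE1_MASKED, CLIENT_MAC)
switch.receive("node2", NODE2_MASKED, CLIENT_MAC)
before = switch.receive("client", CLIENT_MAC, CLUSTER_MAC)

# Assumed reboot side effect: the cluster MAC is announced from node1's port.
switch.receive("node1", CLUSTER_MAC, BROADCAST)
after = switch.receive("client", CLIENT_MAC, CLUSTER_MAC)

print(before)  # frame flooded to both nodes
print(after)   # frame delivered to node1's port only
```

In this model the client's frames reach both nodes before the announcement and only node1 afterwards, which matches what I saw: the node that was supposed to stay in production went quiet while the other one rebooted.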

NLB Unicast Clusters & VMWare
