류짱:Beyond MySelf

Windows Server 2008 Failover Cluster 네트워크 본문

Microsoft/Failover Cluster

Windows Server 2008 Failover Cluster 네트워크

リュちゃん 2012. 3. 13. 12:17

 

Windows Server 2008 Failover Cluster networking features

A new cluster network driver architecture
The ability to locate cluster nodes on different, routed networks in support of multi-site clusters
Support for DHCP assigned IP addresses
Improvements to the cluster health monitoring (heartbeat) mechanism
Support for IPv6

New cluster network driver architecture
The legacy cluster network driver (clusnet.sys) has been replaced with a new NDIS level driver called the Microsoft Failover Cluster Virtual Adapter (netft.sys)

The goal of the new driver model is to sustain TCP/IP connectivity between two or more systems despite the failure of any component in the network path. This goal can be achieved provided at least one alternate physical path is available. In other words, a network component failure (NIC, router, switch, hub, etc…) should not cause inter-node cluster communications to break down, and communication should continue making progress in a timely manner (i.e. it may have a slower response but it will still exist) as long as an alternate physical route (link) is still available.

 If cluster communications cannot proceed on one network, the switchover to another cluster-enabled network is automatic. This is one of the primary reasons that each cluster node must have multiple network adapters available to support cluster communications and each one should be connected to different switches.

The failover cluster virtual adapter is implemented as an NDIS miniport adapter that pairs an internally constructed virtual route with each network found in a cluster node. The physical network adapters are exposed at the IP layer on each node. The NETFT driver transfers packets (cluster communications) on the virtual adapter by tunneling through the best available route in its internal routing table

the default port for cluster communication is still TCP\UDP: 3343.

When the cluster service starts, and a node either Forms or Joins a cluster, NETFT, along with other components, is responsible for determining the node’s network configuration and connectivity with other nodes in the cluster. One of the first actions is establishing connectivity with the Microsoft Failover Cluster Virtual Adapter on all nodes in the cluster.

Beginning with Windows Server 2008 failover clustering, individual cluster nodes can be located on separate, routed networks. This requires that resources that depend on IP Address resources (i.e. Network Name resources), implement an OR logic since it is unlikely that every cluster node will have a direct local connection to every network the cluster is aware of. This facilitates IP Address and hence Network Name resources coming online when services\applications failover to remote nodes. Here is an example (Figure 10) of the dependencies for the cluster name on a machine connected to two different networks.

All IP addresses associated with a Network Name resource, which come online, will be dynamically registered in DNS (if configured for dynamic updates). This is the default behavior. If the preferred behavior is to register all IP addresses that a Network Name depends on, then a private property of the Network Name resource must be modified. This private property is called RegisterAllProvidersIP (Figure 11). If this property is set equal to 1, all IP addresses will be registered in DNS and the DNS server will return the list of IP addresses associated with the A-Record to the client.

Since cluster nodes can be located on different, routed networks, and the communication mechanisms have been changed to use reliable session protocols implemented over UDP (unicast), the networking requirements for Geographically Dispersed (Multi-Site) Clusters have changed. In previous versions of Microsoft clustering, all cluster nodes had to be located on the same network. This required ‘stretched’ VLANs be implemented when configuring multi-site clusters. Beginning with Windows Server 2008, this requirement is no longer necessary in all scenarios.

Improvements to the cluster ‘heartbeat’ mechanism

The cluster ‘heartbeat’, or health checking mechanism, has changed in Windows Server 2008. While still using port 3343, it is no longer a broadcast communication It is now unicast in nature and uses a Request-Reply type process. This provides for higher security and more reliable packet accountability.

The default configuration (shown here) means the cluster service will wait 5.0 seconds before considering a cluster node to be unreachable and have to regroup to update the view of the cluster (One heartbeat sent every second for five seconds). The limits on these settings are shown in Figure 17. Make changes to the appropriate settings depending on the scenario. The CrossSubnetDelay and CrossSubnetThreshold settings are typically used in multi-site scenarios where WAN links may exhibit higher than normal latency.

These settings allow for the heartbeat mechanism to be more ‘tolerant’ of networking delays. Modifying these settings, while a worthwhile test as part of a troubleshooting procedure (discussed later), should not be used as a substitute for identifying and correcting network connection delays.