1. Building the DAG
Immediately after creation, a
new DAG is an empty container that is waiting to be filled with mailbox
servers and their databases that bring functionality to the DAG. Being
able to construct a DAG gradually over time is one of the advantages
that the implementation brings to Exchange. The alternative, which we
see in the implementation of clustered mailbox servers in Exchange 2007,
is to install all of the servers in the cluster at one time.
After the DAG is created and its properties verified, you can start to add mailbox servers to the DAG
before creating new database copies that will be managed by the
servers. This process doesn’t have to begin immediately; you can keep
an empty DAG in the organization for as long as you need.
Servers are added to and removed from the DAG using the Manage Database Availability Group Wizard through EMC or by using the Add-DatabaseAvailabilityGroupServer
and Remove-DatabaseAvailabilityGroupServer cmdlets. To begin to manage
DAG membership with EMC, go to the Mailbox section of the Organization
Configuration node and select the DAG from the Database Availability
Groups tab, then right-click to reveal the menu options for DAG
management (Figure 6).
Warning:
Microsoft recommends that you don’t install Exchange
on a domain controller. It therefore follows that they’re not
particularly excited if you add a mailbox server that happens to run on a
domain controller to a DAG. Exchange doesn’t block this action and you
can go ahead and use a domain controller if you really must; hopefully,
you will only do this in a test configuration and not in production use.
After a mailbox server is added to
a DAG, you can manage its membership in the DAG from another Exchange
server. Some administrators use Server Manager to install the Windows Failover Clustering feature (Figure 7)
before they attempt to add a server to a DAG because they want to
be sure that all of the prerequisites are in place before they
proceed.
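For administrators who would rather stage this prerequisite from the command line than from Server Manager, the following is a minimal sketch. It assumes Windows Server 2008 R2, where the ServerManager PowerShell module is available; on Windows Server 2008 the ServerManagerCmd.exe utility fills the same role.

```powershell
# Run in an elevated PowerShell session on the server that will join the DAG.
# Assumption: Windows Server 2008 R2 (the ServerManager module is not present on Windows Server 2008).
Import-Module ServerManager

# Show whether the Failover Clustering feature is already installed.
Get-WindowsFeature -Name Failover-Clustering

# Install the feature if it is missing. Exchange installs it automatically
# when the server is added to the DAG, so this step is optional.
Add-WindowsFeature -Name Failover-Clustering
```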
Figure 8 illustrates the EMC Wizard adding a new server to a DAG.
Exchange has to do quite a lot of work to bring a server into a DAG,
especially if it has to install Windows Failover Clustering, so this process does
not happen quickly. The following steps occur to instantiate a DAG when
the first mailbox server is added:
Exchange validates that the server has the mailbox server role installed and does not host the FSW resource for the DAG.
If not present, Exchange installs Windows Failover Clustering on the mailbox server.
A failover cluster is created using the name of the DAG.
A cluster name object (CNO) is created in the default Computers container in Active Directory (Figure 9).
The name and IP address of the DAG is registered in DNS as a Host (A) record.
The mailbox server is linked to the DAG object by populating the MSExchMDBAvailabilityGroupLink property on the server object in Active Directory.
The cluster database is updated with information about the databases hosted by the newly added server. These databases remain as standalone active copies until you create additional passive copies through replication to other servers in the DAG.
Although it is easiest to add servers using EMC, you can also do this through EMS. For example:
Add-DatabaseAvailabilityGroupServer -Identity 'DAG-Dublin' -MailboxServer 'ExServer2'
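After the command completes, you might confirm the result from EMS. The sketch below reuses the example names 'DAG-Dublin' and 'ExServer2' from the command above; the property list is simply a convenient selection, not a required one.

```powershell
# -Status asks Exchange to query the cluster for real-time values such as OperationalServers.
Get-DatabaseAvailabilityGroup -Identity 'DAG-Dublin' -Status |
    Format-List Name, Servers, OperationalServers, WitnessServer, WitnessDirectory

# Confirm the link from the server side: DatabaseAvailabilityGroup should now name the DAG.
Get-MailboxServer -Identity 'ExServer2' |
    Format-List Name, DatabaseAvailabilityGroup

# Removal uses the matching cmdlet. It is commented out here because it changes
# DAG membership, and it only succeeds once the server hosts no replicated database copies.
# Remove-DatabaseAvailabilityGroupServer -Identity 'DAG-Dublin' -MailboxServer 'ExServer2'
```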
When you add additional mailbox servers to the DAG, Exchange does the following:
Validates that the server has the mailbox role installed.
Joins the server to the cluster.
Adjusts the quorum model. A node majority model is used for DAGs with an odd number of members, whereas a node and file share majority model is used for DAGs with an even number of members. The quorum model is automatically adjusted as servers join and leave the DAG, including when they are taken offline for maintenance or suffer a failure. The adjustment occurs in the background and does not require any administrator intervention.
Links the server to the DAG object in Active Directory.
Updates the cluster database with information about the databases hosted by the newly added server.
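The odd/even rule in the quorum step can be expressed as a small helper and compared against what the cluster reports. This is an illustrative sketch only: Get-ExpectedQuorumModel is a hypothetical function written for this example, 'DAG-Dublin' is the example DAG name, and the FailoverClusters module is assumed to be available (Windows Server 2008 R2 with the Failover Clustering feature installed). Everything here is read-only.

```powershell
# Hypothetical helper: returns the quorum model the text describes for a given member count.
function Get-ExpectedQuorumModel {
    param(
        [Parameter(Mandatory = $true)]
        [ValidateRange(1, 16)]   # an Exchange 2010 DAG holds between 1 and 16 members once populated
        [int]$MemberCount
    )
    if ($MemberCount % 2 -eq 0) { 'NodeAndFileShareMajority' } else { 'NodeMajority' }
}

# Count the members that Exchange records for the DAG.
$dag = Get-DatabaseAvailabilityGroup -Identity 'DAG-Dublin'
$expected = Get-ExpectedQuorumModel -MemberCount $dag.Servers.Count

# Read (never change) the quorum configuration of the underlying cluster.
Import-Module FailoverClusters
$actual = (Get-ClusterQuorum -Cluster 'DAG-Dublin').QuorumType

"Expected: $expected  Actual: $actual"
```

Because Exchange adjusts the quorum model itself, a check like this is useful only for observation; there should be no need to act on the cluster directly.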
Figure 10 shows details of a DAG with two member servers as viewed through the Windows
Failover Cluster Manager. The networks have been configured
automatically using DHCP and the cluster is configured to use a node and
file share majority quorum (because the cluster is formed by an even
number of servers). In this case, the FSW is hosted on a server that
doesn’t run Exchange, so we had to add the Exchange Trusted Subsystem to
the local Administrators group on the server before using it to host
the FSW. As you can see, there is no obvious indication that the FSW is
on a non-Exchange server.
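Adding the group on a non-Exchange witness server can be done from an elevated command line on that server. In the sketch below, CONTOSO is a placeholder for your own domain's NetBIOS name and is not taken from the text.

```powershell
# Run in an elevated session on the server that will host the FSW.
# 'CONTOSO' is a placeholder domain name; substitute your own.
net localgroup Administrators "CONTOSO\Exchange Trusted Subsystem" /add

# List the members of the local Administrators group to confirm the change.
net localgroup Administrators
```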
Although a DAG is visible to
the Failover Cluster Manager, you should never attempt to manage any of
the resources used by the DAG through this console. If you do, don’t
expect much sympathy from Microsoft Support if one of your changes
compromises the integrity of the DAG. Exchange stores many important
properties for a DAG in Active Directory and the only way that you can
manipulate DAG settings properly is through EMC or EMS. If necessary,
the underlying code in the DAG management cmdlets will update the
settings of the Windows cluster.
I’ve already stated that
the more developed state of Windows Server 2008 R2 makes it my
preferred platform for Exchange 2010. Development occurs over time in
response to real-life operational experience and Windows clustering is
no different. An example is the design change in Windows Server 2008 R2
to handle situations where clusters could enter a lost quorum
state because the FSW resource was in a failed state even though the
witness directory was available. Microsoft provides a retrofit update in
KB978790 that you can apply to Windows 2008 servers. The change kicks
in when the cluster determines that it is necessary to use the FSW to
maintain quorum. If the FSW resource is failed, the cluster attempts to
kickstart it back into action by bringing the resource online. If the
server that hosts the FSW is available, it can respond and report that
the FSW is available and accessible and the cluster can maintain quorum.
However, if the FSW cannot be brought online, the cluster is in a lost
quorum condition that has to be resolved by an administrator.
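If some DAG members still run Windows Server 2008, one way to see whether the update is present is to query each server with Get-HotFix. This is a sketch under assumptions: the server names are hypothetical, and Get-HotFix reports only updates that Windows records in its hotfix inventory, so "not found" is a prompt to check further rather than proof that the fix is absent.

```powershell
# Hypothetical member server names; replace with your own Windows Server 2008 DAG members.
$servers = 'ExServer1', 'ExServer2'

foreach ($server in $servers) {
    # SilentlyContinue suppresses the error that Get-HotFix raises when the update is not listed.
    $fix = Get-HotFix -Id 'KB978790' -ComputerName $server -ErrorAction SilentlyContinue
    if ($fix) {
        "$server : KB978790 installed"
    } else {
        "$server : KB978790 not found"
    }
}
```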
Before a server can join a DAG,
it must be able to communicate with the cluster service running on
every other member server that is currently in the DAG. In other words,
you cannot expect to populate a DAG if some servers are offline or
experiencing network problems. There are a number of reasons why
communication might not be possible, including the following:
Servers are powered off or otherwise unavailable.
The cluster service is not running on a server.
Firewall rules on a server are blocking communication to the cluster service.
The DNS service is unavailable.
Authentication problems (Kerberos, Active Directory, or NTLM) are interfering with secure server-to-server communications.
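Several of these conditions can be checked before you attempt to add a server. The following sketch, run from EMS, walks the current members of the example DAG 'DAG-Dublin' and reports name resolution, basic reachability, and the state of the Cluster service. It does not test firewall rules for cluster traffic or authentication, and a server that blocks ICMP will show as unreachable even when it is healthy.

```powershell
$dag = Get-DatabaseAvailabilityGroup -Identity 'DAG-Dublin'

foreach ($member in $dag.Servers) {
    $name = $member.Name

    # DNS: can the server name be resolved at all?
    $resolves = $true
    try { [void][System.Net.Dns]::GetHostAddresses($name) } catch { $resolves = $false }

    # Reachability: a single ICMP echo request.
    $pings = Test-Connection -ComputerName $name -Count 1 -Quiet

    # Cluster service: ClusSvc is the Windows service name of the Cluster service.
    $service = Get-Service -ComputerName $name -Name 'ClusSvc' -ErrorAction SilentlyContinue
    $clusterState = if ($service) { $service.Status } else { 'Unknown' }

    # Emit one result object per member server.
    New-Object PSObject -Property @{
        Server         = $name
        ResolvesInDns  = $resolves
        RespondsToPing = $pings
        ClusterService = $clusterState
    }
}
```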
Production servers should have two network
interface cards (NICs) so that they can separate MAPI traffic
(interaction with other servers including CAS and Active Directory) from
replication traffic (log shipping and database seeding). This is not a
hard technical requirement from an Exchange perspective because it is
perfectly feasible to have all traffic routed across a single NIC, assuming that the NIC has sufficient capacity to handle the network traffic (Microsoft recommends Gigabit Ethernet for single NICs).
However, Windows Failover Clustering also has a dependency on a solid
network and because dependable replication is so important to the smooth
operation of the DAG, it’s really best if you deploy servers equipped
with dual NICs for DAGs. Additional replication networks can be added as
required and you can take advantage of techniques such as NIC teaming
to improve overall network resilience against failure. You can also consult TechNet to determine how best to configure network settings for DAG operations.
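To see how Exchange has mapped the NICs into DAG networks, you can list them from EMS. The sketch below uses the example DAG name 'DAG-Dublin'; the network name DAGNetwork01 in the commented line is an assumption based on the names Exchange typically assigns, so check the output of the first command before using it.

```powershell
# List the DAG networks with their subnets and the roles they carry.
Get-DatabaseAvailabilityGroupNetwork -Identity 'DAG-Dublin' |
    Format-List Name, Subnets, Interfaces, MapiAccessEnabled, ReplicationEnabled

# Commented out because it changes configuration: stop using the MAPI network
# for replication so that log shipping and seeding prefer the second network.
# 'DAGNetwork01' is an assumed name; confirm it from the listing above.
# Set-DatabaseAvailabilityGroupNetwork -Identity 'DAG-Dublin\DAGNetwork01' -ReplicationEnabled:$false
```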