Exchange Server 2010 : Day-to-day DAG management and operations (part 1)

10/3/2011 9:12:44 AM

Setting up a new DAG is not something to do on a whim, nor is it something that you should attempt unless you have sufficient knowledge of all of the technologies that are involved. These include:

Exchange 2010
Windows Server 2008 and Failover Clustering
Networks

Exchange hides most of the details of Windows Failover Clustering from the administrator when it builds and maintains a DAG. You should never have to look at the cluster details unless a problem occurs and you have to investigate, probably under the direction of Microsoft Support. Even so, it’s still a good idea to know the nature of the beast with which you are dealing, if only at a reasonably high level.

In addition, the account that you use needs to have sufficient permissions to be able to install all of the necessary components or to make changes to the Exchange organization or the Windows system. Setting up a new DAG for a test environment is an interesting exercise in the application of a combination of computer software and hardware to solve a problem and it’s something that you need to go through several times before you deploy a DAG into production.

When that time comes, you need to understand what form the DAG will take and what purpose it will serve. For example, important considerations include what servers will participate in the DAG, how the servers will communicate to replicate transaction logs, what mailbox databases will be replicated to what servers, where the FSW for the DAG is located, and how the overall design will operate under different circumstances such as an individual or multiserver outage. These details should be written down and validated before any step is taken to create a production-quality DAG.

Assuming that all the up-front work is done, you can create a new DAG by running the New Database Availability Group Wizard through EMC or by using the New-DatabaseAvailabilityGroup cmdlet through EMS. Either approach works well, the only major difference being that a DAG created through EMC will use Dynamic Host Configuration Protocol (DHCP) IP addresses for its networks, whereas you have the option to provide static IP addresses for a DAG created with EMS. At least, that’s the way things work in the original version of Exchange 2010.

Microsoft’s thought process was that EMC is designed to provide a very simple administrative experience to encourage use of DAG features in even the smallest sites, so they decided to remove some complexity from the DAG creation wizard. You can always assign static IP addresses for DAG communications by updating the DAG properties after it is created. Administrators protested and wanted maximum flexibility through EMC, too, so Microsoft changed EMC in SP1 to allow the assignment of static IP addresses there as well.

You only have to provide a name for the DAG when you use the wizard to create a new DAG. The name must be unique in the forest and can be up to 15 characters in length. You can also complete the fields to specify the names of the witness server and the directory on the server to hold the share. However, if you leave these fields blank, the wizard will fill them in by searching for a hub transport server in the local site that doesn’t have the mailbox role installed and then create the default directory and share on that server to use.

Figure 1 illustrates the creation of a new DAG through EMC. The operation is extremely simple and the only likely mistake at this stage is the placement of the FSW on a server that might be part of the DAG in the future. If you make a mistake at this point, you can simply update the properties of the DAG to reassign the FSW to a different server (Figure 2).

Figure 1. Creating a new DAG with EMC.

As part of creating the DAG on an Exchange 2010 server, you have to provide a universal naming convention (UNC) share path and directory for use by the FSW for the cluster. This directory can be on a Windows Server 2003 or Windows Server 2008 (SP2 or R2) server and Exchange will create the physical directory as it sets up the DAG. The server that hosts the FSW must be in the same forest as the DAG, but it cannot be a member of the DAG. You can specify a UNC and directory for an alternate file share for the cluster to use if the server that hosts the primary FSW is unavailable.

Figure 2. Viewing DAG properties.

Exchange 2010 SP1 is more flexible and you can do the following:

Only specify the name for the DAG: This forces Exchange to look for a hub transport server in the site that doesn’t have the mailbox role installed. If a suitable server is found, Exchange creates a default directory and file share for the FSW on that server.
Specify the name of the DAG and the directory to use for the FSW: The same search for a hub transport server occurs and Exchange creates the directory and the file share there.
Specify the name of the DAG and the FSW server: Exchange creates a default directory for the FSW on the specified server.
Specify all the parameters for the DAG: Exchange creates the FSW directory with the specified name on the target server.
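
The four options above correspond to parameter combinations of the New-DatabaseAvailabilityGroup cmdlet. The following is a minimal sketch rather than a prescribed procedure. The DAG names, the server name ExHT1, and the directory paths are hypothetical placeholders, and each command creates a separate DAG, so you would run only the form that you need.

```powershell
# 1. Name only: Exchange looks for a hub transport server in the site that
#    doesn't have the mailbox role and creates a default directory and share.
New-DatabaseAvailabilityGroup -Name 'DAG1'

# 2. Name plus directory: the same server search occurs, but the FSW
#    directory is created with the name given here (hypothetical path).
New-DatabaseAvailabilityGroup -Name 'DAG2' -WitnessDirectory 'C:\DAGWitness\DAG2'

# 3. Name plus witness server: a default directory is created on ExHT1
#    (hypothetical server name).
New-DatabaseAvailabilityGroup -Name 'DAG3' -WitnessServer 'ExHT1'

# 4. All parameters: the named directory is created on the named server.
New-DatabaseAvailabilityGroup -Name 'DAG4' -WitnessServer 'ExHT1' -WitnessDirectory 'C:\DAGWitness\DAG4'
```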

Note:

A single server can host the FSW for multiple DAGs as long as each DAG is assigned its own unique directory. Microsoft recommends locating the FSW on a hub transport server in the same Active Directory site that hosts the DAG as the easiest way to meet these requirements (a DAG can span multiple Active Directory sites, so in this instance you select a hub transport server in one of the sites to host the FSW). One good reason for placing the FSW on an Exchange server is that you can then be sure that an Exchange administrator will be able to manage all of the DAG components.

Because the FSW does not have to be placed on an Exchange server, EMC doesn’t monitor whether the FSW is available. It is also entirely possible for an administrator to remove or wipe a server that is acting as the FSW for one or more DAGs with no complaint from Exchange until the next time that a DAG needs to use the FSW. At that time, Exchange will discover that the server is no longer available and any command that depends on the FSW will fail. Fortunately, the simple fix is to edit the DAG properties with EMC or use the Set-DatabaseAvailabilityGroup cmdlet to move the FSW to another server. In scenarios that stretch a DAG across multiple datacenters, you should locate the FSW in the primary datacenter so that a network interruption between the primary and secondary datacenter doesn’t trigger a failover or halt DAG operations.
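
As an illustration of the fix described above, the following sketch moves the FSW to another server through EMS and then reads back the stored values. The server name ExHT2 and the directory path are hypothetical; DAG-Dublin is the DAG name used in the example later in this section.

```powershell
# Point the DAG at a new witness server and directory
# (ExHT2 and the path are hypothetical placeholders).
Set-DatabaseAvailabilityGroup -Identity 'DAG-Dublin' -WitnessServer 'ExHT2' -WitnessDirectory 'C:\DAGWitness\DAG-Dublin'

# Confirm the values now stored for the DAG.
Get-DatabaseAvailabilityGroup -Identity 'DAG-Dublin' | Format-List Name, WitnessServer, WitnessDirectory
```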

If you opt to place the FSW on a server that doesn’t have Exchange 2010 installed, you should add the Exchange Trusted Subsystem universal security group (USG) to the local Administrators group on the target server before you create the DAG. This step ensures that Exchange is able to create and manage the directory and share on that server. You do not have to take any other step to allow Exchange to manage the FSW. Some commentators have blogged that you should add the machine account for the server that hosts the FSW to the Exchange Trusted Subsystem group. Such advice should be ignored because you have to take extreme care when you allow access to the Exchange Trusted Subsystem group. This group permits access to any Exchange object in Active Directory and you need to keep it as restricted as possible.

Because creating a DAG really only creates an object in Active Directory to prepare for servers to join the DAG, Exchange can still create the DAG object even if it cannot create the FSW on the nominated server. In this case you’ll see an error similar to that shown in Figure 3. To fix the problem of an uncreated FSW, first add the Exchange Trusted Subsystem USG to the local Administrators group, and then use the Set-DatabaseAvailabilityGroup cmdlet to update the DAG properties with the name of the server hosting the FSW and the directory where the FSW resource is located.
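
One way to add the group on the target server is the built-in net localgroup command, run from an elevated prompt on the server that will host the FSW. This is a sketch under assumptions: CONTOSO is a hypothetical domain name that you would replace with your own.

```powershell
# Run on the server that will host the FSW (not on an Exchange server).
# CONTOSO is a hypothetical domain name used only for illustration.
net localgroup Administrators "CONTOSO\Exchange Trusted Subsystem" /add

# List the members of the local Administrators group to confirm the change.
net localgroup Administrators
```

Note that this adds the Exchange Trusted Subsystem group to the server's local Administrators group. It does not add anything to the Exchange Trusted Subsystem group itself.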

Figure 3. Error creating the FSW during DAG creation.

You’ll see a similar warning about the server hosting the FSW not being an Exchange server even if Exchange is able to create the FSW on a server where the Exchange Trusted Subsystem has been added to the local Administrators group (Figure 4). Providing that this is what you intended, you can ignore this warning.

Figure 4. DAG created, but with a warning.

INSIDE OUT: Create a firewall rule for remote management

Firewalls can get in the way of communications—even legitimate communications—so, if the Windows firewall is enabled on the target server, you should create a firewall rule to allow remote management so as to avoid RPC access errors during DAG communications. A rule such as the one shown here is appropriate:

netsh advfirewall firewall set rule group="Remote Administration" new enable=yes

You can also create a DAG through EMS. The following example creates a new DAG called DAG-Dublin and assigns a static IP address to the DAG. As discussed earlier, you don’t have to register an IP address at this point and indeed, if you create a DAG with EMC, no IP address is registered and Exchange leases an address using DHCP (Microsoft does not support Automatic Private IP Addressing [APIPA] for DAG networks). You can assign an IPv6 address for the DAG, but only if you simultaneously assign an IPv4 address. IPv6 is still unknown territory for most Windows administrators, though, so you should probably plan on using IPv4 only for your DAG deployments until the IPv6 world is a bit more mature.

Windows Failover Clustering registers the IP address as a network resource for the Windows cluster resource that underpins the DAG. Essentially, you can think of the IP resource as a pointer that identifies the DAG in DNS. As we’ll see later on, one IP address is enough for a DAG that spans a single subnet. If you want to add servers to the DAG that span different subnets, you will have to update the DAG object with the Set-DatabaseAvailabilityGroup cmdlet to add new IP addresses for each of the subnets that you want to use. Most companies prefer to use static IP addresses for server resources, so that’s what we will do:

New-DatabaseAvailabilityGroup -Name 'DAG-Dublin' -DatabaseAvailabilityGroupIPAddresses 192.165.1.8

If you don’t assign an IP address or set the value to 0.0.0.0, Exchange will attempt to lease an address using DHCP. If you use DHCP, Exchange will report the IP address as 0.0.0.0 any time that you look at the DAG properties with the Get-DatabaseAvailabilityGroup cmdlet.
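
For a DAG whose members span more than one subnet, the Set-DatabaseAvailabilityGroup cmdlet can add an address for each subnet. The sketch below assumes a second subnet in which 10.0.1.8 is a free address (a made-up value for illustration). It also assumes that the parameter sets the complete address list, which is why the existing address is listed again alongside the new one.

```powershell
# Give the DAG one static address per subnet
# (10.0.1.8 is a hypothetical address on a second subnet).
Set-DatabaseAvailabilityGroup -Identity 'DAG-Dublin' -DatabaseAvailabilityGroupIpAddresses 192.165.1.8, 10.0.1.8

# Alternatively, setting the value to 0.0.0.0 asks Exchange to lease an address via DHCP.
Set-DatabaseAvailabilityGroup -Identity 'DAG-Dublin' -DatabaseAvailabilityGroupIpAddresses 0.0.0.0
```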

Note:

Get-DatabaseAvailabilityGroup only reports the static properties that are stored in Active Directory. If you want to see the full properties for the DAG, including the properties maintained by Exchange, use Get-DatabaseAvailabilityGroup -Status.
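
To see the difference, compare the output of the two forms. This is a minimal sketch; the property names selected in the second command are a small subset chosen for illustration, and the run-time ones are expected to be empty when -Status is omitted.

```powershell
# Static properties only, as stored in Active Directory.
Get-DatabaseAvailabilityGroup -Identity 'DAG-Dublin' | Format-List

# Include run-time information such as which servers are currently operational.
Get-DatabaseAvailabilityGroup -Identity 'DAG-Dublin' -Status |
    Format-List Name, Servers, WitnessServer, OperationalServers, PrimaryActiveManager
```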

DAGs use Majority Node Set clusters, as do Exchange 2007 CCR clusters. This means that more than half of the votes that exist within the cluster must be available for the cluster to run. Each server node has a vote, as does the FSW. The cluster adjusts itself automatically to maintain quorum as member servers join and leave. After the new DAG is created, it is an empty cluster with no member servers, so the cluster operates in node majority quorum mode with the only vote belonging to the FSW. This is okay as one valid vote out of one is available. As you add mailbox servers to the DAG, the cluster automatically changes to use a node and file share majority quorum and the DAG begins to use the witness server to maintain the quorum. Inside a fully formed DAG that has all its member servers online, the vote of the FSW is not required to maintain quorum. However, if member servers fail or are taken offline, the vote of the FSW becomes more important. For example, a four-node DAG has five votes (four member servers plus the FSW), so the quorum is three (more than 50 percent of the votes). If half the servers fail, the vote of the FSW is needed to maintain quorum and to keep the cluster online.

When it is created, the new DAG is also represented as an empty object in Active Directory (Figure 5). We’ll see how server objects are linked to the DAG object as you add servers to build out the DAG.
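
The quorum arithmetic described above can be checked with a few lines of plain PowerShell. This is an illustrative model only, not something Exchange or the cluster service exposes: the function names are invented for this sketch, and it assumes the simple rule from the text that every member server and the FSW hold one vote each.

```powershell
# Hypothetical helper: smallest number of votes that is a strict majority.
function Get-QuorumThreshold {
    param([int]$TotalVotes)
    return [math]::Floor($TotalVotes / 2) + 1
}

# Hypothetical helper: does the DAG keep quorum with the given votes online?
function Test-Quorum {
    param(
        [int]$MemberServers,   # servers that belong to the DAG
        [int]$ServersOnline,   # member servers currently reachable
        [bool]$WitnessOnline   # whether the FSW can cast its vote
    )
    $totalVotes  = $MemberServers + 1               # one vote per server plus the FSW
    $votesOnline = $ServersOnline + [int]$WitnessOnline
    return $votesOnline -ge (Get-QuorumThreshold -TotalVotes $totalVotes)
}

# Four-node DAG with half the servers down: the FSW vote decides the outcome.
Test-Quorum -MemberServers 4 -ServersOnline 2 -WitnessOnline $true
Test-Quorum -MemberServers 4 -ServersOnline 2 -WitnessOnline $false
```

With two of four servers down, the first call models the FSW casting the deciding third vote, and the second models the loss of quorum when the FSW is also unreachable.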

If you’re in a situation where active users are spread across two datacenters, it is better to run two DAGs rather than attempt to run a single DAG with servers in both datacenters. The logic is that a DAG has a single FSW, so if you operate a single DAG, its FSW must be located in one of the datacenters, which in turn creates a potential single point of failure for the users located on servers in the other datacenter. Any network outage that removes access to the FSW from the other datacenter creates a condition where the FSW is inaccessible and cannot be used to maintain quorum. It is therefore better to create two DAGs and locate an FSW in each datacenter.

Figure 5. Viewing a DAG object in Active Directory.

Microsoft’s preferred approach is that you use its continuous replication technology to maintain database copies within a DAG. Microsoft also supports a replication API that allows other vendors to build DAG solutions with their own feature sets. Assuming the third-party solution is installed and available, these DAGs are created with the New-DatabaseAvailabilityGroup cmdlet using the -ThirdPartyReplication parameter. Once a DAG is created based on a third-party solution, it cannot be changed. If you want to revert to using the Exchange technology, you have to remove all servers from the DAG, remove the DAG, and then re-create it.
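
A sketch of what such a command might look like follows. The DAG name is a hypothetical placeholder, and the vendor's own documentation should take precedence over this example, because third-party solutions may impose additional requirements.

```powershell
# Create a DAG that uses a third-party replication solution instead of
# Exchange continuous replication ('DAG-3P' is a hypothetical name).
# This choice cannot be changed after the DAG is created.
New-DatabaseAvailabilityGroup -Name 'DAG-3P' -ThirdPartyReplication Enabled
```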

Note:

See the Microsoft Web site for a current list of approved third-party replication solutions for Exchange 2010.