Windows Server 2003 : Active Directory - Understanding Directory Replication (part 1) - Time Synchronization, Replication Topologies, Handling Update Conflicts

9/18/2012 7:15:33 PM

At its foundation, the replication process is simply an effort to keep the copy of the Active Directory database identical on all domain controllers for a particular domain. For example, if an administrator removes a user from a group, the change is made on the domain controller that the administrator is currently logged into. For those few seconds after the change, that domain controller alone has the most current copy of the database. Eventually, though, after replication takes place, all domain controllers will have exact replicas of the database, including the change in group membership.

1. Within a Site: Loops and Meshes

Active Directory replicates information between domain controllers using different methods, depending on the topology of your network, in particular how many sites you have configured within Active Directory. In a single-site situation, all domain controllers in a domain will discover each other through published records in both Active Directory and DNS for the domain. But to cut down on network traffic, not every domain controller needs to actually replicate with every other domain controller. Active Directory uses a "loop" method. Take, for instance, four domain controllers (A, B, C, and D), as shown in Figure 1.

Figure 1. Looking at all replication topologies in a forest

In this example, Active Directory will replicate using two loops. Let's assume that a change was made on domain controller A. A will tell B and C that it has new information, and eventually B and C will ask A for that information. Once the information is received, both B and C will attempt to tell D about their new information. D will ask for the new information from the first domain controller that reaches it (there isn't a good way to determine whether that would be server B or C in our case), but when the second "message" announcing new information arrives, server D will simply respond, acknowledging that it already has that information. That will be the end of the transmissions because all domain controllers now have the most up-to-date information. In contrast, consider using only one loop and not two. In that case, A would tell B, B would tell C, C would tell D, and D would tell A again. That doesn't happen. With two loops, news spreads more quickly and network traffic is reduced, making the entire process more efficient.

In fact, this entire process triggers every five minutes, and if there's new information, the process will engage. If there is no new information, the domain controllers won't transmit anything; however, if 60 minutes pass without any new information, each domain controller will send a message to its partners, making sure there's no new information.
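To make the four-controller walk-through concrete, here is a minimal Python sketch that counts how many notification hops each domain controller is from the one where the change was made. The partner tables and the `hops_from` function are illustrative assumptions, not anything Active Directory exposes; the sketch only models who tells whom.

```python
from collections import deque

# Two-loop layout from the example: A partners with B and C, which both partner with D.
TWO_LOOPS = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}

# Hypothetical single loop for comparison: A tells B, B tells C, C tells D, D tells A.
ONE_LOOP = {
    "A": ["B"],
    "B": ["C"],
    "C": ["D"],
    "D": ["A"],
}


def hops_from(origin, partners):
    """Return the number of notification hops from origin to every controller."""
    hops = {origin: 0}
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        for partner in partners[current]:
            # A controller that already has the change just acknowledges it.
            if partner not in hops:
                hops[partner] = hops[current] + 1
                queue.append(partner)
    return hops


print(hops_from("A", TWO_LOOPS))  # D is two hops from A
print(hops_from("A", ONE_LOOP))   # D is three hops from A
```

With two loops, D hears about the change after two hops; in the single-loop layout it would take three.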

In simple networks, you usually find each domain controller has two replication partners. However, in more complex environments, domain controllers can have more than two partners. To see a domain controller's replication partners, open Active Directory Sites and Services, expand the site in question in the left pane, and expand each domain controller's node. Click NTDS Settings in the left pane, and in the right pane, note the two servers listed in the From Server column.

You might wonder how this loop is designed. The Knowledge Consistency Checker, or KCC, wakes up approximately every 15 minutes and tries to detect changes in its idea of how many domain controllers there are and where they're located. The KCC will look at any changes that have occurred—you might have taken a domain controller offline for maintenance, for example, or even added a new domain controller for load control purposes. Then it adjusts the loop for best performance.

In larger sites, the KCC might find it necessary to add more than two replication partners for each domain controller, or it might do so for traffic control purposes. In still larger sites, even those with only two replication partners per domain controller, it can take more than three hops to transmit replication information completely. The KCC looks for this situation and, if it detects it, simply adds more links between domain controllers, changing the simple "loop" structure into more of a "mesh" structure.
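The three-hop limit can be illustrated with a small Python sketch. This is not the KCC's actual algorithm; it is a simplified model, assuming a site of eight controllers numbered 0 through 7, that measures the worst-case hop count in a plain loop and again after two hand-picked extra links are added.

```python
from collections import deque


def ring(count):
    """Build a loop in which each controller partners with its two neighbors."""
    return {i: {(i - 1) % count, (i + 1) % count} for i in range(count)}


def add_link(partners, a, b):
    """Add a two-way replication link between controllers a and b."""
    partners[a].add(b)
    partners[b].add(a)


def max_hops(partners):
    """Return the largest number of hops between any two controllers."""
    worst = 0
    for origin in partners:
        hops = {origin: 0}
        queue = deque([origin])
        while queue:
            current = queue.popleft()
            for partner in partners[current]:
                if partner not in hops:
                    hops[partner] = hops[current] + 1
                    queue.append(partner)
        worst = max(worst, max(hops.values()))
    return worst


site = ring(8)
print(max_hops(site))  # 4: a plain loop of eight exceeds three hops

# Hypothetical extra links across the loop turn it into a small mesh.
add_link(site, 0, 4)
add_link(site, 2, 6)
print(max_hops(site))  # 3
```

In this model a loop of seven controllers still fits within three hops, while a loop of eight does not until extra links are added.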

2. Time Synchronization

For replication to function properly, it is crucial for all domain controllers in a domain and/or forest to be in sync in terms of the current time. The reason points to Kerberos, the underlying authentication scheme for the entire Active Directory: if any domain controller is more than five minutes out of synchronization, authentication will fail.
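The five-minute rule amounts to a simple comparison of two clocks. The following Python sketch shows the idea; the function name `within_tolerance` and the sample times are made up for illustration and are not part of Kerberos or Windows.

```python
from datetime import datetime, timedelta, timezone

# Maximum clock difference described in the text.
MAX_SKEW = timedelta(minutes=5)


def within_tolerance(clock_a, clock_b, max_skew=MAX_SKEW):
    """Return True if the two clocks differ by no more than max_skew."""
    return abs(clock_a - clock_b) <= max_skew


reference = datetime(2004, 6, 30, 17, 30, tzinfo=timezone.utc)
print(within_tolerance(reference, reference + timedelta(minutes=4)))  # True
print(within_tolerance(reference, reference - timedelta(minutes=6)))  # False
```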

The Windows Time Service is the tool Microsoft provides to consistently keep your entire domain or forest at the same moment in time. Windows Time Service offers a hierarchy for members of Active Directory domains and forests, with the machine holding the PDC emulator operations master role being the "big kahuna" of sorts, holding the trusted time. The trusted time at the very top level does not need to be synchronized from anywhere—synchronization matters only within the domain, as all members must think it is the same time, regardless of whether that time is the actual time. In other words, everyone has to be the same, but it doesn't matter if everyone is wrong.

From the bottom up, the rest of the hierarchy looks something like this:

Workstations and servers that are not domain controllers will synchronize their time with the domain controller that logged them in.
Domain controllers will contact the domain controller for their domain with the PDC emulator operations master role for the current time.
Each domain in a forest with multiple domains will look to the PDC emulator operations master-holding domain controller in the forest root—the first domain in the forest—to keep the other PDC emulator domain controllers in other domains in check.
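The hierarchy above can be read as a lookup rule: given a machine's role, where does it get its time? Here is a rough Python sketch of that rule. The role labels, the dictionary layout, and the domain names are all hypothetical and exist only for this example.

```python
def time_source(machine, forest_root_domain):
    """Describe where a machine gets its time, following the hierarchy in the text."""
    role = machine["role"]
    domain = machine["domain"]
    if role == "member":
        # Workstations and non-DC servers use the DC that logged them in.
        return "authenticating domain controller in " + domain
    if role == "dc":
        return "PDC emulator of " + domain
    if role == "pdc_emulator":
        if domain == forest_root_domain:
            # Top of the hierarchy: holds the trusted time.
            return "external time source or its own clock"
        return "PDC emulator of " + forest_root_domain
    raise ValueError("unknown role: " + role)


root = "corp.example.com"
print(time_source({"role": "member", "domain": "sales.corp.example.com"}, root))
print(time_source({"role": "dc", "domain": "sales.corp.example.com"}, root))
print(time_source({"role": "pdc_emulator", "domain": "sales.corp.example.com"}, root))
print(time_source({"role": "pdc_emulator", "domain": root}, root))
```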

You can synchronize the domain controller holding the PDC emulator operations master role from the command line. First, though, you must choose a time source. Microsoft provides the host time.windows.com, which is synchronized to the U.S. Army's Atomic Clock, which is as good a choice as any. Once you have selected a time source, run the following from the command line of the PDC emulator domain controller:

    net time /setsntp:<TIMESOURCE>

Replace <TIMESOURCE> with the full DNS name of the time source you have selected. For example, if I were using time.windows.com as my time source, I'd run:

    net time /setsntp:time.windows.com

Once you have set the time source for the PDC emulator domain controller, it will attempt to synchronize its time with the time source. It will try once every 45 minutes until it has successfully synchronized three times in a row. Once it has done so, it pings the time server only once every eight hours. If you want to trigger time synchronization manually, run:
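The polling schedule described above can be sketched in Python as follows. One detail is an assumption on my part: the text doesn't say what happens when a synchronization fails after the long interval has been reached, so this sketch simply resets the streak and falls back to 45 minutes. The function name is hypothetical.

```python
from datetime import timedelta

SHORT_INTERVAL = timedelta(minutes=45)  # used until three successes in a row
LONG_INTERVAL = timedelta(hours=8)      # used once the clock is considered stable


def poll_intervals(outcomes):
    """Return the wait before the next attempt after each sync outcome.

    outcomes is a sequence of booleans (True = successful synchronization).
    Assumption: any failure resets the streak of consecutive successes.
    """
    streak = 0
    intervals = []
    for succeeded in outcomes:
        streak = streak + 1 if succeeded else 0
        intervals.append(LONG_INTERVAL if streak >= 3 else SHORT_INTERVAL)
    return intervals


for wait in poll_intervals([True, False, True, True, True, True]):
    print(wait)
```

In this run the first four waits are 45 minutes, and only after the third consecutive success does the wait stretch to eight hours.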

    w32tm /resync

The Windows Time Service requires outbound UDP port 123 to be open on your firewall for time synchronizations to occur.

Time zones also play a role. Windows operates internally at Greenwich Mean Time, and although each server can be set to a different time zone depending on either its physical location or the location of the administrator who manages the box, within Windows itself the current time is translated to GMT. Be wary of this, and ensure that time zones are set correctly on all your servers. The object is to get Windows' internal clocks to synchronize—even though the time might seem right to the naked eye, if the time zone is set incorrectly, Windows is unforgiving when it comes to Active Directory operations in that state.
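To see why a wrong time zone matters even when the wall clock looks right, consider this short Python sketch. The offsets are fixed, hypothetical values chosen for illustration (UTC-7 for Los Angeles, UTC+10 for Sydney), and the helper `as_utc` is not a Windows function.

```python
from datetime import datetime, timedelta, timezone


def as_utc(year, month, day, hour, minute, utc_offset_hours):
    """Convert a server's wall-clock time and zone offset to UTC."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    local = datetime(year, month, day, hour, minute, tzinfo=zone)
    return local.astimezone(timezone.utc)


# Two correctly configured servers showing different wall-clock times.
los_angeles = as_utc(2004, 6, 30, 10, 30, -7)
sydney = as_utc(2004, 7, 1, 3, 30, 10)
print(los_angeles == sydney)  # True: the same instant internally

# Same wall clock as Los Angeles, but the zone is mistakenly set to UTC-4.
wrong_zone = as_utc(2004, 6, 30, 10, 30, -4)
print(los_angeles - wrong_zone)  # 3:00:00, far beyond a five-minute tolerance
```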

3. Replication Topologies

Loops and meshes are just two examples of what Microsoft terms replication topologies: essentially, maps of the ways domain controllers replicate to each other. And to confuse things, almost always, more than one replication topology exists simultaneously within any particular forest. Let's take a closer look at that.

Four types of data need to be replicated among domain controllers:

Updates that stay within a particular domain—username and password changes, and other user account information
Updates to the schema naming context and configuration naming context, which are common to all domains within a forest
Updates to the GC, which replicate to all domain controllers that function as GC servers
Updates to DNS partitions and custom application partitions
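One way to picture the four kinds of data is as four different audiences among the same set of domain controllers. The Python sketch below models that idea. Everything in it is hypothetical: the controller names, the dictionary fields, and the assumption that an application partition replicates only to the controllers that host a copy of it.

```python
CONTROLLERS = [
    {"name": "DC1", "domain": "corp", "gc": True, "partitions": {"DomainDnsZones"}},
    {"name": "DC2", "domain": "corp", "gc": False, "partitions": set()},
    {"name": "DC3", "domain": "sales", "gc": True, "partitions": {"DomainDnsZones"}},
    {"name": "DC4", "domain": "sales", "gc": False, "partitions": {"InventoryApp"}},
]


def recipients(update, controllers):
    """Return the names of the controllers that should receive an update."""
    kind = update["kind"]
    if kind == "domain":
        chosen = [dc for dc in controllers if dc["domain"] == update["domain"]]
    elif kind == "forest":  # schema and configuration naming contexts
        chosen = list(controllers)
    elif kind == "gc":
        chosen = [dc for dc in controllers if dc["gc"]]
    elif kind == "application":  # DNS and custom application partitions
        chosen = [dc for dc in controllers if update["partition"] in dc["partitions"]]
    else:
        raise ValueError("unknown update kind: " + kind)
    return sorted(dc["name"] for dc in chosen)


print(recipients({"kind": "domain", "domain": "corp"}, CONTROLLERS))
print(recipients({"kind": "forest"}, CONTROLLERS))
print(recipients({"kind": "gc"}, CONTROLLERS))
print(recipients({"kind": "application", "partition": "InventoryApp"}, CONTROLLERS))
```

Each kind of update reaches a different subset of controllers, which is why a single map of replication links rarely describes everything.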

With many domain controllers in a forest, you can see where one replication topology might either not suffice or not be the most efficient way to transmit information between these selected subgroups of domain controllers. Figure 2 shows this scenario graphically.

Figure 2. Creating a new site link

The Active Directory Sites and Services console again comes to your aid if you want to try to piece together all these replication topologies for your environment. Open the console, expand the site in question in the left pane, and expand each domain controller's node. Click NTDS Settings in the left pane, and in the right pane double-click the "<automatically generated>" objects.

If you see <Enterprise Configuration> in one of the fields at the bottom of the screen indicating replicated naming contexts, it shows that that particular link replicates the schema and configuration naming contexts. In the Partially Replicated Naming Context field, if you see a server name, this indicates that your server is a GC server and is receiving updates from the GC server listed in the field. It is perfectly acceptable for this field to be empty on servers not acting as GCs.

4. Handling Update Conflicts

Replication is great in and of itself, but there is one major, inherent problem—each domain controller is using its own copy of the database and, no matter how often each copy is updated, for a few moments in time each copy is unaware of actions taken on other copies of the database around the network. How might this design situation manifest itself as a problem?

Consider a large site, with branch offices in Sydney, Australia; Boston; and Los Angeles. An employee, Robert Smith, is being transferred to the Sydney office from L.A. because of personnel reorganization. The company uses groups within Active Directory, SYDUSERS and LAUSERS, for distribution list purposes and other security boundary assignments. On Robert's last day in the L.A. office, his manager changes Robert's group membership, moving him from LAUSERS to SYDUSERS in anticipation of his transfer. The Los Angeles domain controller notes this change and creates records looking roughly like this:

    Object: LAUSERS
    Change: Remove RSMITH
    Version: 1
    Timestamp: 30 June 2004 5:30:01 PM GMT

    Object: SYDUSERS
    Change: Add RSMITH
    Version: 1
    Timestamp: 30 June 2004 5:30:02 PM GMT

Look closely at these records. They denote changes to attributes of objects, not changes to the entire object; in this case, the member list is an attribute of a particular group object. This is important for network traffic reduction reasons: if the LAUSERS group is composed of 2,000 members, it's good to transmit only the removal of RSMITH and not the entire membership list. Also, note the version numbers: the field is very simple and is designed to be used whenever domain controllers update an attribute for a particular object. Each time a change is made to a particular object attribute, the numeral in the version number field is incremented by 1. So, one object can have many version numbers, one for each of its attributes.
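Per-attribute version numbers can be modeled with a simple counter keyed by object and attribute. The Python sketch below is only an illustration; the class `LocalReplica` and the attribute names are made up, and real directory metadata carries more than a counter.

```python
from collections import defaultdict


class LocalReplica:
    """A toy model of one domain controller's copy of the directory."""

    def __init__(self):
        # One version counter per (object, attribute) pair, starting at 0.
        self.versions = defaultdict(int)

    def change(self, obj, attribute):
        """Record a change to one attribute and return its new version number."""
        self.versions[(obj, attribute)] += 1
        return self.versions[(obj, attribute)]


dc = LocalReplica()
print(dc.change("LAUSERS", "member"))       # 1
print(dc.change("LAUSERS", "member"))       # 2
print(dc.change("LAUSERS", "description"))  # 1: a different attribute, its own counter
print(dc.change("SYDUSERS", "member"))      # 1: a different object
```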

With that background out of the way, let's return to our fictional situation. Perhaps there was a miscommunication between Robert's old manager in Los Angeles and his new manager in Sydney, and each incorrectly thought they were supposed to make the change in group membership with Active Directory. So, at almost exactly the same time (we'll ignore time zone differences for the purposes of this demonstration), Robert's new manager makes the previously described change, which is recorded on the Sydney domain controller as follows:

    Object: LAUSERS
    Change: Remove RSMITH
    Version: 1
    Timestamp: 30 June 2004 5:32:08 PM GMT

    Object: SYDUSERS
    Change: Add RSMITH
    Version: 1
    Timestamp: 30 June 2004 5:32:10 PM GMT

There are two things to note about this record. The first is the closeness of the timestamps, which would seem to indicate that the L.A. and Sydney domain controllers haven't replicated yet. The second is the version number field in each record, which does not appear to have been incremented. The reason for this is simple: version numbers are incremented on the local domain controller. If a domain controller doesn't know about any changes to an attribute, there is no need to further increment the version number on that record. Because L.A. and Sydney haven't passed changes between each other yet, the Sydney domain controller doesn't know that a similar change has been processed on the L.A. domain controller and therefore doesn't know to increment the version number field from 1 to 2.

This might seem like a harmless situation now because, even though the changes differ in time, their net effect is the same: on both domain controllers, RSMITH is a member of the correct group and not a member of the former group. But in reality there are two changes. So, which one is really the change accepted by Active Directory? And to ask a more specific question, when both the L.A. and Sydney domain controllers replicate to their partner, the Boston domain controller, which change will Boston accept?

The tie is broken in two ways when changes to the same object compete:

First, the attribute change with the highest version number is the change formally accepted.
If the version number of each attribute change is the same, the change made at the most recent time is accepted.

In our case, the change made on the Sydney domain controller would be the one formally accepted in Active Directory, and the L.A. manager's modification, although its intent was the same, would be rejected because it was made at 5:30 p.m. and not at 5:32 p.m.
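The two tie-breaking rules translate naturally into a comparison on (version, timestamp). Here is a minimal Python sketch using the SYDUSERS records from the example. The `Change` type and `winner` function are hypothetical names, and the sketch covers only the two rules described above.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Change(NamedTuple):
    origin: str          # domain controller where the change was made
    version: int         # version number of the changed attribute
    timestamp: datetime  # when the change was made


def winner(first, second):
    """Pick the accepted change: highest version, then most recent timestamp."""
    return max(first, second, key=lambda change: (change.version, change.timestamp))


la = Change("Los Angeles", 1, datetime(2004, 6, 30, 17, 30, 2, tzinfo=timezone.utc))
sydney = Change("Sydney", 1, datetime(2004, 6, 30, 17, 32, 10, tzinfo=timezone.utc))
print(winner(la, sydney).origin)  # Sydney: same version, later timestamp

# A higher version number wins even if its timestamp is earlier.
la_again = la._replace(version=2)
print(winner(la_again, sydney).origin)  # Los Angeles
```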
