
Optimizing Exchange Server 2010 Servers

2/13/2011 9:06:34 AM
With the separation of various roles in Exchange Server 2010, individual optimizations vary from role to role. The following sections address the various roles in Exchange Server 2010 and how to optimize the performance of those roles.

Optimizing Mailbox Servers

Of all the servers in an Exchange Server 2010 environment, the Mailbox server role is the one that will likely benefit the most from careful performance tuning.

Mailbox servers have traditionally been very dependent on the disk subsystem for their performance. Although this has changed in Exchange Server 2010, it is important to understand that the change in disk behavior depends heavily on memory. As such, the general rule for performance on an Exchange Server 2010 Mailbox server is to configure it with as much memory as you can. For example, in Exchange Server 2003, a load of 2,000 users generating an average of 1 disk I/O per second each, running on a RAID 0+1 configuration, required 4GB of memory and 40 disks (assuming 10k RPM disks at 100 random I/Os per disk) to get the performance you'd expect out of an Exchange server. In Exchange Server 2010, the I/O load per user would be closer to 0.15 disk I/Os per second, and by increasing system memory to 12GB you could reduce the number of disks required by roughly 85%.
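The arithmetic above can be sketched as a quick back-of-the-envelope calculation. Applying the RAID 0+1 penalty to the entire I/O load (rather than to writes only) is a simplification assumed here to match the book's round numbers, not a statement about Exchange internals:

```python
import math

def spindles_needed(users, iops_per_user, iops_per_disk=100, raid_penalty=2):
    # Simplification (our assumption, not from the text): apply the RAID 0+1
    # penalty of 2x to the whole load rather than to writes only.
    return math.ceil(users * iops_per_user * raid_penalty / iops_per_disk)

ex2003 = spindles_needed(2000, 1.0)    # Exchange 2003: ~1 IOPS per user
ex2010 = spindles_needed(2000, 0.15)   # Exchange 2010 with 12GB of RAM
print(ex2003, ex2010)                  # 40 vs. 6 spindles
print(f"{1 - ex2010 / ex2003:.0%}")    # roughly 85% fewer disks
```

With these inputs the sketch reproduces the 40-disk figure for Exchange Server 2003 and the roughly 85% reduction cited for Exchange Server 2010.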

As you can see, with a mailbox server, the trick is to balance costs against performance. In large implementations, it is less expensive to replace high-performance disks with memory. This makes direct attached disks a viable choice for Exchange Server 2010 mailbox servers. In modern servers, configurations of 16GB or more are becoming commonplace and are quite affordable.

Another area where a mailbox server benefits in terms of performance is the disk subsystem. Although you've just seen that the disk requirements are lower than in previous versions of Exchange Server, this doesn't mean that the disk subsystem is unimportant. This is another area where you must strike a careful balance between cost, performance, and recoverability. The databases benefit the most from a high-performance disk configuration. Consider using 15k RPM drives because they offer more I/O performance per disk; generally 50% more random I/O capacity than a 10k RPM disk. Given the reduction in the number of disks needed to support the databases, you should consider using RAID 0+1 rather than RAID 5 so as not to incur the write penalties associated with RAID 5. The log files also need fast disks to be able to commit information quickly, but they have the advantage of being written sequentially rather than randomly. That is, the write heads don't have to jump all around the disk to find the object to which they want to write. The logs start at one point on the disk and write sequentially without having to modify old logs.
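The RAID write penalties mentioned above can be illustrated with a small sketch. The penalty factors (2 back-end I/Os per mirrored write, 4 per RAID 5 write) are the classic textbook values, and the load numbers are purely hypothetical:

```python
def backend_iops(read_iops, write_iops, write_penalty):
    """Physical I/Os the array must service for the given logical load."""
    return read_iops + write_iops * write_penalty

# Hypothetical database load: 150 logical reads and 150 logical writes/sec.
raid10 = backend_iops(150, 150, write_penalty=2)  # mirror: 2 I/Os per write
raid5  = backend_iops(150, 150, write_penalty=4)  # RAID 5: 4 I/Os per write
print(raid10, raid5)   # 450 vs. 750 physical IOPS for the same logical load
```

The same logical load costs the RAID 5 set two thirds more back-end I/O, which is the write penalty the text recommends avoiding for the databases.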

Exchange Server 2010 has also implemented several changes that specifically improve performance when utilizing SATA disks. By altering the I/O pattern within Exchange Server 2010, the disk writes are better spread out and are less “bursty” than before. This makes SATA a very viable choice for Exchange Server 2010.

Another interesting performance benefit available in Exchange Server 2010 is the potential to eliminate RAID entirely on the server. If you are using Database Availability Groups with three or more replicas of mailbox databases, there is enough redundancy in the system as a whole to allow you to eliminate redundancy at the individual server level. If the 30-second “blip” in services is acceptable within your service level agreements, you could potentially remove all redundant disks, power supplies, and NICs from your Exchange Server 2010 mailbox servers. This greatly reduces the number of disks each server needs to support, often making it possible to deploy less expensive servers and thus to deploy more of them. So although you aren't likely to save money on hardware overall, because additional disks must be deployed across the replicas, you can likely create a significantly more resilient environment for approximately the same cost. The easiest way to look at this idea of redundancy at the application level rather than at the server level is to think about Active Directory. In AD, if a domain controller fails, it's generally not a big deal so long as there are other DCs to take over the load. Rather than fixing a DC, you simply build a new one and let it replicate the directory. The same philosophy applies to Exchange Server 2010: if a DAG member fails, rather than restoring it, you simply build a new one and let it replicate the data. Another DAG member will have already taken over for the failed node, and services will not be significantly interrupted.

In a perfect world, the databases and logs are all on their own dedicated disks. Although this isn’t always possible, it does offer the best performance. In the real world, you might have to occasionally double up databases or log files onto the same disk. Be aware of how this affects recoverability. For performance, always separate the logs from the databases as their read/write patterns are very different. It also makes recovery of a lost database much easier.

Mailbox servers also deal with a large amount of network traffic. Email messages are often fairly small and as a result, the transmission of these messages isn’t always as efficient as it could be. Whenever possible, equip your mailbox servers with Gigabit Ethernet interfaces. If possible, and if you aren’t clustering the mailbox servers, try to run your network interfaces in a teamed mode. This improves both performance and reliability.

As mailbox servers also hold the public folder stores, consider running a dedicated public folder server if your environment heavily leverages public folders. Public folder servers often store very large files that users are accessing, so separating the load of those large files from the mailbox servers results in better overall performance for the user community.

For companies that only lightly use public folders, it requires some investigation of the environment to see if it is better to run a centralized public folder server or if it is better to maintain replicas of public folders in multiple locations. This is usually a question of wide area network (WAN) bandwidth versus usage patterns.

Optimizing Database Availability Groups

Mailbox servers in Exchange Server 2010 offer a new function known as Database Availability Groups. In a DAG configuration, mailbox data is replicated across multiple hosts. As such, it becomes less important to build in system-level redundancy. The best analogy is to think of DAG nodes like domain controllers. As long as at least one is up and running, your users are still accessing their mail normally. If one were to fail, rather than rebuild it, you simply build another node, add it as a replica, and let the data replicate to it. This allows an administrator to drastically change the way servers are deployed. Rather than struggling with the price-versus-performance tradeoffs of RAID 5 versus RAID 0+1, administrators can consider running basic disks with no redundancy whatsoever. This makes smaller servers viable because far fewer disks are needed in the chassis, and it makes it even easier for administrators to move away from complex and expensive SANs back toward direct attached storage. In this case, optimizing doesn't always mean making things faster and more scalable; sometimes optimizing is about doing more with less.

When configuring a DAG, consider utilizing a separate network for replication. DAGs offer the ability to configure the nodes to use a specific network for their replication traffic. This offers two potential benefits. First, in a LAN scenario, it means that clients aren't competing with replication traffic for access to their mailboxes over the LAN. Second, in a WAN configuration, it means that an environment with access to multiple networks can potentially move replication traffic to a lower-cost network. Consider a typical scenario where a large enterprise has its offices connected via an MPLS network. MPLS provides excellent bandwidth and performance but is generally somewhat expensive. Many of these large enterprises also have IPSec tunnels set up across Internet connections to provide a secondary network in case of a failure of the MPLS links. These cheaper IPSec tunnels can be used to offload the DAG replication. This reduces the load on the “production” network and, at the same time, saves money by utilizing a lower tier of bandwidth.

The other way to optimize DAG members is to balance the load across multiple DAG replicas. This concept is new compared to cluster continuous replication (CCR) in Exchange Server 2007. In Exchange Server 2010, replication is done at the database level rather than at the server level. This means that rather than running in Active/Passive pairs, a server can effectively be active for one or more databases and passive for others. So a site might have 3 DAG members in a single location for redundancy and could run one third of the databases as “master” on node 1, one third on node 2, and the remaining one third on node 3, with replicas going to the other 2 nodes. Node 1 might be “master” for databases 1-5, the 2nd priority for databases 6-10, and 3rd priority for databases 11-15. This means that if a node failed, the load would double on the remaining 2 servers rather than tripling on a single node. This allows administrators to get the best performance out of their hardware by carefully planning out loads for both a “normal” and a “DR” situation.
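The layout described above can be modeled with a short sketch. Only node 1's activation preferences are given in the text; the rotation assumed for nodes 2 and 3 is our own, chosen to keep the layout symmetric:

```python
# Model of the three-node layout: 15 databases in three groups of 5, each
# group with an ordered activation preference list. Node 1's preferences
# match the text; the rest are an assumed symmetric completion.
preferences = {
    "db01-05": ["node1", "node3", "node2"],
    "db06-10": ["node2", "node1", "node3"],   # node1 is 2nd priority here
    "db11-15": ["node3", "node2", "node1"],   # node1 is 3rd priority here
}

def active_counts(failed=()):
    """Count active databases per surviving node, honoring preference order."""
    counts = {}
    for prefs in preferences.values():
        owner = next(n for n in prefs if n not in failed)
        counts[owner] = counts.get(owner, 0) + 5   # 5 databases per group
    return counts

print(active_counts())            # balanced: 5 databases active per node
print(active_counts(("node1",)))  # node1's databases fail over; no node triples
```

Running the failure case shows the failed node's databases activating on a surviving replica, so no single server ends up hosting all fifteen databases the way an Active/Passive pair would.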

Optimizing Client Access Servers

Client access servers (CASs) tend to be more dependent on CPU and memory than they are on disk. Because their job is to simply proxy requests back to the mailbox servers, they don’t need much in the way of local storage. The best way to optimize the client access server is to give it enough memory that it doesn’t need to page very often. By monitoring the page rate in the Performance Monitor, you can ensure that the CAS is running optimally. If it starts to page excessively, you can simply add more memory to it. Similarly, if the CPU utilization is sustained above 65% or so, it might be time to think about more processing power.
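A minimal sketch of the monitoring rule above might look like the following. The roughly 65% CPU figure comes from the text; the paging threshold is an arbitrary illustrative value, and the inputs here are supplied by hand rather than read from Performance Monitor:

```python
# Hypothetical health check reflecting the guidance above. The ~65% CPU
# threshold is from the text; the paging threshold is an invented example.
def cas_health(avg_pages_per_sec, avg_cpu_percent,
               page_threshold=1000, cpu_threshold=65):
    advice = []
    if avg_pages_per_sec > page_threshold:
        advice.append("add memory")
    if avg_cpu_percent > cpu_threshold:
        advice.append("add CPU or another load-balanced CAS")
    return advice or ["healthy"]

print(cas_health(50, 40))     # within both thresholds
print(cas_health(2500, 80))   # both thresholds exceeded
```

In practice the averages would come from sustained Performance Monitor samples, not single readings, so that momentary spikes don't trigger hardware changes.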

Unlike mailbox servers, client access servers are usually “commodity” class servers. This means they aren’t likely to have the capacity for memory or CPU that a mailbox server might have. It is typical to increase the performance of the client access servers by simply adding more servers into a load-balanced group.

This is a good example of optimizing a role as opposed to a server that holds a role. As you start to add more services to your client access servers, such as Outlook Anywhere or ActiveSync, you will see an increase in CPU usage. Be sure to monitor this load because it allows you to predict when to add capacity to account for the increased load. This prevents your users from experiencing periods of poor performance.

Exchange Server 2010 utilizes the client access servers much more than Exchange Server 2007 did. This is because the new Exchange Server 2010 architecture utilizes what is called MAPI on the Middle Tier. This means that Outlook clients no longer talk directly to the Mailbox server role; they instead talk to the Client Access server role. This change was made to support Database Availability Groups. Rather than requiring Outlook to determine which copy of a database is currently the master copy, the client access servers perform that role. As such, Outlook merely needs to find an available client access server to make its connections. As a result of this architectural change, the ratio of client access servers to mailbox servers has gone up noticeably. It is now recommended to run a ratio of 3 CAS processor cores for every 4 mailbox processor cores. Maintaining this ratio will ensure sufficient performance for Outlook users.

It is also recommended to always run at least 2 client access servers per site to ensure uninterrupted services for users.

Generally speaking, a client access server should run at least 2GB of memory, and 2GB per processor core is recommended. Client access servers don't scale well beyond 8 cores.

Another way in which the Client Access server role can be optimized is in how users attach to it. One of the most common requests in Exchange Server from an OWA perspective is to configure things such that users don't have to remember the full URL for connecting to OWA. For example, your users might need to type https://webmail.companyabc.com/owa to get to their OWA page, but many users will type https://webmail.companyabc.com instead. In the past, it was recommended to utilize a customized ASP page to perform the redirection. In Windows Server 2008, the redirection functionality is built in. When configuring IIS for the Exchange Server 2010 CAS, be sure to include the HTTP Redirection feature. With this available, you can reconfigure the IIS site as follows:

1. Launch the Internet Information Services (IIS) Manager.

2. Expand the left pane to Default Web Site.

3. Click the Features View and Group by No Grouping.

4. Double-click HTTP Redirect, as shown in Figure 1.

Figure 1. Choosing HTTP Redirect.

5. Check the box for Redirect Requests to This Destination, as shown in Figure 2, and set the destination to the /owa sub site.

Figure 2. HTTP Redirect.

6. Check both boxes in the Redirect Behavior section.

7. Click Apply.

8. Uncheck the redirection settings from all sub sites in IIS.

Optimizing Hub Transport Servers

The goal of the Hub Transport server is to transfer data between different Exchange Server sites. Each site must have a Hub Transport server to communicate with the other sites. Because the Hub Transport server doesn’t store any data locally, its performance is based on how quickly it can determine where to send a message and send it off. The best way to optimize the Hub Transport role is via memory, CPU, and network throughput. The Hub Transport server needs ready access to a global catalog server to determine where to route messages based on the recipients of the messages. Placing a global catalog (GC) in the same site as a busy Hub Transport server is a good idea. Ensure that the Hub Transport server has sufficient memory to quickly move messages into and out of queues. Monitoring available memory and page rate gives you an idea if you have enough memory. High-speed network connectivity is also very useful for this role. If you are running a dedicated Hub Transport server in a site and you find that it’s overworked even though it has a fast processor and plenty of memory, consider simply adding a second Hub Transport server to the site because they automatically share the load.

Disk performance is potentially a concern in environments that send very high volumes of messages. The Hub Transport maintains the SMTP queues on disk, and the faster they can be processed, the faster mail can flow. In older versions of Exchange Server, it was recommended to run redundant disks for the SMTP queues. This was because if a Hub Transport lost the disks on which the SMTP queues lived, the messages would be lost. In Exchange Server 2010, the shadow redundancy feature protects against this type of message loss. Basically, when the mailbox server hands a message to the Hub Transport, the mailbox server doesn't consider the message as “sent” until the Hub Transport reports back that it successfully handed the message off to someone else. This means that if the message were sitting in the queue on the Hub Transport and that Hub Transport failed, the mailbox server would know the message never got past the Hub Transport, and it would attempt to resend the message via another Hub Transport in the site. Once the Hub Transport hands the message to the next hop, either another mailbox server or another Hub Transport, it reports back and the message is considered “sent” from the perspective of the mailbox server. This means that administrators can design their Hub Transport servers without the requirement for redundant disks for the SMTP queues. By taking the approach of building Hub Transports on non-redundant commodity hardware, one can likely build two Hub Transports for about the same cost as one highly redundant Hub Transport. This makes it easy to build sites with multiple Hub Transport servers to provide redundancy, so administrative and maintenance tasks can be done at any time without interrupting services for an Exchange Server 2010 site.
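The shadow redundancy handoff described above can be illustrated with a toy simulation. This is purely conceptual; it uses no Exchange APIs and simply models the rule that a message is not "sent" until the next-hop acknowledgement arrives:

```python
# Toy simulation of the shadow redundancy rule: the mailbox server only
# considers a message "sent" once the Hub Transport confirms the next hop;
# otherwise it resubmits via another Hub Transport.
def relay(messages, hub_fails_midway=False):
    """Split messages into (sent, resubmitted) under the handoff rule."""
    sent, resubmitted = [], []
    for i, msg in enumerate(messages):
        acked = not (hub_fails_midway and i >= len(messages) // 2)
        if acked:
            sent.append(msg)          # next-hop ack received: safe to discard
        else:
            resubmitted.append(msg)   # no ack: resend via another Hub Transport
    return sent, resubmitted

print(relay(["m1", "m2", "m3", "m4"]))
print(relay(["m1", "m2", "m3", "m4"], hub_fails_midway=True))
```

In the failure case, the messages the Hub never acknowledged end up in the resubmit list rather than being lost, which is why the queue disks no longer need to be redundant.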

As was the case with Exchange Server 2007, in Exchange Server 2010, all messages, regardless of destination, must first pass through a Hub Transport server. This means that the Hub Transport server is the logical choice for where to place anti-virus scanning. When planning out a Hub Transport server, make sure you account for your anti-virus product. This may mean adding additional memory or processing power to the Hub Transport server. Also, keep in mind, if you are anticipating utilizing a large number of Hub Transport rules, that may result in a need for faster processors or more memory.

Generally speaking, a Hub Transport should run at least 2GB of memory and 1GB per processor core is recommended.

Optimizing Edge Transport Servers

The Edge Transport server is very similar to the Hub Transport server, with the key difference being that it is the connection point to external systems. As such, it has a higher need for processing power because it needs to convert the format of messages from Simple Mail Transfer Protocol (SMTP) to Messaging Application Programming Interface (MAPI) for internal routing. Edge Transport servers are often serving “double duty” as antivirus and antispam gateways, thus increasing the need for CPU and memory. The Edge Transport role is one where it is very common to optimize the service by deploying multiple Edge Transport servers. This not only increases a site’s capacity for sending mail into and out of the local environment, but it also adds a layer of redundancy.

To fully optimize this role, consider running Edge Transport servers in two geographically disparate locations. Utilize multiple MX records to balance out the load of mail coming into the company. Use your route costs to control the outward flow of mail such that you can reduce the number of hops needed for mail to leave the environment.

Keep a close eye on CPU utilization as well as memory paging to know when you need to add capacity to this role. Utilizing content-based rules or running message filtering increases the CPU and memory requirements of this role.

Generally speaking, an Edge Transport should run at least 2GB of memory and 1GB per processor core is recommended.

Optimizing Unified Messaging Servers

The Unified Messaging server is supported in Exchange Server 2010 to act as a connector to voice mail systems, allowing voice mail to be stored in and accessed through users' mailboxes. In the past, this type of functionality was always performed by a third-party application. In Exchange Server 2010, the ability to integrate with phone systems and voice mail systems is built in. As you might expect, to optimize this role, you must optimize the ability to quickly transfer information from one source to another. This means that the Unified Messaging role needs to focus on sufficient memory, CPU, and network bandwidth. To fully optimize Unified Messaging services, strongly consider running multiple network interfaces in the Unified Messaging server. This allows one network to talk to the phone systems and the other to talk to the other Exchange servers. Careful monitoring of memory paging, CPU utilization, and NIC utilization allows you to quickly spot any bottlenecks in your particular environment.

Generally speaking, a Unified Messaging server should run at least 4GB of memory and 1GB per processor core is recommended.
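The per-role memory guidance quoted throughout this chapter can be consolidated into a small helper. Taking the larger of the flat minimum and the per-core recommendation is our interpretation of the wording, not an official formula:

```python
# Consolidated per-role memory guidance from this chapter. Interpretation
# (an assumption): recommended memory is the larger of the flat minimum
# and the per-core figure.
GUIDANCE = {                     # role: (minimum GB, GB per core)
    "client_access":     (2, 2),
    "hub_transport":     (2, 1),
    "edge_transport":    (2, 1),
    "unified_messaging": (4, 1),
}

def recommended_memory_gb(role, cores):
    minimum, per_core = GUIDANCE[role]
    return max(minimum, per_core * cores)

print(recommended_memory_gb("client_access", 4))       # 8 GB
print(recommended_memory_gb("unified_messaging", 2))   # 4 GB (floor applies)
```

Note that the Mailbox role is deliberately absent: as discussed earlier, its sizing is driven by the database cache and I/O tradeoff rather than a simple per-core rule.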

Deployment Ratios

To summarize the deployment ratios of various roles within Exchange Server 2010:

  • 7 mailbox server processor cores per 1 Hub Transport server processor core

  • 4 mailbox processor cores per 3 client access server processor cores

  • 4 mailbox processor cores per 1 global catalog processor core (32-bit GC)

  • 8 mailbox processor cores per 1 global catalog processor core (64-bit GC)
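A quick sizing helper applying these ratios might look like the following sketch. Rounding up to whole cores is our assumption; treat the results as planning starting points rather than exact requirements:

```python
import math

# Sizing sketch applying the chapter's deployment ratios above.
def role_cores(mailbox_cores, gc_64bit=True):
    return {
        "hub_transport":  math.ceil(mailbox_cores / 7),      # 7:1
        "client_access":  math.ceil(mailbox_cores * 3 / 4),  # 4:3
        "global_catalog": math.ceil(mailbox_cores / (8 if gc_64bit else 4)),
    }

print(role_cores(16))                   # for a 16-core mailbox tier
print(role_cores(16, gc_64bit=False))   # 32-bit GCs need twice the cores
```

For a hypothetical 16-core mailbox tier this yields 3 Hub Transport cores, 12 client access cores, and 2 global catalog cores with 64-bit GCs (4 with 32-bit GCs).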

General Optimizations

Certain bits of advice can be applied to optimizing any server in an Exchange Server 2010 environment. For example, the elimination of unneeded services is one of the easiest ways to free up CPU, memory, and disk resources. Event logging should be limited to the events you care about and you should be very careful about running third-party agents on your Exchange Server 2010 servers.

Event logs should be reviewed regularly to look for signs of any problems. Disks that are generating errors should be replaced and problems that appear in the operating system should be addressed immediately.

Always investigate any anomalies to determine if things have been changed or if you are suffering a potential problem. By staying on top of your systems and knowing how they should run, you can more easily keep them running in an optimal manner.

The Security Configuration Wizard should be run to ensure that the correct network ports for Exchange Server 2010 roles are available. Although many administrators are tempted to simply disable the Windows Server 2008 firewall, it is important to realize that the Windows Filtering Platform is still running and can potentially interrupt traffic.

Optimizing Active Directory from an Exchange Server Perspective

As you likely already know, Exchange Server 2010 is very dependent on Active Directory for routing messages between servers and for allowing end users to find each other and to send each other mail. The architecture of Active Directory can have a large impact on how Exchange Server performs its various functions.

When designing your Exchange Server 2010 environment, consider placing dedicated global catalog servers into an Active Directory site that contains only the GCs and the local Exchange servers. Configure your site connectors in AD with a high enough cost that the GCs in this site won’t adopt another nearby site that doesn’t have GCs. This ensures that the GCs are only used by the Exchange servers. This can greatly improve the lookup performance of the Exchange server and greatly benefits your OWA users as well.

In the case of a very large Active Directory environment, for example 20,000 or more objects, consider upgrading the domain controllers to run Windows Server 2003 64-bit or Windows Server 2008 64-bit. This is because a directory this large can grow to be larger than 3GB. When the Extensible Storage Engine database that holds Active Directory grows to this size, it is no longer able to cache the entire directory. This increases lookup and response times for finding objects in Active Directory. By running a 64-bit operating system on the domain controller, you can utilize the larger memory space to cache the entire directory. The nice thing in this situation is that you retain compatibility with 32-bit domain controllers, so it is not necessary to upgrade the entire environment, only sites that will benefit from it.
