System Center Configuration Manager 2007 : Developing the Solution Architecture (part 3) - Developing the Server Architecture

9/6/2012 1:58:44 AM

Developing the Server Architecture

CPU, RAM, and disk I/O are the three most important items when planning and configuring server hardware. The size, or robustness, of the server provisioned for any given role dictates how well it will handle the load. When discussing expectations of the overall solution, some level of understanding needs to be communicated and agreed on. ConfigMgr has many dependencies, including business and user requirements in addition to the overall infrastructure and network services requirements. This makes it difficult to predict expectations for the overall solution. Because each environment is different and has different requirements, there is no “one size fits all” solution.

Database Servers

The site database server is the most memory-intensive role in the ConfigMgr hierarchy. The amount of memory used is configurable in SQL Server and limited to 3GB, unless you are running SQL Server on an x64 platform and operating system. If SQL Server will require more than 3GB, as in instances when it is not dedicated to ConfigMgr, using a separate SQL Server running on x64 becomes a compelling solution.

Table 4 lists several counters that you will want to evaluate on your ConfigMgr database server.

Table 4. Site Database Server Counters to Be Monitored
Object | Counter | Description
Physical Disk | Avg. Disk Queue Length: Volume | Select one of these counters for each volume involved in Configuration Manager processes. This includes the operating system installation volume, ConfigMgr installation (inbox) volume, as well as the SQL Server tempdb, site database, and log volumes.
Physical Disk | % Disk Time | Select one of these counters for each volume that is involved in Configuration Manager data processing. These include the operating system installation volume, ConfigMgr installation (inbox) volume, SQL Server tempdb, site database, and log volumes.
SQLServer:General Statistics | Temp Tables Creation Rate | General SQL Server statistics.
SQLServer:General Statistics | Logouts/sec | General SQL Server statistics.
SQLServer:General Statistics | Logins/sec | General SQL Server statistics.
SQLServer:SQL Statistics | SQL Re-Compilations/sec | General SQL Server statistics.
SQLServer:SQL Statistics | SQL Compilations/sec | General SQL Server statistics.
SQLServer:SQL Statistics | Batch Requests/sec | General SQL Server statistics.
SQLServer:Memory Manager | Lock Memory | General SQL Server statistics.
SQLServer:Locks | Lock Requests/sec | General SQL Server statistics.
SQLServer:Locks | Number of Deadlocks/sec | General SQL Server statistics.
SQLServer:Databases | Transactions/sec: SCCM db | General SQL Server statistics.
SQLServer:Databases | Transactions/sec: WSUS db | General SQL Server statistics.

You will also want to understand some basic SQL Server best practices. Some of these options will vary depending on your site size, hierarchy, which roles you are using, and how you are using them.

Microsoft has also produced a SQL Server 2005 Best Practices Analyzer (BPA) that gathers data from SQL Server configuration settings. The SQL BPA compares that data against predefined recommendations and produces a report showing whether there are issues with the SQL Server implementation.

General Performance

There is no ideal performance or target goal for a given ConfigMgr solution. Cost/benefit analysis should be performed to weigh the performance cost versus the actual requirements.

Across any ConfigMgr role, it is important to understand the overall load the role places on a system, and how the system will handle that load. Table 5 illustrates a general array of performance counters system administrators should be aware of and use to gauge the overall performance, or health, of their systems. These counters are not specific to servers or roles, and can be applied to any Microsoft Windows operating system.

Table 5. General System Performance Counters
Object | Counter | Instance | Guidance
System | % Total Processor Time | N/A | Less than 80% is acceptable. Consistently exceeding that level means more CPU is needed or the load needs to be reduced.
System | Processor Queue Length | N/A | Two or fewer means the CPU utilization is acceptable.
Thread | Context Switches/sec | _total | Lower is better. Measure the thread counter to enable the processor queue length counter.
Physical Disk | % Disk Time | Each disk | Less than 80% is acceptable.
Physical Disk | Current Disk Queue Length | Each disk | Tells you how many I/O operations are waiting for the hard disk to become available. Opinions vary widely on this one; the common rule is to multiply the number of spindles in the array by two and make sure the value stays below this.
Memory | Committed Bytes | N/A | Should be less than the installed RAM.
Memory | Page Reads/sec | N/A | If consistently exceeding 5, add RAM.
SQL Server | Cache Hit Ratio | N/A | 98% or more is acceptable; lower means SQL is being delayed by paging.
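The guidance in Table 5 can be turned into a simple automated check. The following is a minimal Python sketch, not a definitive implementation: the `CounterAverages` class, its field names, and the `check_counters` function are hypothetical names chosen for illustration, and the values are assumed to be averages collected over a reasonable period, because several thresholds apply only when they are exceeded consistently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CounterAverages:
    """Averaged counter values for one server (all names are illustrative)."""
    cpu_percent: float          # System: % Total Processor Time
    processor_queue: float      # System: Processor Queue Length
    disk_time_percent: float    # Physical Disk: % Disk Time (one disk)
    disk_queue: float           # Physical Disk: Current Disk Queue Length
    spindles: int               # disks in the array behind that volume
    committed_bytes: int        # Memory: Committed Bytes
    installed_ram_bytes: int    # physical RAM installed in the server
    page_reads_per_sec: float   # Memory: Page Reads/sec
    cache_hit_ratio: float      # SQL Server: Cache Hit Ratio, in percent


def check_counters(c: CounterAverages) -> List[str]:
    """Return one warning per Table 5 guideline that the averages violate."""
    warnings = []
    if c.cpu_percent >= 80:
        warnings.append("CPU at or above 80%: add CPU or reduce load")
    if c.processor_queue > 2:
        warnings.append("Processor queue length above 2")
    if c.disk_time_percent >= 80:
        warnings.append("Disk time at or above 80%")
    # Common rule from Table 5: stay below two outstanding I/Os per spindle.
    if c.disk_queue >= 2 * c.spindles:
        warnings.append("Disk queue at or above 2 x spindles")
    if c.committed_bytes >= c.installed_ram_bytes:
        warnings.append("Committed bytes not below installed RAM")
    if c.page_reads_per_sec > 5:
        warnings.append("Page reads/sec above 5: consider adding RAM")
    if c.cache_hit_ratio < 98:
        warnings.append("SQL cache hit ratio below 98%")
    return warnings
```

An empty list means every averaged value falls within the ranges Table 5 describes as acceptable; it does not by itself prove the server is sized correctly.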

Disk Performance

Disks today are the weakest point in a computer’s performance, and you will want to give serious attention to designing the right disk subsystem for the various ConfigMgr roles. Due to the increasing demand to lower server prices, vendors now make server systems available using hardware designed for desktop-level systems. This may lead to performance bottlenecks and disk failures with ConfigMgr site systems. Although performance using SATA (Serial Advanced Technology Attachment) devices may be adequate for server specs, technologies such as SCSI (Small Computer System Interface) have a much higher Mean Time Between Failures (MTBF), which is calculated during Phase 2 of a hard drive’s life.

It is important to understand the implications of drive failure in servers. Although a drive may fail and the system may continue to run, if another drive fails, the entire volume goes down, ultimately creating an outage. If you are dealing with an enterprise environment, outages are never welcome. Here are the three phases of a drive’s life:

  • Phase 1 of a drive’s life is the burn-in phase, and failure is very high.

  • In phase 2, the drive is run for a length of time and the failure rate is minimal. This equates to the normal operational lifetime of the drive and is how the MTBF value is calculated.

  • Phase 3 is where failure rates increase and the drive is reaching the end of its life, or warranty (ironically).

Table 6 lists characteristics of several types of drives.

Table 6. Disk Drive Characteristics
Drive Type | Rotational Speed | Average Seek/Access Time | MTBF
EIDE | 5400–7500 rpm | Seek time: 8 to 10 ms | 300,000–500,000 hours at 20% duty cycle.
SCSI | 7500–15000 rpm | Access time: 5 to 10 ms | 600,000–1,200,000 hours at 100% duty cycle.
SATA | 5400–10000 rpm | Access time: 3 to 7 ms | 500,000–1,500,000 hours, defined for less than 100% duty cycle. Mostly less expensive drives than SCSI.

Real World: ConfigMgr and Disk I/O

Disk I/O is the biggest performance bottleneck on ConfigMgr implementations and can have a large impact on overall site health. When a site cannot keep up with client demands, a snowball effect occurs—unless the load decreases, the server cannot catch up and performance continuously deteriorates.

Modern-day best practices for disk architecture include the following:

  • Use SCSI or SAS devices when possible.

  • Use hardware RAID (Redundant Array of Independent Disks) instead of software RAID. Software RAID uses the CPU of the server, taking away from its ability to process computations.

  • Use battery-backed cache controller cards. The battery protects cached writes during a power loss, so the controller can cache writes and run the disks at a higher performance level without risking corruption.

  • More spindles of smaller size are better than fewer spindles of larger size.

  • Utilize eight or more drives in RAID 1+0 when serious I/O or performance concerns are present.

  • Make sure you have adequate network bandwidth to support data transfers. As an example, it takes 2.5Gbps to equate to the transfer rate of a SCSI Ultra 320 drive.
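The network bandwidth figure in the last item follows from a unit conversion. As a small sketch in Python, assuming the Ultra320 SCSI bus maximum of 320 megabytes per second and decimal units (the function name is illustrative only):

```python
def megabytes_per_sec_to_gbps(mb_per_sec: float) -> float:
    """Convert a transfer rate in MB/s to gigabits per second (decimal units)."""
    bits_per_byte = 8
    return mb_per_sec * bits_per_byte / 1000


ULTRA320_MB_PER_SEC = 320  # maximum bus rate of Ultra320 SCSI

# 320 MB/s x 8 bits = 2,560 Mbps, or roughly 2.5Gbps
print(megabytes_per_sec_to_gbps(ULTRA320_MB_PER_SEC))
```

This is a peak bus rate; sustained throughput of an individual drive is lower, so treat the result as an upper bound when sizing network links.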

Smaller sites may be able to run sufficiently on a small array, such as a RAID1 array, which uses two disks. However, larger implementations will falter on such a small backend disk subsystem. As scale increases in the enterprise or demand increases on the disk subsystem used by ConfigMgr, a larger array becomes necessary to support the load. Unfortunately, there is no formula where x number of ConfigMgr clients equals y number of disks—there are just too many possible implementation paths in ConfigMgr 2007 to allow a standard formula to dictate disk I/O load.

When dealing with larger enterprises or more aggressive policy evaluation intervals, such as daily or hourly inventories, know that adding spindles increases the performance of the disk subsystem. Arrays composed of many disks will yield markedly better performance than arrays even a few disks smaller. An easy way to understand this is to think of each disk as a worker going to find information. When there are additional workers, the information is returned more quickly.

If ConfigMgr console performance is important to your ConfigMgr administrators, you will want to explore SANs (Storage Area Networks) and other storage solutions for the ConfigMgr database and binaries. SANs and iSCSI (Internet SCSI) are not covered in detail here, but you will want to explore them in large-scale enterprises with 20,000 or more clients reporting to a site server. This does not imply that with fewer than 20,000 clients you should not look at using a SAN for your SQL Server databases or distribution points. Storage solutions offer a variety of other benefits, including disaster recovery, backup, and other options that are frequently vendor specific.

Disk optimization steps include the following:

  • The ConfigMgr SQL database should be on its own array.

  • The ConfigMgr SQL transaction log should be on its own array.

  • The Windows operating system should be on its own array.

  • Any distribution point should be on its own array.

  • Any software update point should be on its own array.

Operating systems perform best when loaded on RAID1 arrays. Consult your company’s standards on whether to use RAID1+0 or external storage solutions such as SANs. The principle here is that two disks give good performance, redundancy, and the lowest possible failure rate. (That is correct. With only two disks in a RAID1 array, you are four times less likely to have a drive failure than in a RAID5 array with eight disks!)
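The "four times" comparison can be checked with a simple probability model. The sketch below is in Python and rests on assumptions that are not in the text: drives fail independently, and each drive has the same failure probability over the period of interest (the 2% value is purely hypothetical). It estimates the chance that at least one drive in the array fails, which is not the same as the chance of losing the whole array.

```python
def prob_any_drive_fails(disks: int, p_single: float) -> float:
    """Probability that at least one of `disks` independent drives fails."""
    return 1.0 - (1.0 - p_single) ** disks


P_SINGLE = 0.02  # hypothetical per-drive failure probability for the period

raid1 = prob_any_drive_fails(2, P_SINGLE)  # two-disk RAID1 array
raid5 = prob_any_drive_fails(8, P_SINGLE)  # eight-disk RAID5 array

print(f"RAID1, 2 disks: {raid1:.4f}")
print(f"RAID5, 8 disks: {raid5:.4f}")
print(f"ratio: {raid5 / raid1:.2f}")
```

For small per-drive probabilities the ratio approaches the ratio of disk counts (8 / 2 = 4), which is where the rule of thumb comes from; with the 2% assumption it comes out slightly under 4.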

Databases typically need to be placed on RAID5 or RAID10 arrays, due to the sheer number of disks required to support the database size. Fortunately, ConfigMgr has a relatively small database size, although its size is dependent on a multitude of variables such as inventories, packages, number of clients, features in use, and such. SMS 2003’s SQL sizing was based on 50MB + (N × 250KB), where N is the number of clients. This means that if there were 5,000 clients, the formula would read as follows:

50MB + (5,000 × 250KB) ≈ 1,271MB, or about 1.24GB

This sizing formula was found to be unrealistically low, and most administrators doubled or tripled the value. With ConfigMgr 2007, you can use the same rule of thumb for database sizing, but should increase the 250KB multiplier to support the new features, including patch management, configuration management, and expanded inventory. Experience has shown that 2MB per client is a more realistic value to use than 250KB as a starting point for sizing the ConfigMgr database. This means you should use the following formula to determine the required database size:

50MB + (N × 2,048KB), where N is the number of clients

Using this new formula for the same size (5,000 clients) gives a considerably higher number:

50MB + (5,000 × 2,048KB) = 9.8GB
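Both sizing rules of thumb can be expressed as one small function. This is a minimal Python sketch; the function name and parameters are illustrative, and it assumes 1,024-based units throughout (1MB = 1,024KB).

```python
def estimate_db_size_mb(clients: int, kb_per_client: int = 2048,
                        base_mb: int = 50) -> float:
    """Estimate site database size in MB as base + (clients x per-client KB)."""
    return base_mb + clients * kb_per_client / 1024


# SMS 2003 rule of thumb: 250KB per client
sms_2003_mb = estimate_db_size_mb(5000, kb_per_client=250)

# ConfigMgr 2007 starting point: 2MB (2,048KB) per client
configmgr_mb = estimate_db_size_mb(5000)

print(f"SMS 2003 estimate:       {sms_2003_mb:,.0f} MB")
print(f"ConfigMgr 2007 estimate: {configmgr_mb:,.0f} MB "
      f"({configmgr_mb / 1024:.1f} GB)")
```

For 5,000 clients this gives roughly 1,271MB under the older rule and 10,050MB (about 9.8GB) under the newer one. Treat either figure as a starting point for capacity planning, not a guarantee, since actual size depends on the features in use.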

SQL Server transaction logs can usually be placed on a RAID1 array, because ConfigMgr rarely requires point-in-time restores. This means you can select the simple database recovery model, so the transaction log will not need to be extraordinarily large.

Distribution points (DPs) and state migration points are the most demanding roles in terms of disk I/O. Memory and CPU on these roles are a minor concern, as long as there is sufficient RAM on the system to prevent unnecessary swapping.

Distribution points can have the most widely varying requirements depending on how they are used. As an example, if a company performs routine software patching and package pushes, the size of its distribution point may be minimal, particularly if BITS is used extensively to download and execute content. Anything from a single disk to a RAID1 array could be effective in a branch DP or a conventional DP.

If you introduce the OSD functionality into your ConfigMgr solution, the requirements jump substantially. Conventional packages are relatively small, between 1MB and 200MB, depending on the average package. Microsoft Office usually is one of the largest at 1GB for the 2007 version. Operating system images, regardless of the applications being in the images or called from outside them, average around 1GB for Windows XP and 3GB to 4GB for Vista images in the ImageX WIM (Windows Imaging) format. In addition, because download-and-execute is not an option for operating system deployments, you can have a DP with a very large data demand for many machines in parallel. The best solution for this scenario is many disk spindles. You should seriously consider SANs if your ConfigMgr implementation requires supporting large operating system deployments. State migration points may have similar disk I/O requirements. Disk I/O for this role is difficult to calculate, with each user’s state volume size being an unknown.

Tip: Calculating User State Volume

You can use ConfigMgr to calculate user state volume size, which helps define capacity requirements and expected timeframes for OS migrations. Deploy a script as a package that measures the user data you intend to capture and stores the size in Windows Management Instrumentation (WMI) on the client. The next inventory uploads the data to the site server, where it can be used to populate reports. Microsoft partners such as SCCM Experts (formerly known as SMS Experts) specialize in solutions such as this.
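The measuring step of such a script could look like the following Python sketch. It only totals the size of files under a folder; writing the result to WMI and collecting it through inventory are not shown, and the function name and the example path are hypothetical.

```python
import os


def directory_size_bytes(root: str) -> int:
    """Total size in bytes of all regular files under `root`."""
    total = 0
    # os.walk skips directories it cannot read unless told otherwise.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):  # do not follow or count links
                    total += os.path.getsize(path)
            except OSError:
                continue  # file vanished or is locked; skip it
    return total


if __name__ == "__main__":
    # Hypothetical profile folder; replace with the data set to be captured.
    profile = os.path.expanduser("~")
    print(f"{directory_size_bytes(profile) / 2**20:.1f} MB")
```

Summing these values across clients gives a rough estimate of the storage a state migration point would need, before any compression applied by the capture tool.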

Monitoring Performance

If available, utilize tools such as System Center Operations Manager (OpsMgr) 2007 to baseline performance and monitor ConfigMgr site health. When external monitoring solutions are not available, use a tool such as Performance Monitor (Perfmon), which is built into each version of Windows. Perfmon enables administrators to collect a myriad of performance data and log it to a file for later analysis. Realize that this method of using Performance Monitor can place a load on the system when the samples are captured at an aggressive interval! Because you only need to look at average performance over a broad period, sampling every 10 or 15 minutes is acceptable and provides a multitude of useful data to analyze when tuning the system.
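Once Perfmon has logged samples to a file, the averages can be computed with a short script. The Python sketch below assumes the log was saved in comma-separated format, with a timestamp in the first column, one counter per remaining column, and the counter paths in the header row; the function name is illustrative.

```python
import csv
from typing import Dict, Iterable


def average_counters(lines: Iterable[str]) -> Dict[str, float]:
    """Average every counter column of a CSV performance log."""
    reader = csv.reader(lines)
    header = next(reader)
    names = header[1:]              # column 0 holds the sample timestamp
    sums = [0.0] * len(names)
    counts = [0] * len(names)
    for row in reader:
        for i, cell in enumerate(row[1:len(header)]):
            try:
                value = float(cell)
            except ValueError:
                continue            # blank cell: no sample for this interval
            sums[i] += value
            counts[i] += 1
    return {name: sums[i] / counts[i]
            for i, name in enumerate(names) if counts[i]}
```

With a log file on disk, the function could be called as `average_counters(open(path, newline=""))`, where `path` points to the exported log. Counters with no usable samples are left out of the result.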

Tip: Benchmarking

Consider periodically collecting performance metrics from the site systems when they are utilized during business hours. This data ultimately will provide a baseline by which you can measure performance. This data is useful for scaling out or up, depending on how the load increases on the site systems. CPU, memory, disk, and network throughput are the four areas to evaluate periodically.
