Resolving disk I/O bottlenecks
With the high-speed disks available today, a system’s hard
disks are rarely the primary reason for a bottleneck. It is more
likely that a system is having to do a lot of disk reads and writes
because there isn’t enough physical memory available and the system
has to page to disk. Because reading from and writing to disk is
much slower than reading and writing memory, excessive paging can
degrade the server’s overall performance. To reduce the amount of
disk activity, you want the system to manage memory as
efficiently as possible and page to disk only when necessary.
That said, you can do several things with a system’s hard
disks to improve performance. If the system has faster drives than the
ones used for the paging file, you might consider moving the paging
file to those disks. If one or more drives are doing most of the
work while other drives sit mostly idle, you might be able to
improve performance by balancing the load across the drives more
evenly.
To help you better gauge disk I/O activity, use the following counters:
- PhysicalDisk\%Disk Time: Records the percentage of time the
physical disk is busy. Track this value for all hard disk drives on
the system in conjunction with Processor\%Processor Time and
Network Interface\Bytes Total/Sec. If the %Disk Time value is high
and the processor and network connection values aren’t high, the
system’s hard disk drives might be creating a bottleneck. You might
be able to improve performance by balancing the load across the
drives more efficiently or by adding drives and configuring the
system so that they are used.
Note
Redundant array of independent disks (RAID) devices can
cause the PhysicalDisk\%Disk Time value to exceed 100 percent.
For this reason, don’t rely on PhysicalDisk\%Disk Time for
RAID devices. Instead, use PhysicalDisk\Current Disk Queue Length.
- PhysicalDisk\Current Disk Queue Length: Records the number of
system requests that are waiting for disk access. A high value
indicates that disk waits are affecting system performance. In
general, you want there to be very few waiting requests.
Note
Physical disk queue lengths are relative to the number
of physical disks on the system and proportional to the length
of the queue minus the number of drives. For example, if a
system has two drives and there are 6 waiting requests
(6 - 2 = 4), that is a proportionally large number of queued
requests; but if a system has eight drives and there are 10
waiting requests (10 - 8 = 2), that is a proportionally small
number of queued requests.
- PhysicalDisk\Avg. Disk Write Queue Length: Records the average
number of write requests that are waiting to be processed.
- PhysicalDisk\Avg. Disk Read Queue Length: Records the average
number of read requests that are waiting to be processed.
- PhysicalDisk\Disk Writes/Sec: Records the number of disk writes
per second. It is an indicator of how much disk I/O activity there
is. By tracking the number of writes per second and the size of the
write queue, you can determine how write operations are affecting
disk performance. If lots of write operations are queuing and you
are using RAID 5, it could be an indicator that you would get
better performance by using RAID 1. Remember that by using RAID 5
you typically get better read performance than with RAID 1. So,
there’s a tradeoff to be made by using either RAID configuration.
- PhysicalDisk\Disk Reads/Sec: Records the number of disk reads per
second. It is an indicator of how much disk I/O activity there is.
By tracking the number of reads per second and the size of the read
queue, you can determine how read operations are affecting disk
performance. If lots of read operations are queuing and you are
using RAID 1, it could be an indicator that you would get better
performance by using RAID 5. Remember that by using RAID 1 you
typically get better write performance than with RAID 5. So, as
mentioned, there’s a tradeoff to be made by using either RAID
configuration.
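The disk checks described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not a monitoring tool: the names DiskSample, excess_queue_per_drive, and disk_is_suspect are hypothetical, the counter values are assumed to have been collected already, and the thresholds (80 percent busy, 50 percent for the other resources, one excess request per drive) are placeholder assumptions rather than values given in this chapter.

```python
from dataclasses import dataclass


@dataclass
class DiskSample:
    # Each field holds one reading of the counter named in its comment.
    disk_time_pct: float       # PhysicalDisk\%Disk Time
    processor_time_pct: float  # Processor\%Processor Time
    network_use_pct: float     # share of link capacity in use, derived separately
    queue_length: float        # PhysicalDisk\Current Disk Queue Length
    drive_count: int           # number of physical disks on the system
    is_raid: bool = False      # %Disk Time can exceed 100 on RAID devices


def excess_queue_per_drive(queue_length: float, drive_count: int) -> float:
    # Queue length minus the number of drives, floored at zero, per drive.
    if drive_count < 1:
        raise ValueError("drive_count must be at least 1")
    return max(queue_length - drive_count, 0) / drive_count


def disk_is_suspect(sample: DiskSample,
                    busy_pct: float = 80.0,    # assumed threshold
                    quiet_pct: float = 50.0,   # assumed threshold
                    queue_limit: float = 1.0   # assumed threshold
                    ) -> bool:
    queued = excess_queue_per_drive(sample.queue_length,
                                    sample.drive_count) >= queue_limit
    if sample.is_raid:
        # Don't rely on %Disk Time for RAID; use the queue length alone.
        return queued
    others_quiet = (sample.processor_time_pct < quiet_pct
                    and sample.network_use_pct < quiet_pct)
    return (sample.disk_time_pct >= busy_pct and others_quiet) or queued


# The two cases from the note on queue lengths:
print(excess_queue_per_drive(6, 2))   # two drives, 6 waiting requests
print(excess_queue_per_drive(10, 8))  # eight drives, 10 waiting requests
```

With these assumed defaults, the two-drive system with 6 waiting requests is flagged and the eight-drive system with 10 waiting requests is not, which matches the proportional reasoning in the note. Tune the thresholds against a baseline from your own servers.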
Resolving network bottlenecks
The network that connects your computers is critically
important. Its responsiveness, or lack thereof, weighs heavily on the
way users perceive the responsiveness of their computers and any
computers to which they connect. It doesn’t matter how fast their
computers are or how fast your servers are. If there’s a big delay
(and big network delays are measured in tens of milliseconds)
between when a request is made and the time it’s received, users
might think systems are slow or nonresponsive.
Unfortunately, in most cases, the delay (latency) users experience is beyond your control. It’s
a function of the type of connection the user has and the route the
request takes to your server. The total capacity of your server to handle requests and the
amount of bandwidth available to your servers are factors you can
control, however. Network capacity is a function of the network
cards and interfaces configured on the servers. Network bandwidth availability is a function of your
organization’s network infrastructure and how much traffic is on it
when a request is made.
Counters you can use to check network activity and look for bottlenecks include the following:
- Network Interface\Bytes Total/Sec: Records the rate at which
bytes are sent and received over a network adapter. Track this
value separately for each network adapter configured on the system.
If the Bytes Total/Sec for a particular adapter is substantially
lower than what you’d expect given the speed of the network and the
speed of the network card, you might want to check the network card
configuration. Check to see whether the link is set for half duplex
or full duplex. In most cases, you’ll want to use full duplex.
- Network Interface\Current Bandwidth: Estimates the current
bandwidth for the selected network adapter in bits per second.
Track this value separately for each network adapter configured on
the system. Most servers use 100-Mbps, 1-Gbps, or 10-Gbps network
cards, which can be configured in many ways. Someone might have
configured a 1-Gbps card for 100 Mbps. If that is the case, the
current bandwidth might be off by a factor of 10.
- Network Interface\Bytes Received/Sec: Records the rate at which
bytes are received over a network adapter. Track this value
separately for each network adapter configured on the system.
- Network Interface\Bytes Sent/Sec: Records the rate at which bytes
are sent over a network adapter. Track this value separately for
each network adapter configured on the system.
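Because Bytes Total/Sec is reported in bytes and Current Bandwidth in bits per second, comparing the two requires a unit conversion. The following Python sketch shows one way to do that arithmetic and to spot a card running below its rated speed. The function names are hypothetical, and the rated speed is assumed to be something you supply from the hardware specification, because no counter discussed here reports it.

```python
def link_utilization_pct(bytes_total_per_sec: float,
                         current_bandwidth_bps: float) -> float:
    # Convert bytes to bits (x 8), then express as a percentage of the
    # bandwidth reported by Network Interface\Current Bandwidth.
    if current_bandwidth_bps <= 0:
        raise ValueError("current_bandwidth_bps must be positive")
    return bytes_total_per_sec * 8 / current_bandwidth_bps * 100


def speed_mismatch_factor(rated_bps: float,
                          current_bandwidth_bps: float) -> float:
    # How many times slower the configured link is than the card's
    # rated speed; 1.0 means the card runs at its rated speed.
    if current_bandwidth_bps <= 0:
        raise ValueError("current_bandwidth_bps must be positive")
    return rated_bps / current_bandwidth_bps


# A 1-Gbps card configured for 100 Mbps, moving 6,250,000 bytes per second:
print(link_utilization_pct(6_250_000, 100_000_000))      # 50.0
print(speed_mismatch_factor(1_000_000_000, 100_000_000))  # 10.0
```

Keep in mind that Bytes Total/Sec combines sent and received traffic, so on a full-duplex link the computed percentage can rise above 100.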
You might be able to improve network performance by installing
multiple network adapters and teaming the network
cards. You configure NIC teaming using Server Manager by selecting Local
Server in the left pane and then tapping or clicking the link
provided for NIC teaming. You can then create and configure NIC
teams.