4. MEASURING PERFORMANCE
The single most important performance
metric is latency. Latency is a measure of system health and the
availability of system resources. Latency is governed by queuing
theory, a mathematical study of lines, or queues. An important
contribution to queuing theory, now known as Little’s Law, was
introduced in a proof submitted by John D. C. Little (http://or.journal.informs.org/content/9/3/383) in 1961.
Put simply, Little’s Law states that given a
steady-state system, as capacity reaches maximum performance, response
time approaches infinity. To understand the power of Little’s Law,
consider the typical grocery store. If the store only opens one cash
register and ten people are waiting in line, then you are going to wait
longer to pay than if the store opened five or ten cashiers.
Storage systems are directly analogous to a
grocery store checkout line. Each component has a certain performance
maximum. Driving the system toward maximum performance will increase
latency. We have found that most users and application owners directly
correlate latency with failure. For example, it won’t matter that a
payroll system is online if it can’t process transactions fast enough
to get everyone’s paycheck out on time.
You test I/O performance using several tools that
are described later. You test storage using a logarithmic scale,
starting with one I/O, moving to two, then four, then eight, and
finally peaking at 256 I/Os that are all sent to storage in parallel
(see Figure 1). As it turns out, this test perfectly demonstrates Little’s Law by defining how storage operates.
As you can see in Figure 11
the storage response time remains less than our goal of 10 milliseconds
through eight outstanding I/Os. As we increase the workload to 16
outstanding I/Os, the latency increases to 20 milliseconds. We can
determine from this test that our configuration is optimal when we
issue between 8 and 16 I/Os. This is called the knee of the curve. The system is capable of a lot more work, but the latency is higher than our tolerance.
NOTE
The goal of
less than 10 milliseconds of latency is no accident. SQL Server best
practices call for no more than 20 milliseconds of delay. If you
implement synchronous data replication with AlwaysOn Availability
Groups, you can’t tolerate more than 10 milliseconds of delay; and many
applications are even more sensitive!
4.1 Storage Performance Counters
Windows Performance Monitor (perfmon)
allows Windows Server users to capture storage performance metrics. For
the purposes of storage monitoring, you utilize the LogicalDisk
performance monitor object. Both logical and physical disk counters
deliver storage performance metrics. The logical disk counters show the
performance of a specific partition, while the physical disk counters
cover the entire LUN (a Logical Unit Number is a term that describes a
storage volume that is hosted by a storage controller). Table 1 shows a list of the available Windows storage performance counters.
TABLE 1: Windows Storage Performance Counters
LOGICALDISK PERFMON OBJECT |
Average Disk sec/Read |
Average Disk sec/Write |
Disk Reads/sec |
Disk Writes/sec |
Disk Read Bytes/sec |
Disk Write Bytes/sec |
The Average Disk Sec/Read and Write counters
measure the time it takes for an input output (I/O) operation to be
sent from the server to the storage system and back. This latency
measure is the single biggest indicator of I/O system health. Reads and
writes are treated separately because most storage systems perform one
operation faster than the other. If you are using a storage array with
a battery-backed cache, it will often write in just a few milliseconds,
whereas a random read will take longer.
The latency counters are measured in
milliseconds. A reading of 0.001 is one millisecond, 0.010 is 10
milliseconds, and .100 is 100 milliseconds. SQL Server best practices
call for latency that is under 20 milliseconds. This is not a
hard-and-fast rule, however, as many applications will not tolerate
latency that exceeds several milliseconds.
It is important to understand the underlying
hardware configuration, application, and workload. In some cases, such
as a SQL Server standard backup, large I/O sizes will drastically
increase latency. If you change the backup I/O size to 8MB, the latency
will increase, but you can still achieve a lot of work.
If you are implementing a specialized system,
such as a SQL Server Fast Track Data Warehouse, you will actively
configure the data files so they issue sequential I/O. Be sure to test
your specific configuration so you can properly interpret the results.
The Disk Reads and Writes per second counters
list how many I/Os are generated each second (Storage administrators
often refer to this as IOPS). Disk Read and Write Bytes per second
demonstrate the throughput of your storage system. To calculate average
I/O sizes, simply divide bytes per second by the number of operations
per second.
Knowing the size of the I/O can reflect
application behavior. When performing highly random I/O access, SQL
Server will write 8K data pages and read 64KB data extents from the
data files. Performing sequential operations, such as a table scan,
will generate I/O that is dynamically sized from 8K to 512KB. Dynamic
I/O sizing, also known as Read-Ahead, is one of the hidden gems of SQL
Server. Increasing I/O size decreases the number of I/Os and increases
efficiency.
4.2 Disk Drive Performance
A disk drive (see Figure 12)
is made up of an external logic board and an internal hard drive
assembly. The logic board provides connectivity between the disk and
the host. Each drive interface supports one of many available
communications protocols. Modern interfaces use a high-speed serial
connection. The interfaces that are most commonly used for database
applications are SATA (Serial Advanced Technology Attachment), FC
(Fibre Channel), and SAS (Serial Attached SCSI).
The hard drive assembly is serviceable only
inside a high-technology clean room. Opening the cover on a hard disk
drive will void the warranty. The drive platter rotates around a
spindle and is powered by a spindle motor. A drive is made up of
several platters that are stacked. Each platter is double-sided and
coated with a magnetic oxide.
Data is physically read and written by a hard
drive head. Each drive platter has a dedicated drive head. An actuator
arm houses all the drive heads, and a magnetic actuator moves the arm.
You can think of a hard drive as a record player. The platter spins and
the head reads and writes data. Unlike a record player, however, the
disk drive head can move back and forth. The head actually rides just
above the disk surface on a cushion of high-pressure air that is
created when the platter spins at high speed.
SATA disk drives provide commodity storage. They
offer much larger capacity than FC or SAS drives. At the time of this
writing, SATA drives are available with a capacity of three terabytes
(3TB). SATA drives spin at lower speeds — 5,400 to 7,200 RPM. They are
sold for both the consumer and the enterprise markets. Enterprise
drives are designed for more continuous use and higher reliability.
Both FC and SAS drives are considered
enterprise-class disk drives. They are available in 7,200, 10,000, and
15,000 RPM models. With the exception of Nearline SAS (NL-SAS) drives,
these disk drives are generally lower in capacity than SATA disk
drives. SAS 6GB/s drives are displacing Fibre Channel 4GB/s drives in
the marketplace.
Modern SAS drives are manufactured in a 2.52 form factor, unlike the traditional 3.52
form factor of Fibre Channel and SATA drives. This smaller drive
enables more disk drives to be housed in a given space. NL-SAS drives
offer a high-reliability enterprise SATA drive with a SAS interface.
The logic board governs how the disk operates.
Each disk contains buffers, and some disk drives contain cache. For the
proper operation of SQL Server write-ahead logging, volatile cache must
be bypassed for write operations. Most array vendors will guarantee
cached data with battery backing. When using disks directly in a JBOD,
it is important to ensure that they meet SQL Server reliability
requirements.
A disk drive is made of both electronic and
mechanical components. When data is read sequentially, the drive can
read it as fast as the drive spins. When the data needs to be accessed
out of order, the head needs to move to the appropriate track. Head
movement, or seeking, is not instantaneous.
A sequential read consists of a head movement
followed by the sequential reading or writing of data. When data is not
sequentially located, the disk drive executes a series of random
operations. Random access patterns are much slower because the head
needs to physically move between tracks.
Drive manufacturers provide a measurement called maximum seek time
that reflects how long it will take for the drive head to move from the
innermost tracks to the outermost tracks. The manufacturer also
provides what is known as average seek time, the average time it will take to move the drive head to any location on the disk.
Disk Drive Latency
You can calculate the time it takes to
move the head to a location on the disk mathematically. The number of
times a disk can rotate in a millisecond limits the amount of data the
drive can generate, a limitation called rotational latency. To calculate how many random I/Os a hard disk can perform, the following equation is used:
This equation works by normalizing all
calculations to milliseconds. To find IOPS, you start by dividing
60,000 (because there are 60,000 milliseconds in a minute) by the hard
disk rotations per minute. Dividing the revolutions per millisecond by
2 accounts for the fact that the head needs to exit the first track and
enter the second track at specific points, requiring about two
rotations. You add the revolutions result to the seek time and convert
the result back to seconds by dividing 1,000 by this sum.
For example, consider a 10,000-RPM disk. This
drive will rotate about 6 times per second. You account for the fact
that it will take your drive two rotations to move between tracks and
then add the seek time. This drive has a read seek time of 4
milliseconds. Dividing 1,000 by 7 results in 143 I/Os per second:
We have tested many drives over the years and
this formula has proven reliable. Remember that each individual model
of disk drive varies. Having said that, you can calculate IOPS for the
most frequently used disk drives:
If you need more IOPS, then simply add more
disks. If one disk will perform 150 IOPS, then two will perform 300.
When you need 10,000 IOPS, you only need 54 physical disks. If you
actually want to keep the data when one of the drives fails, then you
need 108 disks. Those 108 disks will provide 10,000 IOPS when the
database needs to read, but only 5,000 IOPS for writes. RAID causes
overhead for both space and performance. RAID 1+0 is fairly easy to
calculate. You will receive N number of reads and N divided by 2 writes. RAID 5 is much trickier to calculate.
To generate the parity information, a RAID
controller reads relevant data and performs an XOR calculation. Let’s
take a small RAID set of four disks as an example. One write operation
will generate two writes and two reads. We are assuming that there are
two existing data chunks and we are writing the third. We need to write
the resultant parity and the new data block.
RAID controllers vary greatly in design, but
generally speaking, they utilize their internal cache to assist in the
generation of parity information. Typically, Raid 5 enables N number of reads and N divided by 4 writes.
RAID 6 protects against double disk failure and
therefore generates double the parity. An 8-disk RAID set consists of
two parity chunks and six data chunks. You need to write the new data
chunk and two parity chunks, so you know that you have three writes.
You need to read the other five data chunks, so you are looking at
eight operations to complete the RAID 6 write. Luckily, most RAID
controllers can optimize this process into three reads and three writes.
Table 2 provides a guide to calculating common RAID overhead. Please remember that each system is different and your mileage may vary.
TABLE 2: RAID Overhead
RAID TYPE |
READ |
WRITE |
0 |
N |
N |
1+0 |
N |
N ÷ 2 |
5 |
N |
N ÷ 4 |
6 |
N |
N ÷ 6 |