1. Characterizing I/O workload
To determine an
application's ideal storage system and disk quantity, it's important to
understand the type and volume of I/O the application will generate.
This section covers the different types of I/O, the metrics used to classify workload, and the methods used to measure or estimate values for those metrics.
1.1. OLTP vs. OLAP/DSS
When classifying the nature of I/O, two main terms are used: OLTP and OLAP. An example of an OLTP
database is one that stores data for a point-of-sales application,
typically consisting of a high percentage of simple, short transactions
from a large number of users. Such transactions generate what's referred to as random I/O, where the physical disks spend a significant percentage of their time seeking to different parts of the disk to service read and write requests.
In contrast, as shown in figure 1, an OLAP, or decision support system
(DSS), database is one that stores data for reporting applications that
usually have a smaller number of users generating much larger queries.
Such queries typically result in sequential
I/O, where the physical disks spend most of their time scanning a range
of data clustered together in the same part of the disk. Unlike OLTP
databases, OLAP databases typically have a much higher percentage of
read activity.
Note that even for classic
OLTP applications such as point-of-sales systems, actions like backups
and database consistency checks will still generate large amounts of
sequential I/O. For the purposes of I/O workload classification, we'll
consider the main I/O pattern only.
As you'll see a little
later, the difference between sequential and random I/O has an important
bearing on the storage system design.
1.2. I/O metrics
To design a storage system for a database application, in addition to knowing the type of workload it produces (OLTP or OLAP), we need to know the volume of workload, typically measured by the number of disk reads and writes per second.
The process of
obtaining or deriving these figures is determined by the state of the
application. If the application is an existing production system, the
figures can be easily obtained using Windows Performance Monitor.
Alternatively, if the system is yet to be commissioned, estimates are
derived using various methods. A common one is to use a test environment
to profile the reads and writes per transaction type, and then multiply
by the expected number of transactions per second per type.
Existing systems
We'll focus on the methods used to collect performance metrics and analyze system bottlenecks, including disk I/O bottlenecks. For the purposes of this
section, let's assume that disk I/O is determined to be a significant
bottleneck and we need to redesign the storage system to correct it. The
task, then, is to collect I/O metrics to assist in this process.
We can use the Windows Performance Monitor tool to collect the disk I/O metrics we need. For each logical disk volume—that is, the drive letter corresponding to a data or log drive—counters such as Disk Reads/sec and Disk Writes/sec, among others, can be collected.
Note that the averages of these values should be collected during the database's peak usage
period. Designing a system based on weekly averages that include long
periods of very low usage may result in the system being overwhelmed
during the most important period of the day.
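As a rough sketch of how these counters might be captured and averaged (the typeperf command in the comment, the CSV column names, and the D: drive are illustrative assumptions; match them to your own environment and export):

# Average the Disk Reads/sec and Disk Writes/sec counters captured during
# the peak period. Assumes a CSV exported from Performance Monitor or
# typeperf, for example:
#   typeperf "\LogicalDisk(D:)\Disk Reads/sec" "\LogicalDisk(D:)\Disk Writes/sec" -si 15 -sc 960 -o peak_io.csv
import csv

def peak_averages(csv_path, read_col, write_col):
    reads, writes = [], []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            try:
                reads.append(float(row[read_col]))
                writes.append(float(row[write_col]))
            except ValueError:
                continue  # skip blank or malformed samples
    return sum(reads) / len(reads), sum(writes) / len(writes)

# The column headers in a typeperf CSV are full counter paths; adjust to suit.
avg_reads, avg_writes = peak_averages(
    "peak_io.csv",
    r"\\SERVER\LogicalDisk(D:)\Disk Reads/sec",
    r"\\SERVER\LogicalDisk(D:)\Disk Writes/sec",
)
print(f"Peak-period averages: {avg_reads:.0f} reads/sec, {avg_writes:.0f} writes/sec")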
In the next section, we'll use these values to approximate the number of physical disks required for optimal I/O performance.
New systems
For a system not yet in
production, application I/O is estimated per transaction type in an
isolated test environment. Projections are then made based on the
estimated maximum number of expected transactions per second, with an
adjustment made for future growth.
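As a minimal sketch of that projection (the transaction mix, per-transaction I/O figures, expected rates, and growth factor below are illustrative assumptions rather than measured values):

# Project peak reads/writes per second for a system not yet in production.
# Per-transaction figures would come from profiling each transaction type
# in an isolated test environment; everything here is a placeholder.
transaction_profile = {
    # type: (reads per txn, writes per txn, expected txns/sec at peak)
    "new_order":    (10, 4, 50),
    "order_lookup": (8,  0, 120),
    "stock_update": (6,  3, 30),
}
growth_factor = 1.3  # allow 30 percent headroom for future growth

reads_per_sec = sum(r * tps for r, w, tps in transaction_profile.values()) * growth_factor
writes_per_sec = sum(w * tps for r, w, tps in transaction_profile.values()) * growth_factor

print(f"Estimated peak I/O: {reads_per_sec:.0f} reads/sec, {writes_per_sec:.0f} writes/sec")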
Armed with these
metrics, let's proceed to the next section, where we'll use them to
project the estimated number of disks required to design a
high-performance storage system capable of handling the application
load.
2. Determining the required number of disks and controllers
In the previous section, we
covered the process of measuring, or estimating, the number of database
disk reads and writes generated by an application per second. In this
section, we'll cover the formula used to estimate the number of disks
required to design a storage system capable of handling the expected
application I/O load.
Note that
the calculations presented in this section are geared toward
direct-attached storage (DAS) solutions using traditional RAID storage.
Configuring SAN-based Virtualized RAID (V-RAID) storage is a specialist
skill, and one that differs among various SAN solutions and vendors.
Therefore, use the calculations presented here as a rough guideline
only.
2.1. Calculating the number of disks required
In calculating the
number of disks required to support a given workload, we must know two
values: the required disk I/O per second (which is the sum of the reads
and writes that we looked at in the previous section) and the I/O operations per second (IOPS) capacity of the individual disks involved.
The IOPS value of a given
disk depends on many factors, such as the type of disk, spin speed, seek
time, and I/O type. While tools such as SQLIO can be used to measure a disk's IOPS capacity, an often-used average is 125 IOPS per disk for random I/O. Although commonly used server-class 15,000 RPM SCSI disks are capable of higher speeds, the 125 IOPS figure is a reasonable average for estimation purposes and builds a comfortable margin into the calculated disk number for handling peak, or higher than expected, load.
The process of
selecting RAID levels and calculating the required number of disks is
significantly different in a SAN compared to a traditional
direct-attached storage (DAS) solution. Configuring and monitoring
virtualized SAN storage is a specialist skill. Unless already skilled in
building SAN solutions, DBAs should insist on SAN vendor involvement in
the setup and configuration of storage for SQL Server deployments. The
big four SAN vendors (EMC, Hitachi, HP, and IBM) are all capable of
providing their own consultants, usually well versed in SQL Server
storage requirements, to set up and configure storage and related backup
solutions to maximize SAN investment.
Here's a commonly used formula for calculating required disk numbers:
Required # Disks = (Reads/sec + (Writes/sec * RAID adjuster)) / Disk IOPS
Dividing the sum of the reads per second and the RAID-adjusted writes per second by the per-disk IOPS yields the number of disks required to support the workload. As an example, let's assume we
need to design a RAID 10 storage system to support 1,200 reads per
second and 400 writes per second. Using our formula, the number of
required disks (assuming 125 IOPS per disk) can be calculated as
follows:
Required # disks = (1200 + (400 * 2)) / 125 = 16 DISKS
Note the doubling of
the writes per second figure (400 * 2); in this example, we're designing
a RAID 10 volume, and as you'll see in a moment, two physical writes
are required for each logical write—hence the adjustment to the writes
per second figure. Also note that this assumes the disk volume will be dedicated to the application's database; combining multiple databases on the same volume will obviously affect the calculations.
Although this is a
simple example, it highlights the important relationship between the
required throughput, the IOPS capacity of the disks, and the number of
disks required to support the workload.
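As a sketch, the formula and the RAID 10 example above can be expressed as follows (the 125 IOPS per disk and the RAID write adjuster of 2 are the same assumptions used in the text):

import math

def disks_required(reads_per_sec, writes_per_sec, raid_write_adjuster, disk_iops=125):
    # raid_write_adjuster reflects the physical writes generated by each
    # logical write (for example, 2 for RAID 10, where every write is mirrored).
    total_iops = reads_per_sec + writes_per_sec * raid_write_adjuster
    return math.ceil(total_iops / disk_iops)

# The example from the text: 1,200 reads/sec and 400 writes/sec on RAID 10.
print(disks_required(1200, 400, raid_write_adjuster=2))   # 16 disks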
Finally, a crucial aspect of disk configuration is the separation of the transaction log and data files. Unlike random
access within data files, transaction logs are written in a sequential
manner, so storing them on the same disk as data files will result in
reduced transaction throughput, with the disk heads moving between the
conflicting requirements of random and sequential I/O. In contrast,
storing the transaction log on a dedicated disk will enable the disk
heads to stay in position, writing sequentially, and therefore increase
the transaction throughput.
Once we've determined the number of disks required, we need to ensure the I/O bus has adequate bandwidth.
Here are some typical storage formats used by SQL Server systems:
ATA—Using
a parallel interface, ATA is one of the original implementations of
disk drive technologies for the personal computer. Also known as IDE or
parallel ATA, it integrates the disk controller on the disk itself and
uses ribbon-style cables for connection to the host.
SATA—In
widespread use today, SATA, or serial ATA, drives are an evolution of
the older parallel ATA drives. They offer numerous improvements such as
faster data transfer, thinner cables for better air flow, and a feature
known as Native Command Queuing (NCQ) whereby queued disk requests are
reordered to maximize the throughput. Compared to SCSI drives, SATA
drives offer much higher capacity per disk, with terabyte drives (or
higher) available today. The downside of very large SATA disk sizes is
the increased latency of disk requests, partially offset with NCQ.
SCSI—Generally
offering higher performance than SATA drives, albeit for a higher cost,
SCSI drives are commonly found in server-based RAID implementations and
high-end workstations. Paired with a SCSI controller card, up to 15
disks (depending on the SCSI version) can be connected to a server for
each channel on the controller card. Dual-channel cards enable 30 disks
to be connected per card, and multiple controller cards can be installed
in a server, allowing a large number of disks to be directly attached
to a server. It's increasingly common for organizations to use a mixture
of both SCSI drives for performance-sensitive applications and SATA
drives for applications requiring high amounts of storage. An example of
this for a database application is to use SCSI drives for storing the
database and SATA drives for storing online disk backups.
SAS—Serial
Attached SCSI (SAS) disks connect directly to a SAS port, unlike
traditional SCSI disks, which share a common bus. Borrowing from aspects
of Fibre Channel technology, SAS was designed to break past the current
performance barrier of the existing Ultra320 SCSI technology, and
offers numerous advantages owing to its smaller form factor and backward
compatibility with SATA disks. As a result, SAS drives are growing in
popularity as an alternative to SCSI.
Fibre Channel—Fibre
Channel allows high-speed, serial duplex communications between storage
systems and server hosts. Typically found on SANs, Fibre Channel offers
more flexibility than a SCSI bus, with support for more physical disks,
more connected servers, and longer cable lengths.
Solid-state disks—Used
today primarily in laptops and consumer devices, solid-state disks
(SSDs) are gaining momentum in the desktop and server space. As the name
suggests, SSDs use solid-state memory to persist data in contrast to
rotating platters in a conventional hard disk. With no moving parts,
SSDs are more robust and promise near-zero seek time, high performance,
and low power consumption.
2.2. Bus bandwidth
When designing a
storage system with many physical disks to support a large number of
reads and writes, we must consider the ability of the I/O bus to handle
the throughput.
As you learned in the previous section, typical OLTP applications generate random I/O, with the disks spending a significant percentage of their time seeking and with disk latency (the time between disk request and response) an important factor. In contrast, OLAP applications spend a much higher percentage of time performing sequential I/O, so the throughput is greater and the bandwidth requirements are higher.
In a direct-attached SCSI disk enclosure, the typical bus used today is Ultra320,
with a maximum throughput of 320MB/second per channel. Alternatively, a
2 Gigabit Fibre Channel system offers approximately 400MB/second
throughput in full duplex mode.
In our example of 2,000
disk transfers per second, assuming these were for an OLTP application
with random I/O and 8K I/O transfers (the SQL Server transfer size for
random I/O), the bandwidth requirements can be calculated as 2,000 times
8K, which is a total of 16MB/second, well within the capabilities of
either Ultra320 SCSI or 2 Gigabit Fibre Channel.
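A quick back-of-the-envelope check of that figure (the 8K transfer size and the bus throughput values are the same assumptions used above):

# Rough bandwidth check for the example workload: 2,000 random 8K transfers/sec.
transfers_per_sec = 2000
transfer_size_kb = 8                      # SQL Server transfer size for random I/O
required_mb_per_sec = transfers_per_sec * transfer_size_kb / 1024   # ~16MB/sec

bus_capacity_mb_per_sec = {
    "Ultra320 SCSI": 320,
    "2Gb Fibre Channel (full duplex)": 400,
}
for bus, capacity in bus_capacity_mb_per_sec.items():
    print(f"{bus}: need {required_mb_per_sec:.0f} of {capacity} MB/sec "
          f"({required_mb_per_sec / capacity:.0%} utilized)")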
Should the
bandwidth requirements exceed the maximum throughput, additional disk
controllers and/or channels will be required to support the load. OLAP
applications typically have much higher throughput requirements, and
therefore have a lower disk-to-bus ratio, which means more
controllers/channels for the same number of disks.
You'll note that
we haven't addressed storage capacity requirements yet. This is a
deliberate decision to ensure the storage system is designed for
throughput and performance as the highest priority.
2.3. A note on capacity
A common mistake
made when designing storage for SQL Server databases is to base the
design on capacity requirements alone. A guiding principle in designing
high-performance storage solutions for SQL Server is to stripe data
across a large number of dedicated disks and multiple controllers. The
resultant performance is much greater than what you'd achieve with
fewer, higher-capacity disks. Storage solutions designed in this manner
usually exceed the capacity requirements as a consequence of the
performance-centric approach.
In our previous example
(where we calculated the need for 16 disks), assuming we use 73GB disks,
we have a total available capacity of 1.1TB. Usable space, after RAID
10 is implemented, would come down to around 500GB.
If the projected capacity requirements for our database only total 50GB, then so be it.
We end up with 10 percent storage utilization as a consequence of a
performance-centric design. In contrast, a design that was capacity-centric
would probably choose a single 73GB disk, or two disks to provide
redundancy. What are the consequences of this for our example? Assuming
125 IOPS per disk, we'd experience extreme disk bottlenecks with massive
disk queues handling close to 2,000 required IOPS!
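Continuing the example in code (the 73GB disk size, RAID 10's 50 percent usable capacity, and 125 IOPS per disk are the same assumptions used above; formatting overhead would reduce usable space a little further):

# Capacity versus performance for the 16-disk RAID 10 example.
disks, disk_size_gb, disk_iops = 16, 73, 125
database_size_gb = 50                     # projected capacity requirement

raw_capacity_gb = disks * disk_size_gb    # roughly 1.1TB raw
usable_capacity_gb = raw_capacity_gb / 2  # RAID 10 mirrors every disk

print(f"Usable capacity: {usable_capacity_gb:.0f}GB, "
      f"utilization: {database_size_gb / usable_capacity_gb:.0%}")
print(f"Aggregate capacity: {disks * disk_iops} IOPS versus {2 * disk_iops} IOPS "
      f"for a two-disk mirror")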
While low
utilization levels will probably be frowned upon, this is the price of
performance, and a much better outcome than constantly dealing with disk
bottlenecks. A quick look at any of the server specifications used in
setting performance records for the Transaction Processing Performance
Council (tpc.org) tests will confirm a low-utilization, high-disk-stripe
approach like the one I described.
Finally,
placing capacity as a secondary priority behind performance doesn't mean
we can ignore it. Sufficient work should be carried out to estimate
both the initial and future storage requirements. Running out of disk
space at 3 a.m. isn't something I recommend!
In this section, I've
made a number of references to various RAID levels used to provide disk
fault tolerance. In the next section, we'll take a closer look at the
various RAID options and their pros and cons for use in a SQL Server
environment.