Don't make the mistake of thinking that because SANs
are big and expensive you'll be guaranteed to get good I/O performance.
SAN storage design goals are often in conflict with those of SQL Server.
SANs are effective at maximizing disk utilization by sharing a large
central store of disks between many servers. In contrast, SQL Server
benefits from striping over dedicated disks, with an emphasis on disk
quantity rather than utilization.
This section addresses some
of the common issues that DBAs face in SAN-based environments,
including the relationship with the SAN administrator, LUN
configuration, performance tuning, and disaster-recovery options.
1. The SAN administrator
If you're like me, you
like being in control. You like having operating system administrator
privileges to all database servers, despite best practice, and you like
having direct control over disk configuration. In environments with
multiple servers and different applications sharing SAN storage, such
control is unlikely, particularly when the DBA and SAN administration
roles are separated. In such sites, it's not uncommon for the SAN
administrator to be dismissive of the DBA's concerns about SAN
configuration; often with a let the SAN take care of it attitude.
Like any other storage
system, SAN disks that are presented to SQL Server need to be configured
in line with the storage practices we've already covered. Given the
complex, shared nature of SANs and the difficulty of changing a design
after deployment, it's critical for you to become involved in the SAN
configuration as early as possible and present storage requirements from
a DBA's perspective.
I've been fortunate to
work in environments with highly skilled SAN administrators who were
receptive to the unique storage requirements of SQL Server. Through our
good working relationship, we were able to combine our skill sets to
deliver reliable, high-performance SAN storage for SQL Server.
Unfortunately, such outcomes aren't always achieved; the most common
problem is LUN configuration, which we'll look at next.
2. LUN configuration
A LUN is a logical unit of
SAN disk created by the SAN administrator and presented to an attached
server's operating system. The server is unaware of the physical makeup
of the disks involved in the LUN, and sees it as a single locally
attached disk.
As shown in figure 1,
each physical disk in the SAN can be carved up into parts and used in
the creation of separate LUNs. As a result, LUNs from many servers can
all be using different parts of the same physical disk.
When
troubleshooting performance problems involving SAN storage, I've found
it useful to ask the SAN administrator a series of questions involving
the makeup of the LUNs:
How many individual physical disks are included in the SQL Server LUNs?
Remembering the principle of striping across many physical disks, if a
LUN consists of a small number of physical disks, then performance may
be less than ideal.
What other servers are sharing the physical disks in the LUNs, and what is their I/O profile? If many servers' LUNs share the same physical disks, performance may be reduced.
This is particularly important for transaction log LUNs. Transaction
log I/O is sequential in nature, and dedicated physical disks mean the
disk heads are able to stay in position, with writes proceeding in a
sequential manner. This is obviously not possible if the transaction log
LUNs are created on disks containing other LUNs. For SQL Server
applications with high transaction log rates, this can have a large
impact on transaction response time, leading to decreased performance
and throughput.
What are the RAID levels of the LUNs? RAID 5 has a disk write overhead and is therefore not an ideal
choice for SQL Server applications with a high percentage of disk
writes. Given the SAN goal of increased disk utilization, RAID 5 is
often chosen as the default RAID level. The SAN administrator should be
able to tell you the current RAID level and the different levels
supported in the SAN.
Are my LUNs zoned?
Zoning is the process of matching disks and LUNs to particular servers
via storage array ports, increasing security and I/O bandwidth as a
result. In SANs with thousands of LUNs and attached servers, this is
particularly important in guaranteeing minimum service levels.
The answers to these
questions (or lack thereof in some cases) quickly establish the SAN
administrator's degree of skill and knowledge of the SAN in regard to
SQL Server's unique storage requirements. A correctly configured SAN is a
vital component in any performance-tuning process, our next subject.
3. Performance tuning
Like any other
storage system, performance testing a SAN before operational
commissioning is critical in establishing a degree of confidence in the
configuration. Unlike DAS, SANs have a number of specific (often
vendor-specific) configuration settings that require specialist
knowledge to derive maximum benefit.
Storage
virtualization and large amounts of cache often obscure disk performance
from tools such as Windows Performance Monitor. Therefore, it's wise to
involve storage administrators and/or SAN vendors with vendor-supplied
SAN monitoring tools and techniques to measure actual performance during any performance monitoring and baseline exercise.
Two commonly tuned
SAN settings are the storage cache and the HBA queue depth setting. The
SAN cache size is typically much larger than direct-attached cache, with
more options for adjusting the read/write percentage. Take care when
performance testing SANs containing large disk cache: make sure the
tests are large enough, and run for long enough, to exhaust the disk
cache and deliver real I/O performance metrics, not just those satisfied
from cache. Like standard disk controller cache, SAN cache should be
optimized for writes rather than reads.
HBA cards connecting the server to the SAN have a setting called Queue Depth
that governs the number of I/O requests that can be queued at a given
time. The default HBA queue depth setting is typically between 8 and 32.
For SQL Server deployments, a queue depth value of 32 or greater often
results in increased I/O performance, although changes to this setting
should be confirmed with the storage vendor and validated with
performance tests that confirm the effect of the change.
4. Disaster-recovery options
One of the main
benefits of SAN storage from a DBA's perspective is the enhanced
disaster-recovery options provided. SANs such as the EMC Symmetrix
provide features such as block-level disk replication to a remote data
center and near-instant, split mirror backup and recovery using SQL
Server's VDI.
You should investigate
such features as part of database disaster-recovery planning. While
they may incur additional costs due to software licensing and increased
storage requirements, they are important features to consider as part of
realizing the full investment in the SAN.
SQL Server is capable
of working with most modern SAN solutions from vendors such as EMC,
Hitachi, IBM, and HP. Before purchasing, you should ensure the vendor
guarantees the SAN's compatibility with SQL Server's specific I/O
requirements such as write ordering. These I/O requirements are well
known to SAN vendors and available for download from the Microsoft
website. More information is available at http://support.microsoft.com/default.aspx/kb/967576.