Exchange Server 2010 : Storage Availability - Direct Attached Storage, Storage Area Networks


Many administrators and IT professionals immediately think of storage designs when they hear the word availability. While storage is a critical part of ensuring the overall service availability of an Exchange organization, storage design affects far more than just availability; it directly affects performance, reliability, and scalability.

1. An Overview of Exchange Storage

In medium-sized and large organizations, the Exchange administrator is usually not also responsible for storage. Many medium-sized and large organizations use specialized storage area networks (SANs) that require additional training to master. Storage is a massive topic, but we feel it is important that you at least be able to speak the language of storage.

From the very beginning, messaging systems have had a give-and-take relationship with the underlying storage system. Even on systems that aren't designed to offer long-term storage for email (such as ISP systems that offer only POP3 access), email creates demands on storage:

  • The message transfer agent (MTA) components must have space to queue messages that cannot be immediately transmitted to the remote system.

  • The mail delivery agent (MDA) component must be able to store incoming messages that have been delivered to a mailbox until users can retrieve them.

  • The message store, in systems like Exchange, permits users to keep a copy of their mailbox data on central servers.

  • As the server accepts, transmits, and processes email, it keeps logs with varying levels of detail so administrators can troubleshoot and audit activities.

Direct attached storage (DAS) is the most common type of storage overall. DAS disks are either internal to the server or attached directly via cable. Just about every server, except for some high-end varieties such as blade systems that boot from SAN, uses DAS at some level; typically, at least the boot and operating system volumes are on some DAS configuration. DAS, however, has drawbacks for Exchange storage: it doesn't necessarily scale as well as other options for either capacity or performance. Further, organizations that have invested significant amounts of money in their SANs may still require that Exchange use the SAN instead of DAS.

To solve these problems, people looked to network attached storage (NAS) devices as one potential solution. These machines, essentially giant file servers, sit on the network and share their disk storage. They range in price and configuration from small plug-in devices with fixed capacity to large installations with more configuration options than most luxury cars (and a price tag to match). Companies that bought them were already using them to replace file servers, web server storage, and SQL Server storage; why not Exchange?

For many years, Microsoft didn't support moving Exchange storage to NAS devices and vociferously argued against the idea, but it ultimately supported NAS devices for Exchange 2003.

Apparently, despite all the people asking for NAS support in Exchange 2003, it didn't turn out to be a popular option; NAS devices were no longer supported for Exchange Server 2007 and beyond. Instead, the focus shifted to reducing overall I/O requirements so that DAS configurations became practical for small to midsized organizations. Exchange 2007 moved to a 64-bit architecture to remove the memory management bottlenecks of the 32-bit Windows kernel, allowing the Exchange Information Store to use more memory for intelligent mailbox data caching and thus reduce disk I/O. Exchange 2010 in turn makes aggressive changes to the on-disk mailbox database structures, such as a new database schema that allows pages to be written sequentially to the end of the database file rather than randomly throughout it. The schema updates improve indexing and client performance, allowing common tasks such as updating folder views to complete more quickly while requiring fewer disk reads and writes. Together, these changes continue to drive mailbox I/O down.

The premise behind a SAN is to move disks to dedicated storage units that can handle all the advanced features you need: high-end RAID configurations, hot-swap replacement, on-the-fly reconfiguration, rapid disk snapshots, tight integration with backup and restore solutions, and more. This consolidates the overhead of managing storage, otherwise spread across dozens of servers and applications (and their associated staff), into a single set of personnel. Dedicated network links then connect these storage silos to the appropriate application servers. Yet this consolidation of storage can also be a serious pitfall, since Exchange is usually not the only application placed on the SAN. Applications such as SharePoint, SQL Server, archiving, and file services may all share the same aggregated set of spindles and cause disk contention.

2. Direct Attached Storage

When early versions of Exchange Server came on the market, DAS was simply the way you did things. As used for legacy Exchange storage, DAS historically presented two main problems: performance and capacity. As mailbox databases got larger and traffic levels rose, people soon wanted alternatives; DAS storage under Exchange 2000 and Exchange 2003 required a lot of disks, because Exchange's I/O profile was constrained by the 32-bit architecture that Windows provided at the time. Quite simply, with a fixed amount of RAM available for caching, the more simultaneous users there were on an Exchange 2003 server, the less cache was available per user.

To get more scalability on logical disks that support Exchange databases, you can always try adding more disks to the server. This gives you a configuration known as Just a Bunch of Disks (JBOD).

Although JBOD can usually give you the raw disk storage capacity you need, it has three flaws that render it unsuitable for all but the smallest legacy Exchange deployments:


JBOD forces you to partition your data

Because each disk has a finite capacity, you can't store a data set larger than that capacity on a single disk. For example, if you have four 250 GB drives, even though you have approximately one terabyte of storage in total, you have to break it up into four separate 250 GB partitions. Historically, this has forced some interesting design decisions in messaging systems that rely on file system–based storage.


JBOD offers no performance benefits

Each disk is responsible for only one chunk of storage, so if that disk is already in use, subsequent I/O requests will have to wait for it to free up before they can go through. A single disk can thus become a bottleneck for the system, which can slow down mail for all your users (not just those whose mailboxes are stored on the affected disk).


JBOD offers no redundancy

If one of your disks dies, you're out of luck unless you can restore that data from backup. True, you haven't lost all your data, but the one-quarter of your users who have just lost their email are not likely to be comforted by that observation.

Several of the Exchange 2010 design goals have focused on building in the necessary features to work around these issues and make a DAS JBOD deployment a realistic option for more organizations. However, legacy versions of Exchange contain no mechanisms to work around these issues. Luckily, some bright people came up with a great generic answer to JBOD that also works well for legacy Exchange: the Redundant Array of Inexpensive Disks (RAID).

The basic premise behind RAID is to group the JBOD disks together in various configurations with a dedicated disk controller to handle the specific disk operations, allowing the computer (and applications) to see the entire collection of drives and controller as one very large disk device. These collections of disks are known as arrays; the arrays are presented to the operating system, partitioned, and formatted as if they were just regular disks. The common types of RAID configurations are shown in Table 1.

Table 1. RAID Configurations

None (Concatenated drives)
Two or more disks are joined together in a contiguous data space. As one disk in the array is filled up, the data carries over to the next disk. Though this solves the capacity problem and is easy to implement, it offers no performance or redundancy whatsoever, and it makes it more likely, not less, that you're going to lose all your data through a single disk failure. These arrays are not suitable for use with legacy Exchange servers.

RAID 0 (Striped drives)
Two or more disks have data split among them evenly. If you write a 1 MB file to a two-disk RAID 0 array, half the data will be on one disk and half on the other. Each disk in the array can be written to (or read from) simultaneously, giving you a noticeable performance boost. However, if you lose one disk in the array, you lose all your data. These arrays are typically used for fast, large, temporary files, such as those in video editing. They are not suitable for use with Exchange; while they give excellent performance, the risk of data loss is typically unacceptable.

RAID 1 (Mirrored drives)
Typically done with two disks (although some vendors allow more); each disk receives a copy of all the data in the array. If you lose one disk, you've still got a copy of your data on the remaining disk; you can either move the data or plug in a replacement disk and rebuild the mirror. RAID 1 also gives a performance benefit: reads can be served by either disk, because only writes need to be mirrored. However, RAID 1 can be one of the more costly configurations; to store 500 GB of data, you'd need to buy two 500 GB drives. These arrays are suitable for use with legacy Exchange volumes, depending on the type of data and the performance of the array.

RAID 5 (Parity drive)
Three or more disks have data split among them. However, one disk's worth of capacity is reserved for parity checksum data: a calculated value that allows the RAID system to rebuild the missing data if one drive in the array fails. The parity data is spread across all the disks in the array. If you had a four-disk 250 GB RAID 5 array, you'd have only 750 GB of usable space. RAID 5 arrays offer better performance than JBOD but worse performance than other RAID configurations, especially on writes; the checksum must be calculated and the data plus parity written across the disks in the array. Also, if you lose one disk, the array goes into degraded mode, which means that even read operations must be recalculated and will be slower than normal. These arrays are suitable for legacy Exchange mailbox database volumes on smaller servers, depending on the type of data and the performance of the array. Due to their write performance characteristics, they are usually not well matched to transaction log volumes.

RAID 6 (Double parity drive)
This RAID variant has become common only recently and is designed to give RAID 5 arrays the ability to survive the loss of two disks. Other than offering two-disk resiliency, base RAID 6 implementations offer mostly the same benefits and drawbacks as RAID 5. Some vendors have built custom implementations that attempt to solve the performance issues. These arrays are suitable for use with Exchange, depending on the type of data and the performance of the array.

RAID 10, RAID 0+1, RAID 1+0 (Mirroring plus striping)
A RAID 10 array is the most costly variant to implement because it uses mirroring. However, it also uses striping to aggregate spindles and deliver blistering performance, which makes it a great choice for high-end arrays that have to sustain a high level of I/O. As a side bonus, it also increases your chances of surviving the loss of multiple disks in the array. There are two basic variants: RAID 0+1 takes two big stripe arrays and mirrors them together; RAID 1+0 takes a number of mirror pairs and stripes them together. Both variants have essentially the same performance numbers, but 1+0 is preferred because it can be rebuilt more quickly (you only have to regenerate a single disk) and has far higher chances of surviving the loss of multiple disks (you can lose one disk in each mirror pair). These arrays have traditionally been used for high-end, highly loaded legacy Exchange mailbox database volumes.
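To make the capacity trade-offs in Table 1 concrete, here is a minimal Python sketch that computes usable capacity and guaranteed failure tolerance for each configuration. The function name and structure are our own illustration (and RAID 1 is assumed to be a simple n-way mirror), not part of any Windows or Exchange API.

```python
# Usable capacity and guaranteed failure tolerance for the RAID levels
# in Table 1. Illustrative only; names and structure are ours.

def raid_usable_gb(level: str, disks: int, disk_gb: int) -> tuple[int, int]:
    """Return (usable_gb, disk_failures_always_survivable)."""
    if level in ("none", "raid0"):        # concatenation or striping
        return disks * disk_gb, 0         # any single failure loses data
    if level == "raid1":                  # n-way mirror (usually 2 disks)
        return disk_gb, disks - 1
    if level == "raid5":                  # one disk's worth of parity
        return (disks - 1) * disk_gb, 1
    if level == "raid6":                  # two disks' worth of parity
        return (disks - 2) * disk_gb, 2
    if level == "raid10":                 # striped mirror pairs
        return (disks // 2) * disk_gb, 1  # guaranteed minimum; more if lucky
    raise ValueError(f"unknown RAID level: {level}")

# The four-disk, 250 GB example from the text:
for level in ("none", "raid0", "raid1", "raid5", "raid6", "raid10"):
    usable, survives = raid_usable_gb(level, disks=4, disk_gb=250)
    print(f"{level:>6}: {usable:4d} GB usable, survives {survives} failure(s)")
```

Run as-is, the loop reproduces the 750 GB RAID 5 figure cited in the table.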

Note that several of these types of RAID arrays may be suitable for your Exchange server. Which one should you use? The answer to that question depends entirely on how many mailboxes your servers are holding, how they're used, and other types of business needs. Beware of anyone who tries to give hard-and-fast answers like, "Always use RAID 5 for Exchange database volumes." To determine the true answer, you need to go through a proper storage sizing process, find out what your I/O and capacity requirements are really going to be, think about your data recovery needs and service level agreements (SLAs), and then decide what storage configuration will meet those needs for you in a fashion you can afford. There are no magic bullets.
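As a starting point for that sizing process, the sketch below shows the arithmetic commonly used to translate a user profile into the physical disk I/O an array must sustain, applying the usual RAID write penalties (2 for RAID 1/10, 4 for RAID 5, 6 for RAID 6). The user count, per-mailbox IOPS, and read ratio are hypothetical placeholders, not Microsoft guidance; validate any real design against proper Exchange sizing tools.

```python
# Back-of-the-envelope translation of a user profile into physical disk
# I/O, using the commonly cited RAID write penalties. Illustrative only.

WRITE_PENALTY = {"raid1": 2, "raid10": 2, "raid5": 4, "raid6": 6}

def backend_iops(users: int, iops_per_user: float,
                 read_ratio: float, level: str) -> float:
    """Front-end user I/O -> physical disk I/O for a given RAID level."""
    frontend = users * iops_per_user
    reads = frontend * read_ratio
    writes = frontend * (1.0 - read_ratio)
    return reads + writes * WRITE_PENALTY[level]

# Hypothetical profile: 2,000 users at 0.5 IOPS each, 60% reads.
for level in ("raid10", "raid5", "raid6"):
    print(f"{level:>6}: ~{backend_iops(2000, 0.5, 0.6, level):,.0f} physical IOPS")
```

Note how the same 1,000 IOPS of front-end load costs roughly 1,400 physical IOPS on RAID 10 but 2,200 on RAID 5, which is why write-heavy volumes favor mirroring.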

In every case, the RAID controller you use (the piece of hardware, plus drivers, that aggregates the individual disk volumes into a single pseudo-device presented to Windows) plays a key role. You can't just take a collection of disks, toss them into slots in your server, and go to town with RAID. You need to install extra drivers and management software, take extra steps to configure your arrays before you can even use them in Windows, and perhaps update your disaster recovery procedures to ensure that you can always recover data from drives in a RAID array. Generally, you'll need to test whether you can move drives in an array between two controllers, even controllers from the same manufacturer; not all controllers support all options. After your server has melted down and your SLA deadline is fast approaching is not a good time to find out that you should have kept a spare controller on hand.

If you choose the DAS route (whether JBOD or RAID), you'll need to think about how you're going to house the physical disks. Modern server cases don't leave a lot of extra room for disks; this is especially true of rack-mounted systems. Usually, this means some sort of external enclosure that hooks back into a physical bus on your server, such as SAS or eSATA. Make sure to give these enclosures suitable power and cooling; hard drives pull a lot of power and eventually return it all as heat.

Also make sure that your drive backplanes (the physical connection point) and enclosures support hot-swap capability, where you can easily pull the drive and replace it without powering the system down. Keep a couple of spare drives and drive sleds on hand, too. You don't want to have to schedule an outage of your Exchange server in order to replace a failed drive in a RAID 5 array, letting all your users enjoy the performance hit of a thrashing RAID volume because the array is in degraded mode until the replacement drives arrive.

RAID Controllers Are Not All Created Equal

Beware! Not all kinds of RAID are created equal. Before you spend a lot of time trying to figure out which configuration to choose, first think about your RAID controller. There are three kinds of them, and unlike RAID configurations, it's pretty easy to determine which kind you need for Exchange:


Software RAID

Software RAID avoids the whole problem of having a RAID controller by performing all the magic in operating system software. If you convert your disks to dynamic volumes, you can do RAID 0, RAID 1, and RAID 5 natively in Windows Server 2008 without any extra hardware. However, Microsoft strongly recommends that you not do this with Exchange, and the Exchange community echoes that recommendation. It takes extra memory and processing power, and it inevitably slows your disks down compared with what you could get from a modest investment in good hardware. In our experience, you will also not be able to support higher levels of I/O load with this configuration.


BIOS RAID

BIOS RAID attempts to provide "cheap" RAID by putting some RAID code in the chipset, which is then placed either directly on the motherboard (common in workstation-grade and low-end server configurations) or on an inexpensive add-in card. The dirty little secret is that the RAID chipset isn't really doing the RAID operations in hardware; again, it's all happening in software, this time in the associated Windows driver (written by the vendor) rather than in an official Windows subsystem. If you're about to purchase a RAID controller card for a price that seems too good to be true, it's probably one of these cards. These controllers tend to have a small number of ports, which limits their overall utility. Although you can get Exchange to work with them, you can do so only with very low numbers of users; otherwise, you'll quickly hit the limits of these cards and stress your storage system. Just avoid them; the time you save will more than make up for the up-front price savings.


Hardware RAID

This is the only kind of RAID you should even be thinking about for your Exchange servers. This means good-quality, high-end cards that come from reputable manufacturers that have taken the time to get the product on the Windows Hardware Compatibility List (HCL). These cards do a lot of the work for your system, removing the CPU overhead of parity calculations from the main processors, and they are worth every penny you pay for them. Better yet, they'll be able to handle the load your Exchange servers and users throw at them.

If you can't tell whether a given controller you're eyeing is BIOS or true hardware RAID, get help. Plenty of forums and websites will help you sort out which hardware to get and which to avoid. And while you're at it, spring for a few extra bucks' worth of good, reliable disks. We cannot stress enough the importance of not cutting corners on your Exchange storage system; while Exchange 2010 gives you a lot more room for designing storage and brings back options you may not have had before, you still need to buy the best components you can for the storage system you design. The time and long-term costs you save will be your own.


3. Storage Area Networks

Initial SAN solutions used fiber-optic connections to provide the necessary bandwidth for storage operations. As a result, these systems were incredibly expensive and were used only by organizations with deep pockets. The advent of Gigabit Ethernet over copper and new storage bus technologies such as SATA and SAS, however, has brought the cost of SANs down to the point where midsized companies can afford both the sticker price and the training needed to become competent with these technologies.

Over time, many vendors have begun to offer SAN solutions that are affordable even for small companies. The main reason they've been able to do so is the iSCSI protocol: block-level disk access routed over TCP/IP connections. Combine iSCSI with ubiquitous Gigabit Ethernet hardware, and SAN deployments become a lot more common.

Clustering and high availability concerns are the other factors in the growth of Exchange/SAN deployments. Exchange 2003 supported clustered configurations but required the cluster nodes to have a shared storage solution. As a result, any organization that wanted to deploy an Exchange cluster needed some sort of SAN solution (apart from the handful of people who stuck with shared SCSI configurations). A SAN has a certain elegance to it: you simply create a virtual slice of drive space for Exchange (called a LUN, or logical unit number), use Fibre Channel or iSCSI (and the corresponding drivers) to present it to the Exchange server, and away you go. Even with Exchange 2007, which was reengineered with an eye toward making DAS a supportable choice for Exchange storage in specific cluster continuous replication (CCR) and standby continuous replication (SCR) configurations, many organizations still found that using a SAN for Exchange storage was the best answer for their various business requirements. By this time, management had seen the benefits of centralized storage management and wanted Exchange deployments to be part of the big plan.

However, SAN solutions don't fix all problems, even with (usually because of) their price tag. Often, SANs make your environment more complex and difficult to support. Because SANs cost so much, there is often a strong drive to use the SAN for all storage and to make full use of every last free block of space; the cost per GB of SAN storage can be three to ten times that of DAS disks. Unfortunately, Exchange's I/O characteristics are very different from those of just about any other application, and few dedicated SAN administrators really know how to properly allocate disk space for Exchange:

  • SAN administrators do not usually understand that total disk space is only one component of Exchange performance. For day-to-day operations, it is far more important to ensure enough I/O capacity. Traditionally, this is delivered by using lots of physical disks (commonly referred to as "spindles") to increase the number of simultaneous read/write operations supported. It is important to make sure the SAN solution provides enough I/O capacity, not just free disk space, or Exchange will crawl (see the sketch after this list).

  • Even if you can convince them to configure LUNs spread across enough disks, SAN administrators immediately want to reclaim that wasted space. As a result, you end up sharing the same spindles between Exchange and some other application with its own performance curve, and then suddenly you have extremely noticeable but hard-to-diagnose performance issues with your Exchange servers. Shared spindles will kill Exchange.

  • Although some SAN vendors have put a lot of time and effort into understanding Exchange and its I/O needs so that their salespeople and certified consultants can help you deploy Exchange on their products properly, not everyone does the same. Many vendors will shrug off performance concerns by telling you about their extensive write caching and how good write caching will smooth out any performance issues. Their argument is true ... up to a point. A cache can help isolate Exchange from effects of transient I/O events, but it won't help you come Monday morning when all your users are logging in and the SQL Server databases that share your spindles are churning through extra operations.
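To put the spindle argument into numbers, here is a minimal sketch estimating how many physical disks are needed to meet an I/O target. The per-spindle IOPS constants are rough rule-of-thumb assumptions (not vendor specifications), and the 70 percent utilization cap is an arbitrary headroom choice for illustration.

```python
import math

# Rough per-spindle random-IOPS figures; rule-of-thumb assumptions,
# not vendor specifications.
SPINDLE_IOPS = {"7200_sata": 80, "10k_sas": 130, "15k_sas": 180}

def spindles_needed(target_iops: float, disk_type: str,
                    utilization: float = 0.7) -> int:
    """Disks required to sustain target_iops with latency headroom.

    Capping utilization below 100% keeps queue depths, and therefore
    latency, under control during peaks such as Monday-morning logons.
    """
    per_disk = SPINDLE_IOPS[disk_type] * utilization
    return math.ceil(target_iops / per_disk)

# Example: the ~2,200 physical IOPS RAID 5 figure from the earlier sketch.
for disk in ("7200_sata", "10k_sas", "15k_sas"):
    print(f"{disk}: {spindles_needed(2200, disk)} spindles for 2,200 IOPS")
```

At these assumed rates, a load that eighteen 15K disks can serve needs roughly forty 7200 RPM SATA spindles; free disk space alone tells you none of this.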

The moral of the story is simple: don't believe that you need to have a SAN. This is especially true with Exchange 2010; there have been a lot of under-the-hood changes to the mailbox database storage to ensure that more companies can deploy a 7200 RPM SATA JBOD configuration and be able to get good performance and reliability from that system.

If you do find that a SAN provides the best value for your organization, get the best one you can afford. Make sure that your vendors know Exchange storage inside and out; if possible, get them to put you in contact with their on-staff Exchange specialists. Have them work with your SAN administrators to come up with a storage configuration that meets your real Exchange needs.

 