Exchange Server 2013 : Defining a Highly Available Messaging Solution - Achieving High Availability

11/19/2013 6:49:20 PM

Raising the availability of any system has a direct cost implication. Exchange is in a league of its own in terms of interdependency with other systems. Following are a number of factors to consider that will dramatically influence both cost and complexity when planning for high availability:

Process: Existing processes for a decentralized system that is not highly available will not suffice when you move to a highly centralized system or to a higher level of availability. Increased hardware cost is only one of the factors to consider with high availability. Another is the cost of a larger number of processes, which are often interdependent.

User Locations: Users may be centralized in one well-connected campus or spread across many geographies around the globe, each with varying connectivity and power, neither of which may be guaranteed.

Servers: Server construction has its own availability factors to consider. Memory bank redundancy, power supply redundancy, processor redundancy, and, in the case of blade servers, backplane failure are some of the factors that influence total server availability.

Network: End users may have varying amounts of bandwidth available to access email, as well as different network topologies, which may themselves present points of failure. Do the datacenters hosting the Exchange servers have multiple connection points to the Internet as well as redundant routes to the rest of the network? Are the routing and the switch fabric redundant? Are firewalls and reverse proxy/hygiene solutions redundant?

Power to the Datacenter: Power availability is often taken for granted. Events have shown, however, that power may be compromised during extreme weather, and in some parts of the world it may not be guaranteed at all.

Power to the Racks: Is the power to the racks wired so that the loss of any one power source or power distribution point within a rack does not affect the entire contents of the rack?

Cooling: Cooling is another critical measure of datacenter availability. Do the datacenters housing your servers have redundant cooling available in the event of an outage?

Cloud Solutions: You may rely on an external cloud solution for some of the availability of your infrastructure. Does your vendor have a published availability strategy, and is that strategy compatible with your desired availability goals?

Virtualization: While virtualization carries with it the promise of on-demand capacity and higher levels of availability, Exchange may not fit into your current virtualization strategy. You may be increasing risk and lowering availability by virtualizing Exchange, as opposed to deploying on physical hardware.

Capacity: All of the points mentioned thus far have a measure of available capacity that may be overwhelmed or compromised during an outage or a denial-of-service attack.

Single Points of Failure: When increasing availability, redundancy of components is a given. However, one not-so-obvious factor is a single point of failure or, as you learned earlier, a failure domain. A failure domain could include an individual server, a power supply, the network, the rack itself, or any datacenter component, including the datacenter itself.

This is not meant as an exhaustive list of all possible factors. Your own analysis of your environment may yield a number of other factors that may be relevant.
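One way to look for shared failure domains such as those described under Single Points of Failure is to record, for each server, the rack, power feed, and datacenter it depends on, and then check which of those domains every server has in common. The following is a minimal sketch under stated assumptions: the server names, domain kinds, and the `shared_failure_domains` helper are hypothetical and are not part of Exchange or any Microsoft tooling.

```python
from collections import defaultdict

# Hypothetical inventory: each server lists the failure domains it sits in.
servers = {
    "mbx1": {"rack": "rack-A", "power_feed": "feed-1", "datacenter": "dc-1"},
    "mbx2": {"rack": "rack-B", "power_feed": "feed-1", "datacenter": "dc-1"},
    "mbx3": {"rack": "rack-B", "power_feed": "feed-2", "datacenter": "dc-1"},
}


def shared_failure_domains(inventory):
    """Return the (kind, name) domains whose loss would take down every server."""
    members = defaultdict(set)
    for server, domains in inventory.items():
        for kind, name in domains.items():
            members[(kind, name)].add(server)
    everyone = set(inventory)
    # A domain that contains all servers is a single point of failure.
    return sorted(d for d, s in members.items() if s == everyone)


print(shared_failure_domains(servers))  # prints [('datacenter', 'dc-1')]
```

In this made-up inventory the racks and power feeds are spread across servers, but all three servers share one datacenter, so the datacenter itself is the remaining single point of failure.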

Once you identify the factors influencing availability, you can evaluate each one and attempt to mitigate it. For example, let's say your analysis has shown that cooling and the centralization of all IT into a single datacenter each represent a single point of failure. The business will remedy the lack of cooling redundancy, but it will not build or rent another datacenter. You may want to capture this and other potential factors as demonstrated in Table 1.

TABLE 1: Availability factors and mitigation

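A register of factors and decisions like the one described above could also be kept in a simple data structure. The sketch below records only the two factors from the cooling and datacenter example in the text. The field names (`risk`, `decision`, `note`) and the `AvailabilityFactor` class are hypothetical and do not reproduce the columns of Table 1.

```python
from dataclasses import dataclass


@dataclass
class AvailabilityFactor:
    name: str
    risk: str
    decision: str  # "mitigate" or "accept" (hypothetical labels)
    note: str


# The two factors from the example in the text.
register = [
    AvailabilityFactor(
        "Cooling", "No redundant cooling",
        "mitigate", "Business will remedy the lack of cooling redundancy",
    ),
    AvailabilityFactor(
        "Datacenter", "All IT centralized in a single datacenter",
        "accept", "Business will not build or rent another datacenter",
    ),
]

# Factors the business has chosen to live with remain open risks.
accepted = [f.name for f in register if f.decision == "accept"]

for f in register:
    print(f"{f.name:<12}{f.decision:<10}{f.note}")
```

Keeping accepted risks in the same list as mitigated ones makes it clear which single points of failure still exist after planning is complete.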

When calculating availability, remember that total availability is the product of all the availability factors. The biggest influence on total availability is therefore the component (or components) in the chain that is most likely to fail. A single machine offers no redundancy, so it can significantly lower the availability of a factor such as networking or mail flow, and with it the total.
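The arithmetic can be illustrated with a short sketch. It assumes that the factors fail independently and that every figure below is hypothetical, chosen only for illustration. The redundancy formula, 1 - (1 - a)^n for n independent copies of which any one suffices, is a standard reliability approximation rather than something stated in the text.

```python
from math import prod


def serial_availability(factors):
    """Total availability of a chain in which every factor must be up."""
    return prod(factors.values())


def redundant_availability(single, copies):
    """Availability of independent, identical copies where any one suffices."""
    return 1 - (1 - single) ** copies


# Hypothetical figures, not measurements.
factors = {
    "power": 0.9999,
    "cooling": 0.9999,
    "network": 0.999,
    "mail_flow_server": 0.99,  # a single, non-redundant machine
}

total = serial_availability(factors)
weakest = min(factors, key=factors.get)

# Replace the single machine with two independent ones.
improved = dict(factors, mail_flow_server=redundant_availability(0.99, 2))
total_improved = serial_availability(improved)

print(f"total: {total:.4%}, weakest factor: {weakest}")
print(f"with a redundant pair: {total_improved:.4%}")
```

Because the total is a product, it can never exceed the weakest factor, which is why adding redundancy to the least available component tends to raise the total the most.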
