Windows Server 2008 : Planning for System Recoverability and Availability

12/1/2012 4:39:00 PM

When you deploy essential servers, such as domain controllers, Web servers, and database servers, you need to plan how to design the system for recoverability in the event of server failure. In the case of a domain controller, you should plan to use Windows Server Backup (or another backup application) to back up the Active Directory Domain Services (AD DS) database. With Web servers and other application servers that need to support many users, you can use Network Load Balancing (NLB). For database servers, mail servers, and other application servers that use a shared database, you can use failover clustering to support recoverability and service availability.

Planning AD DS Maintenance and Recovery Procedures

Before you deploy Windows Server 2008 domain controllers, you need to plan AD DS maintenance and recovery procedures, such as backing up and restoring the AD DS database (Ntds.dit), defragmenting the AD DS database, and seizing operations master roles.

Planning for AD DS Backup

Before you install Windows Server 2008 on a computer you plan to deploy as a domain controller, you should design the storage of that server in a way that best suits its recoverability. Specifically, for each domain controller you should store operating system files, the Active Directory database (Ntds.dit), and the SYSVOL directory all on separate volumes that do not contain other user, operating system, or application data.

The actual backup procedure for AD DS is different in Windows Server 2008 than it is for earlier versions of Windows Server. In Windows Server 2008 you must back up critical volumes on a domain controller rather than backing up only the system state data.

Critical volumes are the following:

The volume that hosts the boot files, which consist of the Bootmgr file and the BCD store
The volume that hosts the Windows operating system and the Registry
The volume that hosts the SYSVOL directory
The volume that hosts the Active Directory database (Ntds.dit)
The volume that hosts the Active Directory database log files

Windows Server Backup and Wbadmin

Windows Server 2008 includes a new backup application named Windows Server Backup and an associated command-line tool named wbadmin. These features are not installed by default. You must install them by using the Add Features option in Server Manager.

Note: You cannot back up FAT volumes or partial volumes

Only NTFS volumes on locally attached disks can be backed up by using Windows Server Backup. In addition, you cannot use Windows Server Backup to back up selected files or folders; you can back up only entire volumes.
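
As an illustration of a critical-volume backup from the command line, the following Python sketch assembles a one-time wbadmin command. The helper name and the E: target are hypothetical. The sketch only builds the argument list; actually running the command assumes a Windows Server 2008 computer with the backup features installed and an elevated prompt.

```python
from typing import List


def critical_volume_backup_command(target: str, quiet: bool = True) -> List[str]:
    """Assemble a one-time wbadmin backup of all critical volumes.

    target is the backup destination, for example a dedicated volume such as "E:".
    """
    if not target:
        raise ValueError("a backup target such as 'E:' is required")
    args = ["wbadmin", "start", "backup", f"-backupTarget:{target}", "-allCritical"]
    if quiet:
        args.append("-quiet")  # run without the interactive confirmation prompt
    return args


# Hypothetical target volume; print the command instead of executing it.
print(" ".join(critical_volume_backup_command("E:")))
```

Returning a list rather than a single string keeps the arguments separate, which is the form that process-launching APIs generally expect.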

You can schedule full server backups and critical-volume backups by using either Windows Server Backup or wbadmin. When determining the frequency for AD DS backups, consider the following:

The frequency of significant changes to AD DS data: Significant changes can include changes to the schema, group membership, Active Directory replication or site topology, and policies. They can also include upgrades to operating systems, renaming domain controllers or domains, and migration or creation of new security principals.
The effect on business operations if data in AD DS or SYSVOL is lost: Lost data can include updates to passwords for user accounts, computer accounts, and trusts. It can also include updates to group membership, policies, and the replication topology and its schedules.

In general, it is recommended that you perform backups nightly during times of decreased traffic. For fault tolerance, schedule at least two trusted backups for each domain. You can start by scheduling the backups daily and then adjust the frequency of your backups depending on the previously specified criteria.

Finally, note the following considerations when choosing a storage location for your backups:

It is recommended that you create a backup volume on a dedicated internal or attached external hard disk drive.
The destination volume for the backup must be on a separate hard disk from the source volumes.
In Windows Server Backup, you cannot perform a scheduled backup to a network share. Only manual backups can be performed to a network share.
Windows Server Backup does not enable you to back up to tape.
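
The storage considerations above can be expressed as a small planning check. The following Python sketch is only a model: the BackupPlan fields, the target_kind labels, and the disk numbers are hypothetical names introduced for illustration, not part of Windows Server Backup.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class BackupPlan:
    source_disks: Set[int]       # physical disk numbers that hold the source volumes
    target_kind: str             # "local_disk", "network_share", or "tape" (hypothetical labels)
    target_disk: Optional[int]   # physical disk number when target_kind is "local_disk"
    scheduled: bool              # True for a scheduled backup, False for a manual one


def check_backup_plan(plan: BackupPlan) -> List[str]:
    """Return the considerations from the text that the plan violates."""
    problems = []
    if plan.target_kind == "tape":
        problems.append("Windows Server Backup cannot back up to tape")
    if plan.target_kind == "network_share" and plan.scheduled:
        problems.append("scheduled backups cannot target a network share")
    if plan.target_kind == "local_disk" and plan.target_disk in plan.source_disks:
        problems.append("the destination must be on a separate disk from the source volumes")
    return problems


nightly = BackupPlan(source_disks={0, 1}, target_kind="local_disk", target_disk=2, scheduled=True)
print(check_backup_plan(nightly))
```

An empty result means that the plan is consistent with the considerations listed above; it does not verify anything about the actual server.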

Note: Can you use Windows Server Backup on a Server Core installation?

To use the Windows Server Backup graphical user interface (GUI) for managing backup and restore operations on a server that is running a Server Core installation of Windows Server 2008, you must connect remotely from a server that is running a full installation of Windows Server 2008.

Planning for AD DS Recovery

Planning for AD DS recovery entails learning the recovery procedures, learning when to perform each restore type, and deciding whether to install Windows RE on a dedicated partition as part of domain controller deployment.

AD DS recovery includes performing nonauthoritative restores and authoritative restores. Perform a nonauthoritative restore if the Active Directory volume becomes corrupted or is deleted. To perform a nonauthoritative restore of AD DS, you need at least a critical-volume backup. If you cannot start the server, then you must perform a full server recovery instead.

To perform a nonauthoritative restore, you must restart the domain controller in Directory Services Restore Mode (DSRM). Then you can open Windows Server Backup or use the wbadmin utility to perform the recovery.

Note: Full server recovery and Windows RE

A full server recovery requires you to start the server with the Windows Server 2008 product DVD and choose the Repair Your Computer option. To avoid having to use the operating system media during recovery, use the Windows Automated Installation Kit to install Windows RE on a separate partition. When you install Windows RE beforehand, you can simply choose it from the boot menu and access Windows Recovery options.

Unlike a nonauthoritative restore, the purpose of an authoritative restore is to restore an object that has accidentally been deleted. For example, you might need to perform an authoritative restore if an administrator inadvertently deletes an OU containing a large number of users. If you restore the server from backup, the normal, nonauthoritative restore process does not restore the inadvertently deleted OU because the restored domain controller is updated following the restore process to the current status of its replication partners, which have deleted the OU. Recovering the deleted OU instead requires authoritative restore. You can use authoritative restore to mark the OU as authoritative and let the replication process restore it to all the other domain controllers in the domain.

When an object is marked for authoritative restore, its version number is changed so that it is higher than the existing version number of the (deleted) object in the Active Directory replication system. This change ensures that any data that you restore authoritatively is replicated from the restored domain controller to other domain controllers in the forest.
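
The effect of the version number can be shown with a simplified model. In the following Python sketch, replication keeps whichever copy of an object has the higher version number. The class, the function names, and the size of the version increase are assumptions made for illustration; actual AD DS replication takes more into account than this single comparison.

```python
from dataclasses import dataclass


@dataclass
class ObjectCopy:
    name: str
    version: int
    deleted: bool


def replicate(local: ObjectCopy, partner: ObjectCopy) -> ObjectCopy:
    """Simplified conflict resolution: the copy with the higher version number wins."""
    return local if local.version >= partner.version else partner


def mark_authoritative(restored: ObjectCopy, bump: int = 100_000) -> ObjectCopy:
    """Raise the version number of a restored copy (the bump size is an illustrative assumption)."""
    return ObjectCopy(restored.name, restored.version + bump, restored.deleted)


# The replication partners hold the newer, deleted copy of a hypothetical OU.
partner_copy = ObjectCopy("OU=Sales", version=3, deleted=True)
# The backup holds the older copy from before the deletion.
restored_copy = ObjectCopy("OU=Sales", version=2, deleted=False)

# Nonauthoritative restore: the partner's deletion overwrites the restored OU.
print(replicate(restored_copy, partner_copy).deleted)
# Authoritative restore: the restored OU now outranks the deletion and replicates out.
print(replicate(mark_authoritative(restored_copy), partner_copy).deleted)
```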

You should not use an authoritative restore to restore an entire domain controller, nor should you use it as part of a change-control infrastructure. Proper delegation of administration and change enforcement will optimize data consistency, integrity, and security.

To perform an authoritative restore, follow this four-step procedure:

1.	Start the domain controller in DSRM.
2.	Restore the desired backup, which is typically the most recent backup.
3.	Use ntdsutil to mark desired objects, containers, or partitions as authoritative.
4.	Restart in normal mode to propagate the changes.

Stopping AD DS to Perform Maintenance Procedures

Windows Server 2008 introduces a new feature called restartable AD DS that facilitates some Active Directory maintenance procedures. In Windows Server 2008, Active Directory Domain Services appears in the Services console as a service that can be stopped and restarted like any other service. Stopping the AD DS service enables you to perform an offline defragmentation or update of a locally stored AD DS database while you are logged on to a domain controller normally. In earlier versions of Windows you needed to start the computer in DSRM to perform such procedures.

While AD DS is stopped on a particular domain controller, other domain controllers can still service new domain logon requests. Even on the domain controller on which AD DS is stopped, you can continue to log on to the domain if other domain controllers are available to service the logon request. If no other domain controller is available, you can still log on to the server in DSRM by using the local Administrator account and the DSRM password, as in Windows 2000 Server or Windows Server 2003.

Note: Can you use dcpromo to remove AD DS when AD DS is stopped?

You can run dcpromo /forceremoval to forcefully remove AD DS from a domain controller while AD DS is stopped. However, you should use this procedure only if AD DS cannot be started.

Aside from improving the convenience of performing offline maintenance procedures to the AD DS database, stopping the AD DS service provides the additional benefit of preserving the availability of other services while you are performing those maintenance tasks. For example, if a domain controller is also a DHCP server, the domain controller can continue to service DHCP clients when you are performing offline maintenance on AD DS.

Note: Stopping AD DS at a command line

To stop AD DS at a command line, type net stop ntds.

Seizing Operations Master Roles

Certain domain and enterprise-wide services that are not suitable for multimaster updates are performed by a single domain controller in AD DS. The domain controllers that are assigned to perform these unique operations are called operations masters or flexible single master operations (FSMO) role holders. If a domain controller that holds an operations master role is lost and cannot be brought back online, you can use the ntdsutil utility to seize the lost operations master role.

A domain controller whose FSMO roles have been seized should not be permitted to communicate with existing domain controllers in the forest. In this scenario, you should either format the hard disk and reinstall the operating system on that domain controller or forcibly demote it on a private network and then remove its metadata on a surviving domain controller in the forest by using the metadata cleanup command in ntdsutil.

Using Network Load Balancing to Support High-Usage Servers

Network Load Balancing (NLB) is used to support a highly used network service or application. An installable feature of Windows Server 2008, NLB transparently distributes client requests among servers in a cluster by using virtual IP addresses and a shared name. From the perspective of the clients, the NLB cluster appears to be a single server.

In a common scenario, for example, NLB is used to create a Web farm, which is a group of computers working to support a Web site or a set of Web sites. In some scenarios a single, powerful server could support the client traffic instead of many smaller Web servers in an NLB farm. However, an NLB farm enables you to gradually increase the capacity of your solution by adding more servers (called hosts) to the farm as the need arises. NLB also provides the advantage of high availability because in such a cluster there is no single point of failure.

Aside from Web farms, you can also use NLB to create a terminal server farm, a virtual private network (VPN) server farm, or an ISA Server firewall cluster. Figure 1 shows a basic configuration of an NLB Web farm located behind an NLB firewall cluster.

Figure 1. Basic diagram for two connected NLB clusters

As a load balancing mechanism, NLB automatically detects servers that have been disconnected from the cluster and then redistributes client requests to the remaining live hosts. This feature prevents clients from sending requests to the failed servers. NLB also allows you the option to specify a load percentage that each host will handle. Clients are then statistically distributed among hosts so that each server receives its percentage of incoming requests.
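
The idea of per-host load percentages can be sketched in a few lines of Python. This is a conceptual model only: a seeded random draw stands in for whatever mechanism NLB uses internally, and the host names, percentages, and the assumption that the remaining hosts keep their relative shares after a failure are all hypothetical.

```python
import random
from collections import Counter


def pick_host(weights, rng):
    """Pick one host at random, in proportion to its configured load percentage."""
    hosts = list(weights)
    return rng.choices(hosts, weights=[weights[h] for h in hosts], k=1)[0]


def drop_host(weights, failed_host):
    """Return the weights of the hosts that remain after one host leaves the cluster."""
    return {host: w for host, w in weights.items() if host != failed_host}


rng = random.Random(2008)  # fixed seed so that repeated runs give the same tallies
weights = {"web1": 50, "web2": 30, "web3": 20}  # hypothetical hosts and percentages

# Distribute 10,000 simulated requests, then repeat after web2 is disconnected.
before = Counter(pick_host(weights, rng) for _ in range(10_000))
after = Counter(pick_host(drop_host(weights, "web2"), rng) for _ in range(10_000))
print(before)
print(after)
```

In the second tally no request reaches the disconnected host, which mirrors the redistribution to the remaining live hosts described above.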

Identifying Applications for NLB

The applications and services that run on NLB include stateful applications (those that maintain session state) and stateless applications. Maintaining session state means that the application or service collects information when first connecting to a cluster host and then retains the information for subsequent requests. During a user session, the same server must handle all the requests from the user in order to access that information. Applications and services that are stateless maintain no user or communication information for subsequent connections.

With a single server, maintaining session state presents no difficulty because the user always connects to the same server. However, when client requests are load balanced within an NLB cluster, without some type of persistence the client might not be directed to the same host for a series of client requests.

In NLB you maintain session state by configuring affinity in a port rule. With affinity in effect, all client requests from the same IP address are directed to the same NLB host.
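
The principle behind affinity can be illustrated by mapping each client IP address to a host deterministically. The following Python sketch uses a SHA-256 hash for that mapping, which is an assumption chosen for the example and not the algorithm that NLB uses. The host names and the client address are hypothetical.

```python
import hashlib


def host_for_client(client_ip, hosts):
    """Map a client IP address to one host; the same address always maps to the same host."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:4], "big") % len(hosts)
    return hosts[index]


hosts = ["web1", "web2", "web3"]  # hypothetical cluster hosts
first = host_for_client("192.0.2.17", hosts)
second = host_for_client("192.0.2.17", hosts)
print(first == second)  # both requests from this client reach the same host
```

Because the mapping depends on the number of hosts, a client might be assigned to a different host when the membership of the list changes.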

Some of the common applications and services well-suited to run on NLB include the following:

Web applications: One of the most common solutions that use NLB is a Web farm. A typical challenge in supporting Web applications occurs when an application must maintain a persistent connection to a specific cluster host. For example, if a Web application uses Hypertext Transfer Protocol Secure (HTTPS), the application should, for efficiency, contact the same cluster host each time. Connecting to a different cluster host requires establishing a new SSL session, which creates excess network traffic and overhead on the client and server. NLB maintains affinity and reduces the possibility that a new SSL session needs to be established.
VPN remote access running on Routing and Remote Access: Another solution that uses NLB involves using the Routing and Remote Access service in Windows Server 2008 to provide VPN remote connectivity. In the VPN solution, you combine multiple remote access servers running Windows Server 2008 and Routing and Remote Access to create a VPN remote access server farm.
Web content caching and firewall running on ISA Server: You can also use NLB in solutions that include ISA Server to provide network security, network isolation, network address translation, or Web content caching. In ISA Server solutions, the design and deployment are integral parts of the ISA Server design and deployment process.
Applications hosted on Terminal Services: When you run applications on Terminal Services, the Terminal Services clients can be load balanced across a number of computers running Terminal Services. NLB works with the Terminal Services Session Broker role service to provide improved scalability and availability for Terminal Services.
Custom applications: NLB might be an appropriate method of improving scalability and availability for applications that your organization or third-party organizations have developed. Custom applications must adhere to the same criteria listed earlier in this section.

When Not to Use NLB

In NLB each host in the farm is connected to separate storage, and this data is not replicated among hosts. As a result, NLB is not well-suited to support services in which data is updated by users because data inconsistency among nodes could result. In particular, you should not use NLB to support database servers or file servers. However, many organizations use NLB to support a Web site front end to a single database server.

Using Failover Clusters to Maintain High Availability

A failover cluster is a group of two or more computers used to prevent downtime for selected applications and services. The clustered servers (called nodes) are connected by physical cables to each other and to shared storage disks. If one of the cluster nodes fails, another node begins to take over service for the lost node in a process known as failover. As a result of failover, users connecting to the server experience minimal disruption in service.

Servers in a failover cluster can function in a variety of roles, including the roles of file server, print server, mail server, or database server, and they can provide high availability for a variety of other services and applications.

In most cases the failover cluster includes a shared storage unit that is physically connected to all the servers in the cluster, although any given volume in the storage is accessed by only one server at a time.

Figure 2 illustrates the process of failover in a basic two-node failover cluster.

Figure 2. In a failover cluster, when one server fails, another takes over using the same storage
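
The failover behavior can be modeled in a short Python sketch. The class and its method names are hypothetical, and the rule for choosing the surviving node (the first healthy node in the list) is a simplifying assumption. The model reflects two points from the text: each shared volume has exactly one owner at a time, and ownership moves to another node when the owner fails.

```python
class FailoverCluster:
    """Toy model of a failover cluster with shared storage."""

    def __init__(self, nodes):
        self.healthy = list(nodes)
        self.owner = {}  # resource name -> the single node that currently owns it

    def assign(self, resource, node):
        if node not in self.healthy:
            raise ValueError(f"{node} is not a healthy cluster node")
        self.owner[resource] = node

    def fail(self, node):
        """Mark a node as failed and move its resources to a surviving node."""
        self.healthy.remove(node)
        if not self.healthy:
            raise RuntimeError("no surviving node is available for failover")
        for resource, current in list(self.owner.items()):
            if current == node:
                self.owner[resource] = self.healthy[0]


cluster = FailoverCluster(["node1", "node2"])  # hypothetical two-node cluster
cluster.assign("database volume", "node1")
cluster.fail("node1")
print(cluster.owner)  # the volume is now owned by the surviving node
```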

Server clusters can benefit your organization if:

Your users depend on regular access to mission-critical data and applications to do their jobs.
Your organization has established a limit on the amount of planned or unplanned service downtime that you can sustain.
The cost of the additional hardware that server clusters require is less than the cost of having mission-critical data and applications offline during a failure.

Comparing NLB and Failover Clusters

NLB clusters and failover clusters are used for different purposes. Whereas NLB is used primarily for increased scalability of Web servers, VPN servers, ISA Server firewalls, and terminal servers, failover clusters are used most often to increase the availability of database servers. Frequently, in fact, NLB clusters can work as a front end to a failover cluster, as in the case of a Web site that connects to a back-end database, illustrated in Figure 3.

Figure 3. An NLB cluster often acts as the front end to a back-end failover cluster

Preparing Failover Cluster Hardware

Failover clusters have fairly elaborate hardware requirements. To configure the hardware, review the following list of requirements for the servers, network adapters, cabling, controllers, and storage:

Servers: It is recommended that you use a set of matching computers that contain the same or similar components.
Network adapters and cabling: The network hardware, like other components in the failover cluster solution, must be compatible with Windows Server 2008. If you use iSCSI, your network adapters must be dedicated to either network communication or iSCSI, not both.
In the network infrastructure that connects your cluster nodes, avoid having single points of failure. There are several ways to achieve this. You can connect your cluster nodes by multiple, distinct networks. Alternatively, you can connect your cluster nodes with one network that is constructed with teamed network adapters, redundant switches, redundant routers, or similar hardware that removes single points of failure.
Device controllers or appropriate adapters for the storage:
- For Serial Attached SCSI or Fibre Channel: In all clustered servers, the mass-storage device controllers that are dedicated to the cluster storage should be identical. They should also use the same firmware version.
- For iSCSI: Each clustered server must have one or more network adapters or host bus adapters (HBAs) that are dedicated to the cluster storage. The network you use for iSCSI cannot be used for network communication. In all clustered servers, the network adapters you use to connect to the iSCSI storage target should be identical. It is also recommended that you use Gigabit Ethernet or higher. (Note also that for iSCSI you cannot use teamed network adapters.)
Storage: You must use shared storage that is compatible with Windows Server 2008. For a two-node failover cluster, the storage should contain at least two separate volumes configured at the hardware level.
- The first volume will function as the witness disk. A witness disk is a volume that holds a copy of the cluster configuration database. Witness disks, known as quorum disks in Windows Server 2003, are used in many, but not all, cluster configurations.
- The second volume will contain the files that are being shared to users.
Storage requirements include the following:
- To use the native disk support included in failover clustering, use basic disks, not dynamic disks.
- It is recommended that you format the storage partitions with NTFS (for the witness disk, the partition must be NTFS).
When deploying a storage area network (SAN) with a failover cluster, be sure to confirm with manufacturers and vendors that the storage, including all drivers, firmware, and software used for the storage, is compatible with failover clusters in Windows Server 2008.
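
Some of the storage requirements above lend themselves to a simple planning check. The following Python sketch models them under stated assumptions: the Volume fields and the function name are hypothetical, and the check covers only the volume count, the disk type, and the witness disk file system for a two-node cluster.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Volume:
    name: str
    disk_type: str           # "basic" or "dynamic"
    file_system: str         # for example "NTFS"
    is_witness: bool = False


def check_cluster_storage(volumes: List[Volume]) -> List[str]:
    """Return the storage requirements from the text that the planned volumes do not meet."""
    problems = []
    if len(volumes) < 2:
        problems.append("a two-node cluster should have at least two separate volumes")
    for volume in volumes:
        if volume.disk_type != "basic":
            problems.append(f"{volume.name}: native disk support requires a basic disk")
        if volume.is_witness and volume.file_system != "NTFS":
            problems.append(f"{volume.name}: the witness disk must be formatted with NTFS")
    return problems


planned = [
    Volume("witness", "basic", "NTFS", is_witness=True),
    Volume("data", "basic", "NTFS"),
]
print(check_cluster_storage(planned))
```

Compatibility of the hardware, drivers, and firmware with Windows Server 2008 still has to be confirmed with the vendor; a check like this cannot establish it.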

After you have met the hardware requirements and connected the cluster servers to storage, you can then install the Failover Cluster feature.
