My approach to best practices is
that they are useful guides to getting to where you want to go (as are
books). You should always know the technology well enough to consider
whether the best practice still applies or if you need to make
adjustments. As an example, consider running vCenter as a virtual machine. In the history of VMware virtual infrastructure, running Virtual Center/vCenter as a virtual machine was a hotly contested topic (it may still be). VMware does consider it a best practice to run vCenter as a VM, but not on hosts it is managing. I have worked with a few customers who insist that the management function be deployed on a physical machine. Rather than relying on a best practice, you should understand the benefits and risks of each approach and adjust your design accordingly.
As consultants, we are often asked to put
forth a design that meets the business requirements and provides the
level of performance, availability, and scale required. The ideal
approach to developing a design is to perform a capacity planning
exercise to ensure that the hardware and software can be properly
estimated to run the virtual desktop workload. Capacity planning is
quite common in server virtualization environments but not as common in
virtual desktop planning, although it is recommended. A number of tools
are specifically designed for virtual desktop analysis, such as Lakeside Software’s SysTrack VP. SysTrack VP provides information to help you plan your virtual desktop environment, such as an inventory of the software in your existing desktop environment. It is agent based and allows you to take the collected data and model the configuration of the hosts by adjusting CPU and memory values to determine how many virtual desktop images you are likely to need. You can find additional information on the product at http://www.lakesidesoftware.com.
Understanding the configuration of the hosts and the number of images allows you to calculate the cost of the solution and is a key input in developing the return on investment (ROI) and total cost of ownership (TCO). To calculate ROI, you simply take the gain of an investment, subtract the cost of the investment, and divide the total by the cost of the investment. Or
ROI = (Gains – Cost)/Cost
Because it is important to understand the ROI when presenting the business case for virtual desktops, it is a good idea to calculate the ROI even if you need to estimate the gains. Keep in mind that gains can include hard cost savings, such as the price difference between thin clients and physical desktops, and soft cost savings, such as a reduced cost of desktop support.
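As a quick illustration, here is a minimal sketch of that ROI arithmetic in Python; the gain and cost figures are hypothetical and stand in for whatever your own business case produces.

```python
def roi(gains, cost):
    """Return on investment: (gains - cost) / cost."""
    return (gains - cost) / cost

# Hypothetical figures: $1.2M in hard and soft savings over the life of
# the project against an $850K investment in the virtual desktop solution.
gains = 1_200_000
cost = 850_000

print(f"ROI = {roi(gains, cost):.1%}")  # ROI = 41.2%
```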
Infrastructure Introduction
When you are considering a large deployment
of VMware View, it is best to follow all the steps in developing valid
hardware estimates. These steps are as follows:
1. Develop a baseline of current utilization in the desktop environment. The physical desktop baseline should be viewed as a starting point because additional technologies such as View Composer often provide higher consolidation (of images, for example) in the virtual desktop environment than in the physical one.
Initially, virtual desktop assessments were performed with the same set of tools used in server consolidation exercises. Over time, better tools were developed that now provide not just capacity planning information but also application inventory and license compliance. These tools can also assess whether an application is a candidate to be virtualized with ThinApp. It’s ironic that, with the exception of the application virtualization piece, this is exactly the information you would need if you were planning a physical desktop migration in a large environment.
2. Estimate the hardware required to build a limited-scale or proof-of-concept (PoC) environment to validate the features you will use in the VMware View platform and your hardware specifications (this should include not just servers but also storage space and throughput information). The PoC should also consider user segmentation, or the types of users in an organization, such as knowledge workers and administrative users. To provide a viable reference for the production deployment, the PoC should include a proper variety of user segments.
3. Develop a production architecture and migration plan.
Although this approach is ideal, it is not the only one. Often virtual desktop engagements begin with a limited-scale proof-of-concept environment rather than a capacity planning exercise. A PoC, if properly designed, can be a great way of gathering information on what the “real” or representative workload will be for your production virtual desktop environment. By looking at the performance utilization within the PoC, you can extrapolate what is required to build out the production environment. You should baseline the information related to CPU, memory, and storage, including I/O.
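As a sketch of that extrapolation, the following Python snippet scales hypothetical per-desktop PoC baselines up to a production desktop count; all of the per-desktop figures are illustrative assumptions, not measurements.

```python
# Hypothetical per-desktop baselines observed in a PoC (illustrative values).
poc_baseline = {
    "vcpu": 1,         # average vCPUs per desktop
    "memory_gb": 2,    # average active memory per desktop
    "storage_gb": 30,  # OS image size
    "iops": 12,        # average steady-state I/Os per second per desktop
}

production_desktops = 500

# Naive linear extrapolation from the PoC to the production environment.
totals = {metric: value * production_desktops for metric, value in poc_baseline.items()}
print(totals)
# {'vcpu': 500, 'memory_gb': 1000, 'storage_gb': 15000, 'iops': 6000}
```

In practice, you would adjust these linear totals for usage types and for consolidation technologies such as View Composer, as discussed later in this section.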
The storage I/O information is very important and can be difficult to get a handle on. If you are dealing with a storage vendor that makes a distinction between virtual desktop environments and server virtualization environments, the vendor often has general sizing numbers for developing throughput specifications for Virtual Desktop Infrastructure (VDI) environments. What is unusual about virtual desktop environments is that two very different disk I/O conditions exist: burst and operational I/O. Burst I/O is more common in VDI environments because operational requirements necessitate mass reboots of desktop operating systems that are not typical in virtual server environments. Operational I/O can also be problematic if activities such as virus scanning are synchronized by time rather than randomized to reduce the performance hit on the VMs. Even if you are careful in randomizing the activity, antivirus scans often follow very specific patterns. In a physical desktop world, this impact is minimal; in a virtualized environment, it can be substantial.
Some storage vendors have a very utilitarian view of storage services; they do not view virtual desktop workloads as any more unique than other virtual workloads. The limitation with SAN vendors who do not differentiate between server and desktop virtualization environments is that, to guarantee good throughput, you may have to consider their enterprise-class storage systems.
Other storage vendors provide midtier solutions and solid-state drives (SSDs) to deal with burst I/O. Although this approach is better, it still requires you to adjust your design so that high I/O requirements are segregated onto volumes made up of SSDs. This leads to a very static design in which
you may or may not make good use of high-performance drives. A growing
number of options are available for I/O offload, such as cache cards
(for example, Fusion IO; http://www.fusionio.com) or memory-based virtual appliance proxies for consolidating and dealing with I/O (such as Atlantis Computing’s ILIO product; http://www.atlantiscomputing.com).
In addition, storage vendors have designed solutions for virtualization
consolidation and more specifically around the high I/O of virtual
desktop workloads.
Most recently, storage vendors have started to build midtier storage systems that have some of the features of enterprise-class systems, such as dynamic tiering. Dynamic tiering is the capability to move hot data, or data that is in demand, to high-performance drives so that the SAN delivers great performance. This activity can typically be done on the fly or scheduled to happen periodically during the day. These solutions are ideal for virtual desktop environments because they do not require the premium of enterprise-class storage systems but still deliver the features. EMC has clearly targeted the VNX line to provide features that make it ideally suited for virtual workloads. Of course, companies such as NetApp have been using Performance Acceleration Module (PAM) cards for years to deal with burst I/O.
Whichever solution you select, here are a few general considerations for putting together your design.
The difference between SAN solutions designed specifically for virtualization consolidation and some of the I/O offload products is in their application, although they can be used to complement each other. If you are building a large virtual desktop environment and you have the option of architecting a dedicated SAN, you can plan for high I/O conditions. If you are integrating into a SAN framework shared across the entire organization, you may have to offload or boost the I/O provided.
Each SAN vendor has very different numbers when estimating I/O for virtual server and virtual desktop workloads. It is best to have your own reference numbers based on internal testing. Use these numbers to make sure the estimates provided meet your requirements.
Burst I/O and operational I/O are treated distinctly by most storage vendors. For example, if your numbers estimate that your environment may generate 15,000 burst I/Os and require 4 TB of storage, the vendor may suggest six SSD drives (6 × 2,500 I/Os each = 15,000 burst I/Os, excluding RAID considerations) and approximately 12 of the 450 GB SAS drives to meet your operational I/O and total storage capacity. In this way, I/O and storage capacity are treated distinctly by the configuration.
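The following Python sketch reproduces that vendor-style arithmetic. The per-drive figures (2,500 I/Os per SSD, 450 GB per SAS drive) come from the example above; RAID overhead and operational I/O are deliberately left out, which is why the capacity-only SAS count comes in below the vendor's suggested 12 drives.

```python
import math

# Requirements from the example above.
burst_io_required = 15_000
capacity_required_tb = 4

# Per-drive assumptions from the text (RAID overhead excluded).
ssd_io_each = 2_500
sas_drive_gb = 450

# SSDs are sized for burst I/O; SAS drives are sized here for capacity only.
ssd_count = math.ceil(burst_io_required / ssd_io_each)
sas_count = math.ceil(capacity_required_tb * 1_000 / sas_drive_gb)

print(f"SSDs for burst I/O: {ssd_count}")             # 6
print(f"SAS drives for capacity alone: {sas_count}")  # 9; the vendor's 12 also covers operational I/O and RAID
```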
Ensure that your virtual desktop design
incorporates the SAN environment. A good design should provide
consistent performance over the lifetime of the solution (typically
three years). Achieving this result is not possible if you build a
great VDI design that does not set specific requirements for storage.
Although your VDI environment may run great during the first year, you
may see high SAN utilization lead to problems over time.
Separate your expected read and write I/Os. Multiply the number of writes by 4 to allow for an I/O penalty on writes. For example, if you expect 2,000 reads and 2,000 writes, multiply the writes by 4 for a total of 10,000 expected I/Os (2,000 read I/O + 8,000 write I/O).
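A minimal sketch of that write-penalty calculation, using the numbers from the example above:

```python
# Read/write split from the example above; the factor of 4 is the
# write penalty described in the text.
expected_reads = 2_000
expected_writes = 2_000
write_penalty = 4

total_io = expected_reads + expected_writes * write_penalty
print(total_io)  # 10000 (2000 read I/O + 8000 write I/O)
```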
One of the unique features of ESXi is the capability to use local SSDs. If you incorporate local SSDs in your design, you can offset a large portion of your I/O requirement for shared storage. Doing so requires a little more consideration because local SSD drive partitions are not shared between ESXi hosts as SAN storage is. Because these virtual desktops would be localized, you would have to ensure that any data is nonpersistent in nature.
The design of VMware View can change
dramatically because of the support of SSD drives in vSphere. Where
before you spent a lot of time ensuring that the storage provided
adequate throughput, now you have the option of also designing
nonpersistent or floating VMs on localized SSD drives.
By factoring in both local and SAN options, you can reduce the overall price per desktop. This reduction can be considerable depending on the percentage of persistent versus nonpersistent or floating desktops. SSDs change the framework considerably because they can provide incredible read I/O performance and impressive write performance. Although different benchmarks produce a variety of results, it is not uncommon for SSD drives to deliver 25,000–30,000 read I/Os and 4,000–5,000 write I/Os. The only drawback with SSD drives is that they are still relatively expensive and still have a limited amount of storage space, although this situation gets better every year. As of the time of this writing, an SSD with 600 GB of space is available.
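To see how those drive figures translate into desktop density, here is a rough, hypothetical estimate of how many nonpersistent desktops a single local SSD could support from an I/O standpoint; the per-desktop I/O rate and read/write split are assumptions, and the drive numbers are the conservative end of the ranges quoted above.

```python
# Conservative end of the SSD performance ranges quoted above.
ssd_read_io = 25_000
ssd_write_io = 4_000

# Hypothetical per-desktop workload: 12 I/Os per second, 60% writes.
per_desktop_io = 12
write_ratio = 0.6

read_limit = ssd_read_io / (per_desktop_io * (1 - write_ratio))
write_limit = ssd_write_io / (per_desktop_io * write_ratio)

# Write I/O is the constraint here: roughly 555 desktops per drive,
# before capacity, CPU, and memory limits are considered.
print(int(min(read_limit, write_limit)))
```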
VMware provides a reference architecture for stateless virtual desktops that uses SSD drives. It is not possible to apply this reference architecture as is to production, however, because most environments consist of both stateful and stateless virtual desktops. Using local SSDs is an option in vSphere 5 but does require some additional planning in your View architecture because you will have components of the virtual desktop environment configured on local SSDs, as shown in Figure 1.
Figure 1. Using local SSDs is possible in vSphere 5.
You would use local SSD drives for stateless or nonpersistent desktops and fan out the number of desktops to reduce the overall risk in a production deployment. Persistent (stateful) desktops and any critical components would reside on the SAN, and the local SSDs would be used for low-storage, high-I/O desktops like those provided through View Composer. This design, while possible, is not all that common because most SAN solutions now incorporate SSDs. The trade-off, however, is that at a certain scale one approach is likely to be more cost effective than the other.
Even with the best underlying measurements, you should always factor in the usage types of the users consuming the virtual desktop environment. Generally speaking, usage types fall into three broad categories: low-, medium-, and high-end users. The point of planning for these broad categories of users is to make allowances in the hardware specifications. For example, say that from your PoC environment you identify that most virtual desktop sessions are using about 2 GB of memory and a single vCPU with a 30 GB OS image. Rather than plan on the average, you should adjust the average to reflect the usage types mentioned.
Taking an example, say that the production
environment will service 500 desktops. As the IT architect for the
company, you know that a large percentage of these desktops will go to
engineers and designers, so out of the 500 seats you expect that 40% of
those will be high-end users. The next largest portion of users has an
average usage requirement and makes up another 40% of the population.
The remaining 20% are extremely light users of the system. Your
expected high-end desktop requirement is 2 vCPUs and 6 GB of memory,
and your low-end user requirement is 1 GB and 1 vCPU. If you break this
out, the planning starts to look like Table 1.
Table 1. User Segmentation
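A short sketch makes the arithmetic behind this segmentation explicit; the shares and per-segment specifications come from the example above, with the medium users assigned the PoC averages.

```python
# User segmentation from the example: 500 desktops split across three
# usage types. Medium users get the PoC averages (1 vCPU, 2 GB).
total_desktops = 500

segments = {
    #           (share, vCPUs, memory GB)
    "high-end": (0.40, 2, 6),
    "medium":   (0.40, 1, 2),
    "low-end":  (0.20, 1, 1),
}

total_vcpus = total_memory_gb = 0
for name, (share, vcpus, mem_gb) in segments.items():
    desktops = round(total_desktops * share)
    total_vcpus += desktops * vcpus
    total_memory_gb += desktops * mem_gb
    print(f"{name}: {desktops} desktops, {desktops * vcpus} vCPUs, {desktops * mem_gb} GB")

print(f"Total: {total_vcpus} vCPUs, {total_memory_gb} GB of memory")
# high-end: 200 desktops, 400 vCPUs, 1200 GB
# medium: 200 desktops, 200 vCPUs, 400 GB
# low-end: 100 desktops, 100 vCPUs, 100 GB
# Total: 700 vCPUs, 1700 GB of memory
```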
You can use the usage types to further refine your design and ensure your hardware estimates are accurate. You can then take the information gathered through either the capacity planning analysis or the PoC and adjust it to factor in these usage types. This step is necessary because both capacity planning and PoC environments tend to provide a snapshot of usage rather than the full picture. It is very difficult to ensure that you have captured data that represents exactly what you will see in production. There is really no single tool to do this, so you must combine what you know about the environment with your metrics to develop your hardware requirements. You can automate a good portion of this process by using tools such as the ones available from www.liquidwarelabs.com and www.lakesidesoftware.com.
If you are engaged in a desktop replacement strategy in which you must be able to justify costs versus risks versus benefits, you might need to oversubscribe resources in your design. Justifying your design based on the cost per desktop and return on investment is a typical activity when you build a business case for VDI. With the focus on austerity and the general move to reduce overall costs, it is important that you be able to speak to the cost per desktop. You may need to run the environment at a higher rate of utilization to get a better price per VM or View desktop. For example, if you develop a conservative specification of 50% utilization, the hardware required to scale the environment may be cost prohibitive. You might need to oversubscribe the underlying physical resources to ensure the solution is both scalable and cost effective.
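As a purely hypothetical illustration of why oversubscription matters to the business case, the following sketch compares cost per desktop at a conservative utilization target versus an oversubscribed one; the host cost and desktop densities are assumptions, not recommendations.

```python
# Hypothetical fully loaded cost of one ESXi host and two sizing options.
host_cost = 12_000
desktops_conservative = 60     # sized to a conservative 50% utilization target
desktops_oversubscribed = 100  # higher utilization through oversubscription

print(f"Cost per desktop (conservative):   ${host_cost / desktops_conservative:,.0f}")
print(f"Cost per desktop (oversubscribed): ${host_cost / desktops_oversubscribed:,.0f}")
# Cost per desktop (conservative):   $200
# Cost per desktop (oversubscribed): $120
```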
When you build scaled-out VDI environments, it is important to develop your specifications in blocks, or logical groupings of servers, storage, and software. For example, if you are scaling your solution to 10,000 desktops, you should know that a block of 5,000 desktops requires 50 servers (an average of 100 desktops per server), 14 TB of storage, and 100 licenses of vSphere Enterprise. Designing your solution to scale in building blocks that equal a certain number of virtual desktops with a fixed amount of resources makes the solution much easier to grow. In this way, your capital costs can remain consistent during your desktop replacement strategy.
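A small sketch of that block-based scaling, using the figures from the example above (a 5,000-desktop block of 50 servers, 14 TB of storage, and 100 vSphere Enterprise licenses):

```python
import math

# One building block, using the figures from the example above.
block = {
    "desktops": 5_000,
    "servers": 50,  # an average of 100 desktops per server
    "storage_tb": 14,
    "vsphere_enterprise_licenses": 100,
}

target_desktops = 10_000
blocks_needed = math.ceil(target_desktops / block["desktops"])

totals = {resource: amount * blocks_needed for resource, amount in block.items()}
print(f"{blocks_needed} blocks -> {totals}")
# 2 blocks -> {'desktops': 10000, 'servers': 100, 'storage_tb': 28,
#              'vsphere_enterprise_licenses': 200}
```

Scaling in fixed blocks like this is what keeps the capital cost per desktop consistent as the deployment grows.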