Saturday, January 12, 2008

Choosing the right virtualization technology for your environment

Fewer and fewer IT organizations are asking whether they should virtualize systems. The focus is now on how they should leverage virtualization in their environments. The maturation of virtualization solutions in the x86 and UNIX realms has opened the door to an endless array of choices. More choices offer organizations greater flexibility, but they can also introduce confusion and complexity. Every virtualization technology operates in a slightly different manner, and this is compounded by the fact that every IT environment is vastly different, with its own unique operating patterns, technical composition, and business constraints. Because of this, there will probably never be one ideal virtualization technology for every IT scenario, so it's better to focus resources on choosing the right technology for a specific situation.

Following are six factors to consider when evaluating virtualization software.

1. Mobility and Motioning
Motioning enables applications to move between physical servers without disruption. Available through VMware's VMotion, XenMotion, and IBM P6 LPARs, motioning has the potential to transform capacity management. However, it's not without its problems. Motioning can introduce volatility and create vexing challenges for management groups tasked with incident management and compliance issues. To gauge whether motioning is a good option in their environment, organizations first need to analyze maintenance windows, the consistency of workload patterns, and disaster recovery strategies.

Maintenance windows - When applications are combined on a single physical platform, their maintenance windows become intermingled. This can easily create scenarios where there is no window of time available for hardware maintenance. The same problem arises for software freezes. The ability to motion virtual machines can alleviate this problem by allowing workloads to be moved off a host so it can be taken offline for scheduled maintenance or software updates. Alternatively, without motioning in place, the proper initial placement of applications on virtual hosts is extremely important. In either case, making the right placement decisions is critical, since the mere act of motioning may constitute a change that violates a software freeze.
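To illustrate, a quick check of whether the VMs placed on one host still share a common maintenance slot might look like the following sketch. The VM names, weekly windows, and one-hour granularity are all hypothetical.

    # Sketch: find maintenance hours shared by every VM placed on one host.
    # The VM names and weekly windows (day, start hour, end hour) are hypothetical.

    windows = {
        "app01": [("Sun", 1, 5)],                     # app01 may go down Sunday 01:00-05:00
        "app02": [("Sun", 3, 7)],
        "db01":  [("Sat", 22, 24), ("Sun", 0, 2)],
    }

    def hours(slots):
        """Expand (day, start, end) slots into a set of (day, hour) tuples."""
        return {(day, h) for day, start, end in slots for h in range(start, end)}

    common = None
    for vm, slots in windows.items():
        common = hours(slots) if common is None else common & hours(slots)

    if not common:
        print("No shared maintenance window -- placement needs rework or motioning")
    else:
        print("Shared hours:", sorted(common))

An empty intersection is exactly the "no window available" scenario described above.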

Consistency of workload patterns - The advantages of motioning may vary widely depending on the level of volatility in workload patterns. It can be very useful to leverage spare capacity in highly volatile workloads. However, those benefits diminish in low-volatility scenarios.

Organizations can analyze ideal placements based on the variation in utilization patterns across a day, a week, or both. If patterns do not vary widely from day to day, a static placement may be sufficient and the volatility of motioning avoided. If the patterns differ significantly from day to day, a more dynamic solution is warranted.
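One rough way to quantify this is to compare each server's hourly utilization profile from one day to the next. The sketch below uses made-up CPU numbers and an arbitrary 10-point volatility threshold.

    # Sketch: decide static vs. dynamic placement from day-to-day variation
    # in hourly CPU utilization. Data and the 10-point threshold are illustrative.

    daily_profiles = {                      # % CPU per hour, one list per day
        "web01":   [[20, 25, 60, 65], [22, 24, 58, 63], [21, 26, 61, 64]],
        "batch01": [[5, 80, 10, 5],   [70, 8, 6, 90],   [10, 10, 85, 7]],
    }

    VOLATILITY_THRESHOLD = 10.0             # mean absolute hour-by-hour difference

    for server, days in daily_profiles.items():
        diffs = []
        for a, b in zip(days, days[1:]):    # compare consecutive days hour by hour
            diffs += [abs(x - y) for x, y in zip(a, b)]
        volatility = sum(diffs) / len(diffs)
        plan = "dynamic (motioning)" if volatility > VOLATILITY_THRESHOLD else "static"
        print(f"{server}: volatility {volatility:.1f} -> {plan} placement")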

Disaster recovery strategy - If application-level replication or hot spares are part of the disaster recovery plan, motioning may undermine these efforts. For example, one might inadvertently place a production server in the same locale as its disaster recovery counterpart. To avoid such pitfalls, organizations should conduct a detailed analysis of disaster recovery strategies, roles, cluster strategies, cluster roles, and replication structures.
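A placement audit along these lines can be as simple as checking that no production VM shares a site with its designated disaster recovery partner. The names and sites below are hypothetical.

    # Sketch: flag placements where a production VM and its disaster recovery
    # counterpart end up in the same location. All names are hypothetical.

    placement = {           # VM -> (role, DR partner, data center)
        "erp-prod": ("production", "erp-dr",   "DC-East"),
        "erp-dr":   ("dr",         "erp-prod", "DC-East"),   # problem: same site
        "crm-prod": ("production", "crm-dr",   "DC-East"),
        "crm-dr":   ("dr",         "crm-prod", "DC-West"),
    }

    for vm, (role, partner, site) in placement.items():
        if role == "production" and placement[partner][2] == site:
            print(f"WARNING: {vm} and {partner} are both in {site}")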

2. Overhead and Scalability
There are numerous aspects of an operational model that may impact the success of virtualization. These include the way I/O is handled, the maximum number of CPUs per VM, and the way vendors license their software on the platform. Organizations can address these overhead and scalability concerns by considering the following factors.

I/O rates - Software components such as database servers that are I/O intensive may be better suited to virtualization technologies that don't use virtual device drivers, as these device drivers "tax" the CPUs with every I/O transaction they perform, causing the system to hit its limits before it otherwise would. Techniques such as VMware's raw device mapping also provide higher efficiency in this area, but the use of such features prevents motioning.

To determine the best approach, organizations can use a strategy-specific overhead model that adds up CPU utilization numbers based on the I/O activity on the physical servers. This is an easy way to catch any workload types that are unsuitable for a given virtualization solution.
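One simple form such a model can take is a projected utilization figure per server that adds an I/O-driven overhead term to the measured CPU load. The cost-per-I/O factor and the 80% planning limit below are illustrative assumptions, not published figures.

    # Sketch: estimate virtualization CPU overhead from I/O activity.
    # The per-I/O cost factor and workload figures are illustrative placeholders.

    CPU_COST_PER_KIOPS = 2.0      # extra % CPU per 1,000 I/O per second (assumed)
    HOST_CPU_LIMIT = 80.0         # don't plan beyond 80% projected utilization

    workloads = {                 # server -> (measured % CPU, I/O per second)
        "appserver01": (35.0, 1200),
        "dbserver01":  (45.0, 22000),
    }

    for server, (cpu, iops) in workloads.items():
        projected = cpu + (iops / 1000.0) * CPU_COST_PER_KIOPS
        verdict = "ok" if projected <= HOST_CPU_LIMIT else "unsuitable for this strategy"
        print(f"{server}: projected {projected:.1f}% CPU -> {verdict}")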

Non-compute intensive applications - Theoretically, it is possible to place many non-compute intensive applications together on a virtual host. However, there are many factors that may limit the scalability of this scenario. Moreover, pinpointing which factors are constraining the environment can be tricky.

The first step is to apply a CPU "quantization" model. If a technology uses a rigid virtual-CPU-per-physical-CPU model, the number of virtual systems is limited by the number of physical CPUs. While this issue is fading as newer models allow fractions of physical CPUs to be allocated, it is still prudent to be aware of this constraint to prevent unpleasant surprises.
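The difference between a rigid and a fractional allocation model can be shown with a quick calculation; the 16-CPU host, the 0.3-CPU requirement per VM, and the 20% headroom are assumed figures.

    # Sketch: how many small VMs fit on a 16-CPU host under rigid vs.
    # fractional CPU allocation. Figures are illustrative assumptions.

    physical_cpus = 16
    vm_cpu_need = 0.3           # each VM really needs about 0.3 of a CPU

    # Rigid model: every VM is pinned to at least one whole physical CPU.
    rigid_capacity = physical_cpus          # caps the host at 16 VMs

    # Fractional model: CPU can be carved into partial entitlements
    # (leaving ~20% headroom for the hypervisor and peaks).
    fractional_capacity = int(physical_cpus * 0.8 / vm_cpu_need)

    print(f"Rigid vCPU:pCPU model: {rigid_capacity} VMs")
    print(f"Fractional allocation: {fractional_capacity} VMs")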

Memory is a more complex part of the equation. Applications that aren't doing very much will often utilize the same amount of memory as similar applications that are more active. Combining even a small number of these applications can quickly tax the memory capacity of the target system while making very little impact on CPU utilization.

The scalability of the underlying architecture also complicates matters. Some platforms buckle when running too many images, regardless of what those images are doing. Others leverage robust backplane interconnects, caching models, and high context-switching capacity to host a maximum number of virtual machines without compromising reliability. To determine whether moving to "fat nodes" makes sense, organizations should factor in platform scalability and the workload blend.
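Putting the CPU, memory, and platform limits together, the effective consolidation ratio is set by whichever constraint is hit first. The host capacities and per-VM footprint in this sketch are hypothetical.

    # Sketch: the binding constraint on consolidation is whichever resource
    # runs out first. Host capacities and per-VM footprints are hypothetical.

    host = {"cpu": 16.0, "memory_gb": 64.0, "max_images": 40}   # platform limits
    per_vm = {"cpu": 0.2, "memory_gb": 3.5}                     # lightly loaded app

    limits = {
        "CPU": int(host["cpu"] / per_vm["cpu"]),
        "memory": int(host["memory_gb"] / per_vm["memory_gb"]),
        "platform image limit": host["max_images"],
    }

    binding = min(limits, key=limits.get)
    print(limits)                           # {'CPU': 80, 'memory': 18, 'platform image limit': 40}
    print(f"Consolidation limited by {binding}: {limits[binding]} VMs per host")

In this example the host runs out of memory long before it runs out of CPU, which is typical when consolidating lightly loaded applications.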

3. Software Licensing Models
Some applications are not supported on specific virtualization technologies. Even if support isn't an issue, however, software licensing models may play a major role in the ROI gained from virtualization. If, for example, applications are licensed per physical server, this drastically reduces the potential gains from virtualization. It can leave businesses seeking a physical configuration that will support the workload at a reasonable licensing cost, typically requiring abandonment of vertically scaled infrastructures in favor of smaller, commoditized servers deployed in a horizontally scaled fashion.
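A back-of-the-envelope comparison makes the effect visible. The sketch assumes a licensing model in which the fee scales with the physical CPU count of the host the application touches, with purely illustrative prices.

    # Sketch: one licensed application VM placed on a large shared host can be
    # forced to license the entire physical box. Figures are illustrative.

    license_per_cpu = 10000              # assumed fee per physical CPU of the host

    host_options = {
        "2-socket commodity server": 2,
        "16-socket fat node (shared with other VMs)": 16,
    }

    for host, cpus in host_options.items():
        print(f"{host}: license cost ${cpus * license_per_cpu:,}")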

4. Security
Organizations often have little guidance for ensuring security in a physical-to-virtual transition, as there are no published guidebooks or best practices for securing a virtual environment. The following are key considerations to minimize risk.

Security zones - Mixing security zones in a virtual environment is a bad idea, since most virtualization technologies don't provide for strong enough security isolation models. For example, it wouldn't make sense to place systems that are connected to sensitive internal networks on the same physical host as those connected to a DMZ.

In addition, many virtualization solutions have administrator-level roles that allow viewing of the disk images of all virtual machines. This introduces significant vulnerabilities by allowing sensitive security zones to be bridged. The problem is exacerbated by virtualization solutions with internal network switches that control traffic between VMs on the same physical host. These switches allow virtual systems to completely bypass the established port-level firewall filters, deep packet inspection, and QoS rules governing traffic in the environment, opening it up to threats that cannot be detected by network-level security tools.
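A basic audit for this risk is to verify that no physical host carries VMs from more than one security zone. The inventory data below is hypothetical.

    # Sketch: verify that no physical host mixes security zones.
    # Host names, VM names, and zone labels are hypothetical.

    inventory = [                 # (physical host, VM, security zone)
        ("esx01", "web-dmz-1", "DMZ"),
        ("esx01", "hr-db-1",   "internal"),     # problem: DMZ and internal mixed
        ("esx02", "web-dmz-2", "DMZ"),
    ]

    zones_per_host = {}
    for host, vm, zone in inventory:
        zones_per_host.setdefault(host, set()).add(zone)

    for host, zones in zones_per_host.items():
        if len(zones) > 1:
            print(f"WARNING: {host} mixes security zones: {sorted(zones)}")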

Information privacy - Many virtualization technologies allow access to information stored in offline virtual images simply by mounting them as disk images. While this gives users added convenience, it also introduces considerable liabilities: copying a virtual image can be as damaging as walking off with a hard drive. This also makes it all the more important to be careful when virtualizing any application that leaves residual data in temp files or other local storage.

5. Financial Differences
Savvy businesses run "what if" scenarios to determine the most cost-effective virtualization solution for their environment, taking into consideration licensing costs, implementation expenses, and hardware/software savings. For example, the cost of implementation varies widely based on the cost of transitioning to next-generation servers and storage, application re-engineering, and so on. Moreover, if application-level changes are required, costs can skyrocket. As a general rule, if functional or user acceptance testing is involved, ROI quickly disappears.
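A simple "what if" model can put such scenarios side by side. Every figure below is a placeholder chosen to show the mechanics of the comparison, not a real cost.

    # Sketch: compare virtualization scenarios on rough costs and savings.
    # Every figure is a placeholder to illustrate the mechanics of the comparison.

    scenarios = {
        "lift-and-shift, no app changes": {
            "new_hw_and_storage": 150000, "migration_labor": 40000,
            "testing": 10000, "annual_savings": 90000,
        },
        "re-engineer apps during the move": {
            "new_hw_and_storage": 150000, "migration_labor": 120000,
            "testing": 180000,            # functional + user acceptance testing
            "annual_savings": 110000,
        },
    }

    for name, s in scenarios.items():
        upfront = s["new_hw_and_storage"] + s["migration_labor"] + s["testing"]
        payback_years = upfront / s["annual_savings"]
        print(f"{name}: upfront ${upfront:,}, payback {payback_years:.1f} years")

In this made-up example, the scenario that triggers functional and user acceptance testing nearly doubles the payback period, which matches the rule of thumb above.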

Another consideration is the type of hardware in use. Virtualizing high-density physical infrastructures such as blades vastly reduces the server footprint, but the cost of the associated cooling systems may outweigh the benefits. Likewise, fat nodes and large vertically scaled servers offer high scalability and efficiency but come with higher up-front costs. Commoditized rack-mounted servers are simple and easy to deploy, but because they share fewer hardware components they may offer weaker economies of scale.

6. Chargeback Models
An important and often unexpected issue arising from virtualization is the lack of viable chargeback models. Without them, virtual environments must be designed so that they do not cross departmental boundaries. If chargebacks are in place, the solution must provide a way to obtain accurate utilization information to ensure equitable billing for compute resources.
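When chargebacks are required, the billing logic can be as simple as splitting each shared host's cost in proportion to measured usage. The sketch below assumes CPU-hours as the metric and uses made-up figures.

    # Sketch: split a shared host's monthly cost across departments in
    # proportion to measured CPU-hours. All figures are illustrative.

    monthly_host_cost = 6000.0
    cpu_hours = {                  # department -> CPU-hours consumed this month
        "finance":   1200.0,
        "marketing":  300.0,
        "hr":         500.0,
    }

    total = sum(cpu_hours.values())
    for dept, used in cpu_hours.items():
        charge = monthly_host_cost * used / total
        print(f"{dept}: {used:.0f} CPU-hours -> ${charge:,.2f}")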

Conclusion
The sheer volume of virtualization offerings makes choosing the right solution a confusing and often overwhelming task. Having the foresight to analyze the key factors affecting the virtualization effort enables businesses to avoid critical missteps. Moreover, conducting "what if" scenarios that analyze business and technical constraints, from security to workloads, drives more informed decision making that will ultimately mean the difference between success and failure in this complex arena.
