However, to get the most out of server virtualization, it is critical that other elements of the infrastructure complement the environment, especially storage. Otherwise, a lot can go wrong. Applications can unexpectedly slow to a crawl. What is billed as a cost-lowering computing alternative can require significant investment to achieve full functionality. And using virtualization to improve application and server uptime can suddenly reveal painful weaknesses in other areas of your IT infrastructure. Let's look at two of the most common pitfalls when it comes to virtualization.
Pitfall #1: Choosing the incorrect storage platform
One of the major benefits of server virtualization is the ability to move live guest applications between hypervisors running on different machines. Whether this is done for scheduling, load balancing or disaster recovery, hardware independence is usually one of the major drivers behind any virtualization implementation. However, if your storage is tied to specific server hardware, moving applications becomes considerably more complex, if not pointless.
Network-attached storage (NAS) is often used as a way to simplify storage provisioning for virtualized servers. NAS volumes are simple to implement and can be grown without involving the hypervisor. Unfortunately, NAS has its own performance limitations, and many applications (such as Microsoft Exchange) do not work well with NAS at all. For these reasons, most virtualization vendors recommend Storage Area Network (SAN) storage for anyone looking for better application performance.
Fibre Channel SANs
With Fibre Channel (FC) SANs, users not only need to justify the increased cost of FC storage, switching and administration, but they also need to invest in costly host bus adapters (HBAs) for each server they connect to the SAN. Companies with an existing FC SAN are not necessarily in the clear either. To capture the major benefits of server virtualization, the complete FC infrastructure (including switches and HBAs) needs to support N_Port ID Virtualization (NPIV), which excludes a large proportion of existing products.
Even with NPIV, VMware can only transfer guests between machines within a single FC zone. This means that, while the user has achieved hardware independence on the server side, every physical server in a group capable of transferring guest applications between one another depends on a single FC zone (usually a single array, or even one disk) for storage. Hardware independence on the server side can therefore result in a dangerous multi-application hardware dependence on the storage side.
Optimal storage solutions for a virtualized environment
The Internet Small Computer System Interface (iSCSI) or IP SAN offers the best storage solution in a virtualized server environment, not only because of its obvious cost advantage, but also in terms of the availability, agility and scalability of the virtual architecture. An iSCSI SAN storage system also offers significant advantages for companies using virtualization for wide area disaster recovery: snapshots can be taken at the storage level and used to replicate data to a local or remote secondary site.
Additionally, an iSCSI SAN storage system has a significant WAN advantage over an FC SAN storage system. Wide area replication of FC storage requires the purchase of costly Fibre Channel over IP (FCIP) gateways, whereas wide area replication for iSCSI SAN storage requires no additional systems to acquire, implement, operate and manage: iSCSI runs over TCP/IP and works natively across a WAN. Both FC and iSCSI WAN replication are subject to throughput droop over distance or from packet loss. For iSCSI SAN storage, this can be mitigated with WAN or TCP/IP optimizers; the same optimizers have little to no effect on FCIP gateways.
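The droop itself is a property of TCP: the widely cited Mathis approximation bounds a single TCP stream's throughput by roughly MSS / (RTT x sqrt(loss rate)), so throughput falls off sharply as round-trip time and packet loss grow. The short sketch below only illustrates that trend; the MSS, RTT and loss figures are assumptions, not measurements of any particular link.

    # Illustrative only: the Mathis et al. approximation for a single long-lived
    # TCP stream, throughput <= (MSS / RTT) * (C / sqrt(p)). The RTT and loss
    # values below are assumptions chosen to show the trend, not measurements.
    from math import sqrt

    MSS = 1460   # bytes per TCP segment (typical Ethernet MTU minus headers)
    C = 1.22     # constant from the Mathis approximation

    def tcp_throughput_mbps(rtt_s, loss_rate):
        """Upper bound on single-stream TCP throughput in Mbit/s."""
        return (MSS * 8 / rtt_s) * (C / sqrt(loss_rate)) / 1e6

    for rtt_ms in (5, 50, 100):              # LAN, regional and long-haul RTTs
        for loss in (0.0001, 0.001, 0.01):   # 0.01% to 1% packet loss
            bound = tcp_throughput_mbps(rtt_ms / 1000, loss)
            print(f"RTT {rtt_ms:>3} ms, loss {loss:.2%}: <= {bound:8.1f} Mbit/s")

The RTT- and loss-driven collapse this prints is exactly what the WAN and TCP/IP optimizers mentioned above are there to counteract.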
Pitfall #2: The oversubscription dilemma
Even with the correct SAN solution in place, applications moved to a virtualized environment can sometimes slow to a crawl. If the server hardware configuration is correct, this can leave administrators baffled as to the cause. In such cases, storage is often the culprit.
Many of the infrastructure efficiency savings of virtualization are achieved by using hypervisors to deliberately oversubscribe physical resources. Virtual guest applications are collectively allocated more resources than physically exist, on the principle that all of the applications are statistically unlikely to demand their full allocation at the same time. Used in moderation, the principle typically stands up in practice. However, most SANs and SAN storage already use oversubscription of their own, and the result of dual-layered oversubscription of physical storage resources can be disastrous.
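A quick back-of-the-envelope simulation shows why a single layer of oversubscription usually works: with a modest activity probability per guest, total demand rarely exceeds the physical pool. All of the numbers below (guest count, activity probability, oversubscription ratio) are illustrative assumptions, not vendor guidance.

    # Statistical multiplexing in miniature: oversubscribe a pool 4:1 and see
    # how often total demand actually exceeds physical capacity. Every figure
    # here is an illustrative assumption.
    import random

    random.seed(1)

    GUESTS = 40          # virtual guests sharing one physical resource pool
    P_ACTIVE = 0.15      # chance a guest needs its full share at any instant
    RATIO = 4            # 4:1 oversubscription
    CAPACITY = GUESTS // RATIO
    TRIALS = 100_000

    shortfalls = sum(
        1 for _ in range(TRIALS)
        if sum(random.random() < P_ACTIVE for _ in range(GUESTS)) > CAPACITY
    )
    print(f"Demand exceeds capacity in {shortfalls / TRIALS:.1%} of samples")

Halve CAPACITY in this toy model, to mimic a second layer of oversubscription hidden underneath the first, and the same workload blows past the physical pool most of the time rather than occasionally.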
With the storage infrastructure genuinely over-stretched, contention becomes a problem, bottlenecks occur and buffers overflow. To make matters more complicated for the administrator, these contention issues can occur at multiple levels within the storage infrastructure.
At the individual disk level, the queues for input/output (I/O) requests simply fill up. This problem is particularly acute with slower Serial Advanced Technology Attachment (SATA) drives, where the queue depth is typically no more than 32 requests, against the 256-512 request capacity found in Serial Attached SCSI (SAS) or FC disks. This means that companies looking to implement a virtual infrastructure who also want the option to use lower-cost SATA drives for tiered storage need a SAN solution that does not restrict their choice of disk on the back end.
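Little's Law (outstanding I/Os = arrival rate x time in system) gives a rough feel for how much sooner a 32-deep queue fills than a 256-deep one. The IOPS and latency figures in this sketch are illustrative assumptions only.

    # Little's Law applied to disk queues: the average number of I/Os in flight
    # equals the offered IOPS multiplied by the average latency. Compare that
    # number with the drive's queue depth. All workload figures are assumptions.
    SATA_QUEUE_DEPTH = 32
    SAS_FC_QUEUE_DEPTH = 256

    def ios_in_flight(offered_iops, avg_latency_s):
        """Average outstanding I/Os for a given load (Little's Law)."""
        return offered_iops * avg_latency_s

    scenarios = [
        ("light load", 500, 0.010),        # 500 IOPS at 10 ms
        ("busy load", 1500, 0.020),        # 1,500 IOPS at 20 ms
        ("contended load", 1500, 0.040),   # same IOPS, latency doubled by contention
    ]

    for name, iops, latency in scenarios:
        outstanding = ios_in_flight(iops, latency)
        print(f"{name}: ~{outstanding:.0f} I/Os in flight "
              f"(SATA limit {SATA_QUEUE_DEPTH}, SAS/FC limit {SAS_FC_QUEUE_DEPTH})")

In the contended case the outstanding count already exceeds a SATA drive's queue while still sitting comfortably inside the SAS/FC limit, which is the gap described above.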
At the storage logical unit number (LUN) level, the hypervisor itself typically carves a physical storage pool, or LUN, into multiple virtual LUNs, which are then assigned to different virtualized guest applications. The physical LUN cannot distinguish between these guest applications, so contention between them degrades storage performance.
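The consequence is a classic noisy-neighbour pattern: every guest's requests land in the same undifferentiated physical queue. The toy model below makes that concrete; the guest names and request counts are invented for illustration.

    # Toy model of virtual LUNs backed by one physical LUN: requests from every
    # guest are interleaved into a single queue with no per-guest identity or
    # priority, so one busy guest can soak up most of the service slots.
    # Guest names and request counts are made up.
    import random
    from collections import Counter, deque

    random.seed(7)

    workloads = {"guest-db": 80, "guest-web": 10, "guest-mail": 10}
    requests = [g for g, n in workloads.items() for _ in range(n)]
    random.shuffle(requests)                  # arrivals are interleaved

    physical_lun_queue = deque(requests)      # the LUN sees anonymous I/Os

    served = Counter()
    while physical_lun_queue:
        served[physical_lun_queue.popleft()] += 1   # strict first-come, first-served

    print(dict(served))   # guest-db takes 80% of the LUN's service capacity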
Similarly, oversubscription at the hypervisor level can also cause problems at the SAN infrastructure level with HBAs, initiators, ports and switches. These resources are often oversubscribed by a ratio of 8:1 or more by the SAN itself. The compound effect of this dual oversubscription can go beyond a performance drop and actually lead to request timeouts and application crashes.
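The compounding is simple arithmetic: multiplying a hypothetical 4:1 oversubscription at the hypervisor by the 8:1 fan-in quoted above leaves each guest a worst-case 1/32 of its nominal storage bandwidth. The hypervisor ratio here is an assumption for illustration.

    # Stacked oversubscription: worst-case share arithmetic. The 4:1 hypervisor
    # ratio is an assumed example; the 8:1 fabric fan-in is the figure quoted
    # in the text above.
    hypervisor_ratio = 4   # guests per physical share at the hypervisor (assumed)
    fabric_ratio = 8       # host ports per storage port in the SAN fabric

    effective_ratio = hypervisor_ratio * fabric_ratio
    print(f"Effective oversubscription: {effective_ratio}:1")
    print(f"Worst-case share of nominal bandwidth: {100 / effective_ratio:.1f}% per guest")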
Using virtualized SAN storage to contend with over-contention
One option is to switch off the storage virtualization function within the hypervisor and manually assign LUNs to each guest application. However, this is discouraged by many vendors and leads to a loss of key virtualization functions.
Another option is to deal with the problem from the storage side and reduce the levels of native oversubscription within the SAN architecture. With a physical SAN, this is complex and will dramatically reduce the efficiency of the SAN for non-virtualized hosts. With virtualized SAN storage, this reconfiguration is not only far simpler, but hypervisors can often be treated differently to physical hosts to optimize the overall SAN efficiency.
Indeed, a virtualized SAN can also be used to spread individual LUNs across multiple storage resources, alleviating the contention issues still further. Virtualized SAN storage provides SAN storage performance with NAS simplicity.