Today’s data centers are at a critical juncture in their development. The full potential of Web 2.0 and cloud computing technologies has been hindered by spiraling power costs, unprecedented complexity, and limitations in the existing IT architectures that support these technologies. Existing architectures were never designed to support the rapid growth of data, users and traffic in the Web 2.0 world. To address these challenges, the industry is beginning to move to “data center 2.0,” where new approaches to data management, scaling and power consumption give businesses the room they need to grow.
These 2.0 data centers leverage standard low-cost x86 servers, Gigabit Ethernet interconnect and open-source software to build scale-out applications with tiering, data and application partitioning, dynamic RAM (DRAM)-based content caching servers and application-layer node failure tolerance.
These loosely coupled architectures have enabled service scaling, but at a very high price. Today’s data centers are reeling from the high costs of power, capital equipment, network connectivity and space. They are also hindered by serious performance, scalability and application complexity issues.
Advances in multi-core processors, flash memory and low-latency interconnects offer tremendous potential improvements in performance and power at the component level, but adapting them to realize such benefits requires major engineering and research efforts. Because Web 2.0 and cloud computing enterprises must focus on their core business, higher-level building blocks are needed that can exploit these advanced technologies.
Multi-core processors
Multi-core processors place many processor cores and shared caches on a single chip, providing very high potential throughput for workloads with thread-level parallelism. To fully realize the benefits of advanced multi-core processors, applications and operating environments need many parallel threads with very fast switching between them. They also need to support memory affinity and use granular concurrency control to prevent serialization effects.
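As a simple illustration of granular concurrency control, the sketch below (in Go, with hypothetical names) shards an in-memory map across many locks so that threads running on different cores rarely contend on the same mutex; a single global lock, by contrast, would serialize every core. It is a minimal sketch of the technique, not any particular product’s implementation.

```go
// Granular concurrency control: split the key space into shards, each
// protected by its own mutex, so goroutines on different cores rarely
// contend. A single global lock would serialize them all.
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

const numShards = 64 // illustrative; often sized relative to core count

type shard struct {
	mu   sync.Mutex
	data map[string]string
}

type ShardedMap struct {
	shards [numShards]shard
}

func NewShardedMap() *ShardedMap {
	m := &ShardedMap{}
	for i := range m.shards {
		m.shards[i].data = make(map[string]string)
	}
	return m
}

// shardFor hashes the key so unrelated keys land on different locks.
func (m *ShardedMap) shardFor(key string) *shard {
	h := fnv.New32a()
	h.Write([]byte(key))
	return &m.shards[h.Sum32()%numShards]
}

func (m *ShardedMap) Put(key, value string) {
	s := m.shardFor(key)
	s.mu.Lock()
	s.data[key] = value
	s.mu.Unlock()
}

func (m *ShardedMap) Get(key string) (string, bool) {
	s := m.shardFor(key)
	s.mu.Lock()
	v, ok := s.data[key]
	s.mu.Unlock()
	return v, ok
}

func main() {
	m := NewShardedMap()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ { // many writers, little lock contention
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			m.Put(fmt.Sprintf("key-%d", n), "value")
		}(i)
	}
	wg.Wait()
	fmt.Println(m.Get("key-3"))
}
```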
Flash memory
Flash memory is a non-volatile computer memory that can be electrically erased and reprogrammed. Flash memory has many promising characteristics but also many idiosyncrasies. Flash memory offers access times that are one hundred times faster than those of hard disk drives (HDDs) and requires much less space and power than HDDs. It consumes only 1/100th the power of DRAM and can be packed much more densely, providing much higher capacities than DRAM.
Flash memory is also cheaper than DRAM and is persistent when written, whereas DRAM loses its content when the power is turned off. Flash memory can be organized into modules of different capacities, form factors and physical and programmatic interfaces.
However, flash memory access times are much slower than DRAM’s, and flash memory chips have write access behavior that is very different from their read access behavior. Flash memory can only be written in large blocks (~128KB), and a region must be erased before it is written. Flash memory also has limits on how many times it can be erased. As a result, small writes need to be buffered and combined into large blocks before writing (write coalescing), and block writes need to be spread uniformly across the total flash memory subsystem to maximize the effective lifetime (wear leveling).
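The sketch below shows write coalescing and wear leveling in deliberately simplified form; it is not any product’s flash translation layer, and the 128KB block size, in-RAM buffer and least-erased placement policy are illustrative assumptions only.

```go
// Illustrative write coalescing and naive wear leveling: small writes
// accumulate in a RAM buffer; once an erase-block's worth of data has
// been collected, it is written as one block to whichever flash block
// has been erased the fewest times.
package main

import "fmt"

const eraseBlockSize = 128 * 1024 // typical erase-block granularity

type flashBlock struct {
	eraseCount int
	data       []byte
}

type coalescingWriter struct {
	buffer []byte
	blocks []flashBlock // simulated flash device
}

// Write buffers a small write; flash is only touched once a full
// erase block has been accumulated.
func (w *coalescingWriter) Write(p []byte) {
	w.buffer = append(w.buffer, p...)
	for len(w.buffer) >= eraseBlockSize {
		w.flush(w.buffer[:eraseBlockSize])
		w.buffer = w.buffer[eraseBlockSize:]
	}
}

// flush erases and programs the least-worn block (naive wear leveling).
func (w *coalescingWriter) flush(block []byte) {
	target := 0
	for i, b := range w.blocks {
		if b.eraseCount < w.blocks[target].eraseCount {
			target = i
		}
	}
	w.blocks[target].eraseCount++ // erase-before-write
	w.blocks[target].data = append([]byte(nil), block...)
}

func main() {
	w := &coalescingWriter{blocks: make([]flashBlock, 8)}
	for i := 0; i < 1000; i++ {
		w.Write(make([]byte, 512)) // many small 512-byte writes
	}
	for i, b := range w.blocks {
		fmt.Printf("block %d erased %d times\n", i, b.eraseCount)
	}
}
```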
The latency, bandwidth, capacity and persistence benefits of flash memory are compelling. However, incorporating flash memory into system architectures requires specific design and optimization-starting at the application layer, throughout the operating environment and down to the physical machine organization.
Incorporating flash memory into overall system architecture
A very high degree of parallelism and concurrency control is required in the application and server operating environment to utilize the tremendous potential I/O throughput and bandwidth offered by advanced flash memory technology.
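For example, a flash device only approaches its rated throughput when many requests are outstanding at once. The sketch below shows that pattern with a pool of workers keeping a deep request queue; the readAt function, queue depth and request count are placeholders standing in for a real device read.

```go
// Keeping a flash device's queue deep: a pool of workers issues many
// concurrent 4 KB reads. readAt stands in for a real block-device or
// file read (e.g. os.File.ReadAt); the sizes are illustrative.
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// readAt is a stand-in for an actual 4 KB read at the given offset.
func readAt(offset int64, buf []byte) {}

func main() {
	const (
		workers  = 64     // outstanding requests; flash rewards deep queues
		requests = 100000 // total reads to issue
	)
	offsets := make(chan int64, workers)
	var done int64
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			buf := make([]byte, 4096)
			for off := range offsets {
				readAt(off, buf)
				atomic.AddInt64(&done, 1)
			}
		}()
	}
	for i := 0; i < requests; i++ {
		offsets <- int64(i) * 4096
	}
	close(offsets)
	wg.Wait()
	fmt.Println("completed reads:", atomic.LoadInt64(&done))
}
```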
In addition, the flash memory driver, controller and device must be optimized and tuned to match workload behavior, especially access size distributions and the required persistence semantics.
High-performance interconnects
Interconnects have come a long way since Ethernet first became popular in the 1980s. Bandwidth continues to increase while latencies are steadily getting smaller. Today, Gigabit Ethernet (GbE) is standard on most server motherboards. 10GbE is being used in data centers mostly as a backbone to consolidate gigabit links and is starting to gain traction as a point-to-point interconnect.
With latencies as low as a single microsecond between server nodes, it is feasible to distribute workloads across multiple servers and to replicate data to multiple server nodes to provide high availability and data integrity. Nevertheless, most applications available today were written under the assumption of high latency and low bandwidth. The software needed to manage data movement at such high speeds while running simultaneously on multiple server nodes is very complex.
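The sketch below shows the basic replication pattern in simplified form: a write is fanned out to several replica nodes in parallel and treated as durable once a majority acknowledge. The sendToReplica function and node addresses are placeholders for real RPCs over the interconnect; a production system would also have to handle timeouts, retries and failover, which is where most of the complexity lives.

```go
// Application-layer replication sketch: send a write to several replicas
// in parallel and report success once a simple majority acknowledge.
package main

import (
	"fmt"
	"sync"
)

// sendToReplica stands in for a real network call; it returns whether
// the replica acknowledged the write.
func sendToReplica(addr, key, value string) bool {
	return true
}

func replicateWrite(replicas []string, key, value string) bool {
	acks := make(chan bool, len(replicas))
	var wg sync.WaitGroup
	for _, addr := range replicas {
		wg.Add(1)
		go func(a string) {
			defer wg.Done()
			acks <- sendToReplica(a, key, value)
		}(addr)
	}
	wg.Wait()
	close(acks)

	got := 0
	for ok := range acks {
		if ok {
			got++
		}
	}
	return got >= len(replicas)/2+1 // simple majority quorum
}

func main() {
	replicas := []string{"10.0.0.1:7000", "10.0.0.2:7000", "10.0.0.3:7000"}
	fmt.Println("durable:", replicateWrite(replicas, "user:42", "profile-blob"))
}
```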
Loosely coupled scale-out architectures
A modern Web 2.0 or cloud computing scale-out deployment has a Web server tier and an application server tier at the front (sometimes merged together), and a reliable data tier at the back end, usually hosted on database servers, which are typically slow and expensive elements. These servers often run at very low CPU utilization because they block on HDD accesses and suffer lock serialization effects, and at low HDD capacity utilization because data must be laid out to minimize head movement and keep access latencies down.
Between the Web server tier and the back-end data tier are a content-caching tier and specialized application services, which may perform generic functions such as search, ad serving, photo storage and retrieval, and authentication, or functions specific to the enterprise. Completing a response to a customer interaction involves accessing a Web server, application servers, database servers and various other generic and specialized applications and servers.
Data centers generally require that user responses complete in less than a quarter of a second. This requirement is usually met with a DRAM caching tier consisting of servers filled with DRAM. Customer information, data retrieved from slow databases and common user interaction results are cached in this DRAM tier so they can be accessed very quickly.
Since the performance of a site can be improved dramatically through extensive caching, many racks of caching servers are typically deployed. Each holds a limited amount of DRAM, so the data must be partitioned among the caching servers. IT staff need to lay out the data carefully across these servers, which typically operate at very low network and CPU utilization because they simply store and retrieve small amounts of data.
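The partitioning itself is conceptually simple, as the sketch below shows: each key is hashed to one caching server, in the style of a memcached client. The server names are illustrative, and a production client would typically use consistent hashing so that adding or removing a server remaps only a fraction of the keys; the operational burden comes from sizing, laying out and rebalancing many such servers.

```go
// Client-side partitioning across a DRAM caching tier: each key is
// hashed to one caching server. Server addresses are illustrative.
package main

import (
	"fmt"
	"hash/fnv"
)

var cacheServers = []string{
	"cache01:11211",
	"cache02:11211",
	"cache03:11211",
	"cache04:11211",
}

// serverFor maps a key to one of the caching servers by hashing it.
func serverFor(key string) string {
	h := fnv.New32a()
	h.Write([]byte(key))
	return cacheServers[h.Sum32()%uint32(len(cacheServers))]
}

func main() {
	for _, key := range []string{"user:1001", "session:af93", "photo:778"} {
		fmt.Printf("%-14s -> %s\n", key, serverFor(key))
	}
}
```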
When loosely coupled scale-out architectures are examined closely, it becomes clear that the database and caching tiers suffer from very low utilization, high power consumption and excessive programmatic and administrative complexity, all of which contribute to a high total cost of ownership (TCO).
Challenges of utilizing new technologies
The new generation of commodity multi-core processors, flash memory and low-latency interconnects offers tremendous potential for Web 2.0 and cloud computing data centers. In practice, however, the benefits are limited by the extensive work required to put these technologies to use. Solving today’s severe performance, power, space and TCO challenges with them demands a significant effort: IT teams need to develop highly parallel middleware applications and a high-performance operating system, and then build and optimize numerous specialized configurations.
Adapting or inventing new deployment architectures to take advantage of the new technologies is a major undertaking, with large development and support costs that fall outside the core business of most enterprises. Fortunately, new higher-level building blocks are now being introduced that address these challenges.
This cannot happen soon enough, as exploding demand for services from today’s Web and cloud computing data centers has placed existing architectures and technologies on a collision course with service availability.
Dr. John Busch is President, CEO and co-founder of Schooner Information Technology. John has more than 25 years of industry experience. Prior to Schooner, he was research director of computer system architecture and analysis at Sun Microsystems Laboratories from 1999 through 2006. In this role, he led research explorations in chip multiprocessing, advanced multitier clustered systems for deployment of Internet-based services, and advanced HPC systems. He received the top President’s Award for Innovation at Sun, and oversaw many division technology transfers. Prior to Sun, John was VP of engineering and business partnerships with Diba, Inc., and was general manager of the Diba division after Sun acquired Diba in 1997.
From 1989 to 1994, he was co-founder and CTO/VP of engineering of Clarity Software, and led creation of advanced multimedia composition and communication products for Sun, HP and IBM computer systems. From 1976 to 1993, he led many successful R&D programs at Hewlett-Packard in Computer Systems Research and Development. John holds a Ph.D. in Computer Systems Architecture from UCLA, a Master’s degree in mathematics from UCLA, a Master’s degree in computer science from Stanford University, and attended the Sloan Program at Stanford University. He can be reached at dr.john.busch@schoonerinfotech.com.