Build Your Own SAN
We've been focusing a lot lately on end-user technology stories, but we felt that the large contingent of IT folks who frequent ExtremeTech and work in small to medium businesses might really appreciate this story. And for all of you tech enthusiasts who want to get a step ahead of the crowd and implement a Home SAN--don't laugh, many of you will be doing this in the future--you may want to read this story to understand what goes into building a SAN today. And you may also want to check out our earlier story called "
As recently as two years ago, the thought of building a Fibre Channel Storage Area Network (SAN) without the help of an integrator would have been foolhardy, if not impossible, because interoperability in the Fibre Channel realm was erratic. Note that most SANs are still based on Fibre Channel networking technology, but the emerging iSCSI standard and Fibre Channel over IP standards like iFCP are making inroads. In a future story, we'll get deep into iSCSI technology and implement iSCSI SANs. For this story, we'll stay focused on Fibre Channel SANs.
While the task of building a SAN can still be complicated and time consuming today, the good news is that we are finally at a state where this technology is accessible for mid-sized companies and even some small companies.
In this story, we'll walk through the major steps that all IT managers need to go through to build their SANs. In our first section we discuss the basic components of a Fibre Channel implementation and explain how to install and configure each of the components in a direct-attached storage configuration. Next, we'll discuss the challenges of storage networking and go over basic Fibre Channel switch configuration. In the end, we'll put together a small SAN implementation, with the basic structure seen below.
This story is designed to give readers an overview of the implementation of SANs, but it should be noted that there are several sources which we recommend for additional information. Robert W. Kembel's book, The Fibre Channel Consultant: A Comprehensive Introduction (published by Northwest Learning Associates, Inc in 1998) describes in exquisite detail the inner workings of Fibre Channel and its protocols. For readers who are looking for a high-level overview of Fibre Channel and SANs, we recommend Marc Farley's book, Building Storage Networks (published by Osborne/McGraw-Hill in 2000).
We'll focus on three main sections: SAN components and setting up direct-attached storage (DAS) with Fibre Channel; configuration of Fibre Channel storage networks; and performance testing a SAN.
Call for Consistency in
First and foremost, before a SAN implementation can begin, many hours of research should be done ahead of time. By combing through a vendor's Web site you can easily find out which operating systems are actively supported by that vendor and you can also learn which products are certified to work together.
The primary benefit of open standards--as opposed to proprietary technologies--is that you don't have to sell your soul to one vendor to get a working implementation. While Fibre Channel is indeed an open standard, its openness can stand some improvement. Interoperability in the SAN space is much improved compared to the past, but it is still far from being something we would characterize as plug-and-play across multiple vendors. Fibre Channel vendors and users will have to endure some interoperability glitches as the technology improves and becomes more mainstream, very similar to the development of Ethernet several years ago.
The key word for every SAN implementation should be consistency. Unlike Ethernet and other networking technologies, you cannot expect to buy random SAN components and slap them together with everything working efficiently out of the box. It is extremely important to remember that a poorly configured SAN can do more than abruptly stop your applications; it could lead to massive data loss and corruption.
For this reason we recommend that once you start building your SAN, you should try as much as possible to standardize on specific components to simplify troubleshooting and minimize the likelihood of interoperability-based failures.
Given the current state of the market in the Host Bus Adapter (HBA) segment, we recommend that you choose a single HBA vendor and stick with it for the long haul. While it is possible to implement HBAs from several vendors together in a single SAN, from a management standpoint an extra vendor is an added headache, since you need to stay on top of the new firmware, drivers and management utilities that are constantly released by different vendors. It is also important to be consistent with your drivers and firmware for the HBAs. Occasionally we've seen communications errors occurring because of mismatched firmware and drivers within a SAN.
For Fibre Channel switches, the recent E-port spec should allow you to uplink switches from different vendors to expand SANs. It should be noted, however, that the E-port spec is still not universally implemented by all of the switch vendors. We highly encourage you to check the switch vendor's site for interoperability information before making any purchasing decisions. As with HBAs, we recommend that you stick with one brand of switch, since most of the switch vendors (Vixel and Qlogic in particular) make excellent management tools which allow IT managers to manage all of the switches in their SAN from a single point (provided they all come from that vendor).
Storage arrays and tape libraries are the only components that are OK to mix and match a bit to take advantage of pricing wars between vendors. JBOD (Just a Bunch Of Disks) units are popular in testing labs because of their low cost, but these units don't provide redundancy unless you run software RAID from your server operating system. RAID vendors like Hitachi Data Systems, which last year entered a partnership with Sun Microsystems, make excellent high-end systems, while smaller vendors like MTI Corp. offer some interesting low-end to mid-range RAIDs.
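One way to stay on top of firmware and driver consistency is to keep a simple inventory and check it for mismatches. The following is a minimal sketch in Python, not a vendor tool: the host names and the version strings (`fw-A`, `drv-1`, and so on) are hypothetical placeholders, and the inventory would have to be filled in by hand or from whatever your HBA vendor's utility reports.

```python
from collections import defaultdict

# Hypothetical inventory: host name -> (HBA model, firmware, driver).
# The version strings are placeholders, not real release numbers.
inventory = {
    "server-a": ("LP8000", "fw-A", "drv-1"),
    "server-b": ("LP8000", "fw-A", "drv-1"),
    "server-c": ("LP8000", "fw-B", "drv-1"),
}


def find_mismatches(inventory):
    """Return HBA models that run more than one firmware/driver combination.

    The result maps each offending model to its combinations and the
    hosts using each one, so the odd host out is easy to spot.
    """
    combos = defaultdict(lambda: defaultdict(list))
    for host, (model, firmware, driver) in inventory.items():
        combos[model][(firmware, driver)].append(host)
    return {
        model: {combo: sorted(hosts) for combo, hosts in by_combo.items()}
        for model, by_combo in combos.items()
        if len(by_combo) > 1
    }


mismatches = find_mismatches(inventory)
for model, by_combo in mismatches.items():
    print(f"{model} has inconsistent firmware/driver levels:")
    for (firmware, driver), hosts in by_combo.items():
        print(f"  firmware={firmware} driver={driver}: {', '.join(hosts)}")
```

In this made-up inventory, server-c is flagged because its firmware differs from the other two hosts using the same adapter model.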
Installation and Definition of
Configuring a server to bridge between your Ethernet and Fibre Channel networks is a straightforward process. As long as you are prepared, consistent in your methodology, and keep in mind a few conceptual irregularities, these first steps can be fairly painless.
Begin by physically installing your Host Bus Adapter (HBA) in a 64-bit PCI slot in the server. Although 64-bit cards are generally backwards compatible with 32-bit slots, this usage will hamper performance to the extent that we would not recommend this configuration.
For this demonstration, we installed an Emulex LP8000 Fibre HBA in a couple of Dell PowerEdge 4-way and 2-way servers running Windows 2000 Server. As seen in the photo below, the Emulex card is equipped with an interchangeable GBIC (Gigabit Interface Converter), allowing you to swap ST and SC fibre connectors as needed (see more details on these connectors below).
It is prudent to label your HBA at this point. HBAs are quite similar in appearance to Gigabit Ethernet Fibre Adapters. As seen below, there are a few distinguishable differences between the types of adapters, but they may be hard to spot when the system is deployed in your server room or rack.
One best practice to consider is maintaining conformity in your installations between servers. Standardize on which adapter is installed in which slot (e.g., HBA in slot 1, GigE in slot 3) to avoid confusion while administering different machines. These guidelines will help you avoid many troubleshooting headaches later on.
As you start the computer, Windows 2000 should recognize your adapter and begin the driver installation process. If the plug-and-play installation does not begin automatically, click START / Settings / Control Panel / Add/Remove Hardware to start the installation. Although Windows 2000 comes with default drivers for many Fibre Channel HBAs, it's a good idea to make sure to have the correct and most current drivers handy--we downloaded version 1.3a1 of Emulex's LightPulse installation utility prior to installation from www.emulex.com. As evidenced below, the driver installation steps should be familiar to any Windows-aware administrator.
|Armed with the correct drivers, Windows 2000 makes this step a snap.|
To verify that installation completed successfully, check the Windows Device Manager. Right-click My Computer, choose Properties, go to the Hardware tab, and hit Device Manager. Your HBA should appear under the SCSI and RAID Controllers category. Below, we have expanded our device list to show that, indeed, the driver was installed successfully and appears as a SCSI (not network) device.
Next, we advise checking the vendor Web site for any HBA firmware updates. Given the still tenuous compatibility between products from different vendors, it is in your best interest to consult the documentation for relevant compatibility information and upgrade the firmware if necessary.
Connecting Cabling Between Devices
We are now ready to connect the cabling between devices. There are a few different types of fibre cables (see table below). Your selection should depend primarily on your distance and throughput needs. It is extremely important to remember that devices are designed to work with specific cables. Plugging a short wave cable into a switch designed for long wave WILL toast your device.
|Cable type||Jacket color||Core diameter||Distance|
|Single Mode (long wave)||Yellow||9 microns||2 m - 10 km|
|Multimode (short wave)||Orange||50 microns||2 m - 500 m|
|Multimode (short wave)||Orange||62.5 microns||2 m - 300 m|
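To make the distance limits in the table easier to apply, here is a small Python sketch that lists which cable types cover a given run length. The distance figures come straight from the table; the function name and structure are our own illustration. It only checks distance, so you still need to confirm that the optics in your devices match the cable type you pick.

```python
# Distance limits in metres, taken from the cable table above.
CABLES = [
    ("multimode 62.5 micron (short wave)", 2, 300),
    ("multimode 50 micron (short wave)", 2, 500),
    ("single mode 9 micron (long wave)", 2, 10_000),
]


def cables_for_distance(metres):
    """Return the cable types whose supported range covers the run length."""
    return [name for name, shortest, longest in CABLES
            if shortest <= metres <= longest]


for run in (150, 400, 2_000):
    print(f"{run} m: {cables_for_distance(run)}")
```

A 400 m run, for example, rules out the 62.5 micron multimode cable but can still be served by 50 micron multimode or single mode cable.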
For our demonstration, we used a 62.5 micron multimode cable with ST connectors (see image on right below). Keep in mind that there are two primary types of fibre connectors: ST and SC. The ST connector is quickly gaining acceptance from product vendors, but many SC connections still exist. Be aware of your device connection types before making your cable purchase. Many devices allow you to purchase hot-swappable GBICs to change the connection, or you can also purchase hybrid cables (see below left) to accommodate both connection types. Each type of connector is keyed to make proper insertion a snap.
Left: Multi-mode fibre cable with both SC and ST connections.
Right: Multi-mode fibre cable with ST connectors only.
Note in the cable on the left the ST connector is on top, and the SC connector is on the bottom.
With the physical connection between our server and our storage device (see below) established, we need to define the network architecture for our HBA in order to establish a link. This step varies according to HBA vendor; vendors provide an assortment of Windows, DOS, or BIOS-level applications to perform architecture configuration commands. For our demonstration, Emulex provides a handy Windows application.
|Fibre Channel RAID and Tape Library units can be shared on a SAN.|
Overland Data Inc.'s LXN2000 tape library (left image above) has an optional Fibre Channel connector that we used to hook into our SAN. Hitachi Data Systems' Thunder storage array (above right image) has Fibre Channel ports and it provides the shared storage for our SAN.
Configuration of Shared Storage
With our Fibre Channel network topology now defined, we are ready to configure the disks. As Windows 2000 loads, our HBA driver starts the storage virtualization process, which "fools" the operating system into thinking our fibre connected storage device is actually a directly connected SCSI device. With this deception, we are now able to configure the storage using standard Windows 2000 disk management utilities.
Right-click My Computer / Manage / Storage / Disk Management. The storage device appears as a new disk that requires a disk signature, or needs to be imported to the system, depending on the device's previous disposition. The Windows wizards will walk you through either step. It's important to note that in heterogeneous SANs, Disk Management will not recognize the file systems of other operating systems. Writing a disk signature to these shared disks will ruin the data already stored on them. (See section on Zoning and LUN masking below)
In order to use the new SAN disks, we had to use Disk Management to format them (right-click on the unallocated disks and choose the format option). This method will allow you to add Fibre Channel volumes to your server, but it's important to note that you will need volume management software like PowerQuest's PartitionMagic or Veritas Volume Manager to increase, decrease, or merge multiple volumes.
Our point-to-point SAN is now ready for use.
Configuration of Fibre Channel Storage Networks
The vast majority of current Fibre Channel implementations are Direct Attached Storage implementations like the one we set up in our tutorial above. While these are suitable for high performance storage for a single server, the biggest benefit of SANs is their ability to centralize storage into a shareable pool which all servers can utilize.
The biggest obstacle to centralized storage is not Fibre Channel or the software created to manage it--the problems come from server operating systems. Since operating systems like Microsoft's Windows 2000 and Sun Microsystems' Solaris aren't designed to share storage with other servers, SAN hardware and software need to be used to ensure that servers only have the ability to use assigned SAN storage resources.
Currently there are three major methods for controlling hosts and storage resources in a SAN:
- LUN Masking
- Storage Virtualization
- Switch-based SAN Zoning
Switch Zoning is the oldest and least innovative technique for SAN partitioning. Switch zoning narrows down the traffic running through a storage-networking device so that specific ports on the switch or hub can only see other specific ports. For example, on a 4-port switch with 2 servers and 2 RAID storage units attached to it, by creating 2 zones within the switch (one zone with Server A and RAID A and a second zone with Server B and RAID B) we can force a server to use one of the assigned storage units. As a result, while the physical topology of this example looks like a star (with the Fibre Channel switch in the center) since we enabled zoning, the logical representation seen by the components is 2 separate networks.
While zoning is a simple way to divvy up storage resources, it does not take advantage of all the abilities of a SAN, and it is inefficient. For example, if you were to invest in an expensive RAID unit with 2 ports, you would only be able to hook this RAID up to two servers, since the switch ports are dedicated to one device only.
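The effect of zoning in the 4-port example can be modelled in a few lines of Python. This is only a conceptual sketch of the visibility rule (two devices can see each other only if they share a zone); the device and zone names are illustrative and it does not reflect any switch vendor's configuration format.

```python
# Illustrative zone table mirroring the 4-port example above:
# each zone is simply the set of devices that may see one another.
zones = {
    "zone_a": {"server_a", "raid_a"},
    "zone_b": {"server_b", "raid_b"},
}


def can_see(zones, first, second):
    """Two devices can communicate only if some zone contains both."""
    return any(first in members and second in members
               for members in zones.values())


def visible_from(zones, device):
    """Return every other device that shares at least one zone with `device`."""
    seen = set()
    for members in zones.values():
        if device in members:
            seen |= members
    seen.discard(device)
    return seen


print(can_see(zones, "server_a", "raid_a"))  # same zone
print(can_see(zones, "server_a", "raid_b"))  # different zones
print(visible_from(zones, "server_b"))
```

Although all four devices plug into the same switch, each server in this model ends up seeing only its own RAID unit, which is the "two separate networks" view described above.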
Implementing SAN Zoning
SAN Zoning is relatively easy to configure on most Fibre Channel switches. In the case of our Qlogic Corp. SANbox 2 Fibre Channel switch, we used their SANsurfer 2 management software to set up our zones.
Using SANsurfer 2's zone management interface we created a zone set for our switch.
After creating the zone set, we created two SAN zones within the set to accommodate two network segments.
Once the zones were configured, the process of assigning servers and RAID devices to specific zones was a simple drag and drop operation. We created a zone for each of our servers and assigned each of them a port from our two-port Hitachi RAID unit.
LUN masking, like zoning, prevents servers from seeing all but specific storage resources, but it is more efficient and granular since it can be used to control LUNs within a storage device. LUNs are either individual disks, groups of disks, or individual parts of multiple disks defined by a RAID controller. LUNs, which most people commonly refer to as partitions or logical disks, are granular storage entities that are carved out of a single storage system (be it a RAID or JBOD or even a tape library). Because multiple LUNs can reside on a single storage system, multiple computers can access the LUNs through a single wire connection to a storage system with LUN masking, a situation that is far more scalable than zoning, which is hampered by its 1:1 setup (one port: one connection).
With LUN Masking you can use a single Fibre Channel link to split up a RAID unit into multiple logical parts.
LUN Masking, which is usually carried out by intelligent Fibre Channel RAID controllers, ensures that the host operating systems can only see the LUNs that have been explicitly assigned to them.
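Conceptually, a LUN mask is a table that maps each host to the LUNs it has been explicitly assigned. The Python sketch below models that rule and adds a check for LUNs assigned to more than one host. The host names and LUN numbers are hypothetical, and a real RAID controller would identify hosts and apply its masks through its own management interface rather than anything like this code.

```python
# Hypothetical mask table: host identifier -> LUNs explicitly assigned to it.
masks = {
    "host_a": {0, 1},
    "host_b": {2},
}

# LUNs that exist on the (imaginary) storage system.
ALL_LUNS = {0, 1, 2, 3}


def visible_luns(masks, host, all_luns):
    """Return the LUNs a host may see; hosts without a mask entry see nothing."""
    return sorted(masks.get(host, set()) & all_luns)


def shared_luns(masks):
    """Return LUNs assigned to more than one host.

    Operating systems that are not designed to share storage can corrupt
    data on such a LUN, so any entry here deserves a second look.
    """
    owners = {}
    for host, luns in masks.items():
        for lun in luns:
            owners.setdefault(lun, []).append(host)
    return {lun: sorted(hosts) for lun, hosts in owners.items()
            if len(hosts) > 1}


print(visible_luns(masks, "host_a", ALL_LUNS))
print(visible_luns(masks, "host_c", ALL_LUNS))
print(shared_luns(masks))
```

Note that both hosts here could sit behind the same physical port on the storage system, which is what makes masking more scalable than port-based zoning.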
LUN masking can be implemented quickly (in a matter of seconds), but it is important to note that for most RAID units like MTI's S200, the alteration of LUN masking settings usually requires a little bit of downtime as the controller reapplies its masks. As a common sense practice, you should never switch LUN masks while an application running on a server is still using the data on a LUN--even if the application seems to be idle. The easiest way to avoid data corruption is by scheduling downtime before doing any major activities.
LUN masking works well in small isolated SANs, but as the number of hosts and targets escalates, the masks can become somewhat unwieldy.
Storage Virtualization is the newest of the three management technologies and is far and away the most efficient and reliable technology. In a storage virtualization implementation, a storage controller sits between a pool of centralized storage and the servers that have storage needs. With virtualization, IT managers can share storage resources to all servers regardless of storage hardware type (direct attached SCSI and IDE storage can be shared, as well as large Fibre Channel RAID units) or physical location (Fibre Channel links can run 10 km in length).
The most common implementation uses a specialized server running storage virtualization software acting as the gateway between the storage and the servers (see diagram). Two solutions that worked well in our in-house tests are FalconStor's IPstor and DataCore's SANsymphony virtualization software packages.
To make storage virtualization work, the SAN is configured to have two zones: a zone for the servers and a second zone for the storage. To communicate to both sides, the storage controller (i.e. the server with the virtualization software) runs two Fibre Channel HBAs (one dedicated to each zone) and the software routes the traffic from one zone out to the other.
Since the servers and storage are in different zones there is no danger of servers or users accidentally using and corrupting the data on the shared storage. On the server side, a specialized device driver, allowing the server to communicate with the storage controller, needs to be installed.
Once the software and hardware are configured, the storage controller will be able to distribute LUNs out to the servers easily through a central management console. For mission-critical sites, it's extremely important to set up redundant storage controllers, since an outage at the storage controller will cripple all of the servers that rely on the SAN.
Since storage controllers are essentially acting like RAID controllers in a storage virtualization scenario, by beefing up a storage controller with RAM and additional processors, it's possible to boost the overall performance of a SAN by doing some caching on the controllers.
Performance Testing a SAN
The best benchmarks are always the ones you run yourself using your own application to generate load. When it is not practical to benchmark this way, the next best alternative is to use a widely used benchmark suite that is accepted both by the vendor and user communities. In the world of storage, Intel's older Iometer tool is the most well known storage system benchmark around.
Iometer is freely downloadable from Intel's Web site, and since many vendors publish numbers using this benchmark, it's possible to get some comparative data between different storage systems. For the most part, Iometer is very easy to use and since it doesn't require a large test bed to set up, it is a good tool to have around the lab.
Although Iometer doesn't use real application data to create its load, using the benchmark controls it's easy to create test mixes to simulate your application needs.
Using Iometer you should create a test suite that behaves similarly to your applications. For example, if you are trying to support a streaming media application, you should configure a test suite where most requests are sequentially reading data from a storage system. Iometer has the ability to approximate both the percentage of read and write transactions and the percentage of random to sequential requests.
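To illustrate what those two percentages mean, the following Python sketch generates a synthetic list of requests from a read percentage and a random percentage. It is not Iometer and does not use Iometer's file formats or settings; the function name, parameters, and the 64 KB request size are assumptions chosen for the example, and no actual I/O is performed.

```python
import random


def make_requests(count, read_pct, random_pct, request_size, span_bytes, seed=0):
    """Build a synthetic list of (operation, offset, size) tuples.

    read_pct   -- percentage of requests that are reads (the rest are writes)
    random_pct -- percentage of requests that jump to a random offset
                  (the rest continue sequentially from the previous request)
    """
    rng = random.Random(seed)
    blocks = span_bytes // request_size
    requests = []
    next_offset = 0
    for _ in range(count):
        operation = "read" if rng.random() * 100 < read_pct else "write"
        if rng.random() * 100 < random_pct:
            offset = rng.randrange(blocks) * request_size
        else:
            offset = next_offset
        # Wrap around at the end of the test span.
        next_offset = (offset + request_size) % (blocks * request_size)
        requests.append((operation, offset, request_size))
    return requests


# A streaming-media style mix: all reads, fully sequential.
streaming = make_requests(1000, read_pct=100, random_pct=0,
                          request_size=64 * 1024, span_bytes=1024 ** 3)

# A hypothetical database-style mix: mostly reads, fully random.
database = make_requests(1000, read_pct=67, random_pct=100,
                         request_size=8 * 1024, span_bytes=1024 ** 3)
```

The streaming list walks through the test span in order, while the database list scatters small requests across it, which is the difference you would be dialing in with the benchmark's controls.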
In most cases, to max out the throughput of a SAN storage system (to stress the storage devices and the network components together), we typically run a sequential read test with a large request size (around 1 MB). Unfortunately, there are very few applications in which a large sequential read is the only type of transaction required. Usually when vendors publish performance numbers on their subsystems, the large request sequential read test is the number they publish, but buyers should be careful to check the testing disclosure of a vendor test before accepting their numbers at face value.
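The reason large sequential requests produce the most flattering throughput figure comes down to simple arithmetic: throughput is the number of requests completed per second multiplied by the request size. The short Python sketch below works through two cases. The request rates are made-up illustrative values, not measurements from our tests.

```python
MIB = 1024 * 1024


def throughput_mib_s(requests_per_second, request_bytes):
    """Throughput in MiB/s = requests per second x bytes per request / 2**20."""
    return requests_per_second * request_bytes / MIB


# Illustrative numbers only: a handful of 1 MB requests per second moves
# more data than thousands of small requests.
large_sequential = throughput_mib_s(90, 1 * MIB)
small_random = throughput_mib_s(10_000, 4096)

print(f"90 requests/s at 1 MiB:     {large_sequential:.1f} MiB/s")
print(f"10,000 requests/s at 4 KiB: {small_random:.1f} MiB/s")
```

When you read a vendor's throughput figure, it helps to convert it back to a request rate at your own application's typical request size before comparing it with your needs.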
Included with Iometer are sample test suites to simulate database server, streaming media, webserver, and file server traffic. These suites are good baseline tests to run before formulating your own suites. Now if only Intel would keep updating the test to keep up with improvements in storage technologies.
While SAN implementation is still not an easy task, key tasks like storage centralization and disaster recovery can be greatly enhanced through the use of Fibre Channel SANs. Although we may be a few years away from having plug and play SANs such as those we might want to use in our homes in addition to our businesses, in many cases the inconveniences created by being on the cutting edge of technology are far outweighed by the rewards of implementation.