By: Frank Ohlhorst
Many IT professionals are wondering if the exponential growth of disk storage and the associated management chores will ever slow down. Regrettably, for those charged with the day-to-day administration of enterprise storage, the answer is likely no.
Several factors are forcing the growth of storage pools, ranging from basic drivers, such as robust applications, to more complex factors, such as “big data” analytics, compliance requirements and disaster recovery. Simply put, storage is growing with no end in sight, and management is becoming more complex and time-consuming. That all adds up to higher costs, reduced efficiencies and longer backup windows.
F5 Networks thinks it has an answer to that dilemma (and several others) in the form of its appliance-based ARX storage virtualization platform, which focuses on abstracting physical file storage from its native, closed management tools and virtualizing it into a centrally managed, easy-to-control layer. With an entry price of around $30,000, ARX combines several capabilities that make it a welcome addition to any large enterprise network that is struggling with storage issues.
Of course, F5 Networks isn’t the only player in the storage virtualization game. Vendors ranging from FalconStor to Red Hat to even Microsoft (with the forthcoming Windows Server 8) have products that fall under the storage virtualization umbrella. However, what makes F5’s product really interesting is that ARX does all this without requiring any changes to the original storage devices or management software. It leaves well enough alone and raises virtualized storage to its own layer, which can be centrally managed with ease.
The real trick here is that the ARX works as an intelligent proxy, converting abstracted, generalized storage access requests into something that a native storage solution can understand. Operating like a file storage “router,” the ARX is able to provide universal access to heterogeneous storage without impacting performance or forcing changes to the underlying storage technologies. What’s more, ARX brings additional functionality to existing storage solutions by allowing the creation of dynamic pools, storage tiers and so on.
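To make the proxy idea more concrete, here is a minimal, purely illustrative sketch in Python (this is not F5 code; the class and its methods are invented for illustration). The appliance keeps an index that maps each virtual path to the physical location that actually holds the file, and every client request is resolved through that index:

from pathlib import Path

class FileRouter:
    """Toy stand-in for a file storage 'router' acting as an intelligent proxy."""

    def __init__(self):
        # Virtual path -> physical location; the appliance keeps this current.
        self.index = {}

    def register(self, virtual_path, physical_path):
        self.index[virtual_path] = Path(physical_path)

    def read(self, virtual_path):
        # The client only ever sees the virtual path; the proxy looks up the
        # physical location and serves the bytes from whichever device holds them.
        return self.index[virtual_path].read_bytes()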
Taking a Closer Look at ARX
The F5 ARX Series of appliances is designed with one key goal in mind: making enterprise storage easier to manage, easier to provision, easier to secure and, most importantly, easier to use. The ARX Series is available as four different physical rack-mount appliances and the ARX Virtual Edition, a virtual appliance. The five solutions offer the same functionality but differ in performance and scale.
The ARX1500 is the entry-level physical appliance in the series and is designed for 3.2Gbps of throughput, 3,000 users and 768 million files. Each of those specifications matters when sizing a file virtualization appliance. Because the appliance abstracts storage hardware and acts as a proxy for all file access, file counts and user counts are critical sizing factors, as is throughput, so adopters should choose carefully and account for future network growth and file storage needs.
The big iron in the ARX series comes in the form of the ARX4000, which is rated for 12Gbps of throughput, can handle as many as 12,000 users and tops out at 2 billion files. The 4U rack-mounted unit also sports dual redundant, hot-swappable power supplies, twelve 10/100/1000Mb Ethernet ports and a pair of 10Gb X2 (MM-SC) Ethernet ports.
I was able to perform hands-on testing of an ARX2500 appliance at F5’s Lowell, Mass., location. The ARX2500, which began shipping in July, is rated for 8Gbps of throughput, 6,000 users and 1.5 billion files. It also supports GbE, although this specific device was attached via 10GbE to a storage environment that consisted of several flavors of file storage, including NetApp, EMC VNX and Windows 2008 file servers.
Setting up the ARX
Physical setup of the device consisted of little more than mounting it in the rack and connecting the appropriate power and Ethernet cables. The device is designed to support several deployment modes with single or multiple VLAN options, while providing both in-band and out-of-band management capabilities. When inserting the device into the network, it is important to make sure all your cables, subnets and ports are configured properly, ensuring that the device can reach all of the storage available on the network.
Initial setup of the device is straightforward. A browser-based GUI is used to launch the setup wizard, which steps an administrator through the basic configuration of the device. One of the first things that needs to be done is licensing the unit. Internet connectivity makes that task a lot easier, so setting up the appropriate networking settings, as well as management IP addresses, takes priority. Of course, that would take place after naming the device, configuring the active ports and assigning virtual IP (VIP) addresses. These VIP addresses prove critical for using the device, as we’ll see later. There is room for improvement with the setup wizard, which F5 says it has improved in an upcoming release of its operating system software. However, any capable network engineer should be able to work through it trouble-free as it stands now.
After initial setup and licensing are completed, provisioning virtual storage becomes the next goal of the setup process. Similar to provisioning storage on a traditional file server, one starts out by defining a “namespace”: essentially, the collection of CIFS shares and NFS exports through which users will access their files. However, the ARX’s namespace is slightly different, in that it comprises storage from multiple storage devices, as we’ll see in a little bit. But before we can do that, we have to define a few parameters. Namespaces on the ARX can work with CIFS, NFS (and its variants) or both protocols at once, depending on what storage protocols you have already implemented on the network. The same can be said for security; you can choose from Kerberos, NTLM and NTLMv2. Other settings at this juncture include the proxy user account. (With CIFS, that is usually an Active Directory account with the appropriate rights so that the ARX can access storage.)
After a namespace is created, it is time to populate it. At a high level, a namespace is just a collection of virtual file systems that are grouped into containers called volumes. Think of a volume as comprising all the shares in your home directories, group shares or application workspace, for example. The idea is to construct a virtual file system that federates multiple physical file systems on the various storage devices behind the ARX. The ARX device builds an index of the physical file systems, which in turn is used to create the pointers necessary for the device to act as a proxy to the physical files, while representing those files virtually to the end user.
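As a rough illustration of that federation (again a hypothetical Python sketch, not the ARX implementation), a virtual volume can be modeled as an index built across several physical directories, with listings and opens served through that index:

from pathlib import Path

class VirtualVolume:
    def __init__(self, physical_shares):
        # Each entry stands in for a file system on a different storage device.
        self.shares = [Path(p) for p in physical_shares]
        self.index = {}  # relative virtual path -> physical Path
        self.rebuild_index()

    def rebuild_index(self):
        self.index.clear()
        for share in self.shares:
            for f in share.rglob("*"):
                if f.is_file():
                    self.index[str(f.relative_to(share))] = f

    def listdir(self):
        # Users see one flat namespace regardless of which device holds a file.
        return sorted(self.index)

    def open(self, virtual_path):
        return self.index[virtual_path].open("rb")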
After you have defined the volumes, the next step is to define what F5 calls “shares.” For those already familiar with networked file systems, a share on an ARX is essentially either a CIFS share or an NFS export. As the name implies, shares publish the virtual volume content to the users of storage, allowing them to access their data via the ARX proxy. These shares exist under a virtual service that can export one or more of the virtual volumes to provide seamless access to data that has been virtualized by ARX.
While the process of creating a namespace sounds complicated, the GUI makes it straightforward and provides ample help. It took me only a few minutes to set up a virtual volume and share it so that it was available to users. That is the magic provided by the ARX’s wizard-based setups. Of course, a lot goes on behind the scenes to make virtual volumes work properly, but the ARX appliance handles all of that heavy lifting; IT managers just need to provide the appropriate information. Volume creation is probably the one area where the most care should be taken. However, if you do make a mistake, it is comforting to know that the ARX does not change anything on the physical storage device, allowing you to recreate virtual volumes, virtual file systems and most anything else with ease.
At this point, it’s worth noting how one would actually deploy the ARX appliance. For this test, I created a namespace from scratch in a test environment. It was easy enough; however, most people already have a storage environment with existing users, file data, file shares and storage devices. They will be happy to know that the ARX provides an option that minimizes deployment headaches, which F5 refers to as a “namespace takeover.” Here, the ARX appliance actually takes over all of the identifying details of the storage environment it’s virtualizing, from storage device IP addresses and fully qualified domain names (FQDN) to the individual share names. The ARX can provide a virtual IP address for each device it takes over, so the environment after virtualization appears identical to the one before.
Creating Policies on the ARX
Naturally, ARX does a lot more than just provision virtual storage for use by end users. The ARX platform also provides additional capabilities, ranging from simplified storage migration to intelligent storage tiering to extensive management tools and policy controls. According to F5, this is where the real value of the solution lies. I had the chance to play around with some of the policies. Although I was obviously in a small test environment, even this limited experience was intriguing.
Similar to namespaces, creating a policy on the ARX is a relatively straightforward affair using a wizard-based setup. The ARX applies policies at the volume level to control the movement of files between the various physical file systems within each volume. I’ll talk about how this plays out in each of the ARX’s primary use cases below.
The first use case that F5 talks about is data migration: essentially, the movement of files from one physical file system to another. I performed a simple test that simulated a migration between two storage devices. I started with an ARX volume presented over the network as a CIFS share and comprising a single physical file system on a Windows file server. After creating some files in the CIFS share, I verified that they were in fact created on the Windows server. I then used the ARX’s GUI to provision a second physical file system into the volume, this time from a NetApp device. In the policy wizard, I selected the Windows device as the source and the NetApp device as the destination, initiated the migration and watched as the files simply appeared on the NetApp device. During the entire process, the files were visible to the user in the CIFS share, regardless of whether they were physically located on the Windows or NetApp device.
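Under the same illustrative assumptions as the earlier sketch (a hypothetical VirtualVolume whose shares are pathlib.Path directories), a migration policy amounts to moving files from one physical share to another while their virtual paths stay exactly where they were:

import shutil

def migrate(volume, source_share, dest_share):
    # Move every file that physically lives on the source share to the
    # destination share; the virtual path the user sees never changes.
    for vpath, phys in list(volume.index.items()):
        if phys.is_relative_to(source_share):
            target = dest_share / vpath
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(phys), target)
            volume.index[vpath] = target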
The next use case was storage tiering, a pretty common topic these days. The ARX tiers files in an innovative way, making the best use of its design as an intelligent proxy. For this test, I used the NetApp as “Tier 1” and Windows as “Tier 2” and created a policy to automatically move files from the NetApp to the Windows device over time. For the first test, I scheduled a policy at one-minute intervals to move files unmodified in that interval to Tier 2. I then created some files in the CIFS share and watched them get created on Tier 1. After about a minute, they were moved down to Tier 2, as expected. The second tiering test involved what F5 calls a “placement” policy. Here, files are placed on the designated tier as they are created, instead of being moved there after a period of time. Using a common example, I created a policy that automatically placed all MP3 files on Tier 2. Next, I created an MP3 file on my desktop and copied it to the CIFS share, watching it get placed on Tier 2. Files in both tests were always visible in the CIFS share, regardless of whether they were physically located on Tier 1 or Tier 2.
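The two tiering behaviors I tested can be sketched the same way (hypothetical helpers built on the earlier VirtualVolume sketch, not the ARX policy engine): an age-based rule demotes files that have sat unmodified on Tier 1, and a placement rule decides the tier at creation time based on file type:

import shutil
import time
from pathlib import Path

def age_policy(volume, tier1, tier2, max_idle_seconds):
    # Demote files that have not been modified within the configured interval.
    now = time.time()
    for vpath, phys in list(volume.index.items()):
        if phys.is_relative_to(tier1) and now - phys.stat().st_mtime > max_idle_seconds:
            target = tier2 / vpath
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(phys), target)
            volume.index[vpath] = target

def placement_tier(filename, tier1, tier2):
    # Placement policy: MP3 files land on the lower tier as soon as they are created.
    return tier2 if Path(filename).suffix.lower() == ".mp3" else tier1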
In the final test, I created a capacity-balancing policy to balance utilization across several file systems. Here, I created several 500GB physical file systems across both a NetApp and an EMC device. Then, I created a new ARX volume comprising these file systems and presented it on the network as a CIFS share. As expected, checking the properties of the CIFS share on my PC showed the aggregate storage capacity of the physical file systems. The next step was to create the capacity-balancing policy to balance new file creation. Then, I created several files, watching as the ARX automatically placed one file in each of the physical file systems. F5 says that this is a growing use case for its customers, as applications become more and more data-intensive. Some applications require very large file systems beyond what a single device can provide. This capacity-balancing capability allows you to construct a very large virtual file system comprising capacity from multiple physical file systems or storage devices behind it.
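A capacity-balancing rule can be sketched just as simply (again hypothetical, not F5's algorithm): new files go to whichever physical share currently has the most free space, while the virtual volume reports the aggregate capacity of all of them:

import shutil

def pick_share(shares):
    # Place the next new file on the share with the most free space.
    return max(shares, key=lambda s: shutil.disk_usage(s).free)

def aggregate_capacity(shares):
    # Roughly what a user sees when checking the properties of the virtual CIFS share.
    return sum(shutil.disk_usage(s).total for s in shares)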
Data Protection
While my testing didn’t simulate disaster recovery in the test environment, F5 did point out a couple of interesting items regarding data protection. The first was backup. F5 says that the storage tiering and capacity-balancing policies can actually help customers reduce the amount of time required to perform a full backup of their data. For example, a “last modified” tiering policy essentially separates changing and unchanging data onto different physical file systems. You could back up those physical file systems at different intervals: changing data on a weekly basis, and unchanging data less often. And if you have multiple physical file systems (as in the capacity-balancing test), you can back up each of them in much less time than a backup of the entire virtual file system would take.
The second item was what F5 calls “virtual snapshots.” Many enterprise IT organizations have come to depend on the snapshot capabilities of their NAS systems for disk-based backup and recovery. But what happens when you virtualize those NAS systems? F5 took me through the ARX virtual snapshot capability, which essentially does for snapshots what the namespace does for file systems. To test virtual snapshots, we extended the setup left over from the capacity-balancing test with a third file system from a Windows file server.
We then created a snapshot rule on the ARX for the virtual volume. The snapshot rule is what tells the underlying storage devices (in this case the NetApp, EMC and Windows devices) to take physical snapshots of their file systems on a specific schedule. After the first snapshot completed, we went to the snapshot directory and verified that we could see all of our files in the same directory, despite the fact that they resided on different physical file systems and, hence, in different physical snapshot images on three devices.
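Conceptually, the virtual snapshot behaves like the sketch below (hypothetical, reusing the earlier VirtualVolume sketch and using directory copies as a stand-in for each device's native snapshot): the rule triggers a snapshot on every backing device, and the appliance presents the union of those per-device snapshots as one directory:

import shutil
from pathlib import Path

def take_virtual_snapshot(volume, snapshot_root, name):
    # Stand-in for asking each underlying device to take its own physical snapshot.
    for i, share in enumerate(volume.shares):
        shutil.copytree(share, Path(snapshot_root) / name / f"device{i}")

def list_virtual_snapshot(snapshot_root, name):
    # The user sees one merged snapshot directory, just like the live volume.
    files = set()
    for device_dir in (Path(snapshot_root) / name).iterdir():
        files |= {str(f.relative_to(device_dir)) for f in device_dir.rglob("*") if f.is_file()}
    return sorted(files)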
Conclusions
While the F5 ARX series proves to be a significant investment, the truth is that an ARX device reinvents the organization’s relationship with storage. The new storage paradigm offered by ARX reduces management overhead, increases flexibility and makes it easy to build storage pools that can be reshaped instantly to meet the elastic needs of any organization. What’s more, ARX leverages existing storage solutions, which may preempt the need to buy more storage. Also notable is that ARX does not “get in the way” of performance: the device handles its processing at line speed, so network performance is not affected.