BOSTON - “The dogs will bark, but the caravan rolls on,” CIO Magazine Group Publisher Gary Beach said in his Tuesday keynote address at the inaugural GridWorld conference here.
Well, yes, the dogs are indeed barking at grid computing, but actually, most dogs I know serve as decent sentries when their humans are about to go get eaten alive by cannibals. Or, you know, software vendors or service providers—whatever.
In this, the inaugural show of what seems to be a mini-rash of oh-yes-indeedy-grids-sure-are-ready-for-the-enterprise conferences, Beach was addressing the business case for why leading IT organizations are adopting grid.
The barking dogs, in this case, are all those people who keep saying “Yikes, too scary!” to grid.
And there are, still, many enterprises saying yikes to grid. CIO Magazine did a survey in September asking whether enterprises intended to implement grid in the next 12 months. A somewhat impressive one in five said yes, they would.
Now, that compares pretty favorably to the 1 percent who said yes to grid one year ago, according to CIO Magazine's surveys.
But judging by the now-familiar litany of technology roadblocks iterated by show-goers and panelists, it sounds like there are still plenty of reasons to bark.
The challenges: First, grid costs money. Second, it's difficult to install. Third, security is a big question mark when you're taking sensitive data and spreading it over PCs like so much privacy-threatening margarine.
Fourth, standards work has improved to a degree, but some standards are still missing, and others have groups working at cross-purposes. Fifth, code modification headaches. Sixth, having enough IT staff in these lean, mean times. Seventh, it's difficult to install. Eighth, it's really hard to install. And did I mention it's hard to install?
Vendors are, of course, scrambling over themselves to ease this baggage off your weary back.
IBM, for one, in August announced at LinuxWorld what it's calling the Grid and Grow package. It's a grid starter kit of sorts that bundles its eServer BladeCenter blade server hardware with software and services.
The starting point is a $49,000 package that includes one BladeCenter chassis and seven blade servers, along with grid scheduler software for managing jobs and services to help plan, install and test the bundle.
Depending on the type of workload and the industry, IBM will offer Altair Engineering Inc.'s PBS Professional, DataSynapse Inc.'s GridServer, Platform Computing Inc.'s Platform LSF or IBM's LoadLeveler software as its scheduling software.
The blades themselves will run Linux operating systems from either Red Hat Inc. or Novell Inc.'s SuSE. They are also available with Microsoft Corp.'s Windows or IBM's own AIX 5L.
IBM had other news coming out of the conference, including an agreement to partner with Univa Corp. to deliver commercially supported grid software from Globus. Univa's going to deliver a commercially supported, enterprise-ready release of the open-standard software built around the Globus Toolkit for use across IBM eServer platforms running both AIX and Linux.
Big Blue is also working to fertilize the ecosystem. IBM announced with Absoft Corp. on Tuesday a new software developer's kit designed specifically to work with Grid and Grow hardware and services.
Also, SAS Institute Inc., the first major vendor to join the Grid and Grow program, on Tuesday announced new grid computing capabilities in its SAS 9 BI (Business Intelligence) software.
The new automated grid management capabilities in the data mining and data integration applications will help users allocate compute-intensive work more easily, cutting data processing time so that more data gets integrated and analyzed in less time.
Ken King, IBM's vice president of grid computing, said Grid and Grow is essentially IBM's recognition that a good 50 percent of the market really hasn't the foggiest idea what grid is or how to use it.
“Those who do are scared to spend money,” he said.
So the $49,000 price point not only targets departments in large industry accounts; it also targets the midmarket.
That's a fine approach, but judging from a casual sampling gathered by walking the (very small) exhibit hall and talking to some show-goers, I'd say that when we're talking about grid, the overwhelming majority of users still belong to the HPC (high-performance computing) camps of scientific and academic research.
In that camp, one typical type of user is exemplified by an application architect from the University of Pittsburgh who was mulling over the Grid and Grow sales spiel in the exhibit hall. He said he's working on creating a grid for various academic purposes, including, of course, supporting research for typical grid stuff: genomics, protein research, etc.
He's amassing information, but the architect really plans to do things on his own, using both post-docs and academic computing wizards as human resource material.
Sorry, vendors. With that type of user—and as always, there still seems to be more of them than other types in the world of grid users—kiss the potential to sell services goodbye.
Also, he's looking at using open-source code, a la Globus Toolkit, since he's got supercomputing experts on hand and no lack of skills to tackle the care and feeding of open-source software.
But enterprises can certainly adopt grid without such a lineup of skills. On one panel that covered “The First Steps Toward Grid Adoption,” much assurance was given that you don't have to have experts or a ton of expertise on hand before embarking on a grid project, because believe you me, you will learn as you go.
Do-It-Yourself Grid Computing
Eric Bremmer, a professor at Children's Memorial Hospital at Northwestern University, described grid-enabling the data and text mining behind the systems used to develop a biology knowledge base.
His is a small research organization attached to the hospital, but he needed to integrate some 150,000 articles from five years' worth of 20 medical journals.
The problem was that more than 24 hours were needed to process about 5,000 articles on a single desktop computer. Another problem is that this stuff is time-sensitive: Scientific literature goes out of date rapidly, so you're only as good as your last update.
With the help of United Devices Inc., Bremmer got a small grid up and running. He said he's managed to decrease analysis time, going from 5,000 articles in 24 hours to about 100,000 articles in 24 hours.
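To put those numbers in perspective, here is a back-of-the-envelope sketch in Python. The article counts are the ones Bremmer quoted; the assumption that each rate holds steady for the whole 150,000-article backlog is mine, as are the variable and function names.

```python
# Figures quoted above; treating them as sustained daily rates is an assumption.
ARTICLES_TOTAL = 150_000   # five years' worth of 20 medical journals
DESKTOP_PER_DAY = 5_000    # single desktop computer, per 24 hours
GRID_PER_DAY = 100_000     # small grid, per 24 hours


def days_to_process(total_articles: int, articles_per_day: int) -> float:
    """Days needed to clear the backlog at a constant daily rate."""
    return total_articles / articles_per_day


speedup = GRID_PER_DAY / DESKTOP_PER_DAY
desktop_days = days_to_process(ARTICLES_TOTAL, DESKTOP_PER_DAY)
grid_days = days_to_process(ARTICLES_TOTAL, GRID_PER_DAY)

print(f"speedup: {speedup:.0f}x")
print(f"desktop: {desktop_days:.1f} days, grid: {grid_days:.1f} days")
```

On those assumptions, the full corpus goes from roughly a month of desktop time to a day and a half on the grid, a 20-fold improvement.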
He's using the same computers that are either used by administrative staff or put to work on other research projects during the day, running the grid analysis work from 7 p.m. to 7 a.m. so as to stay out of administrators' hair.
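An overnight window like that one wraps past midnight, which is the only slightly tricky part of checking it. The following is a minimal Python sketch of such a check, not a description of how United Devices' software actually schedules work; the function name and constants are hypothetical, and only the 7 p.m. and 7 a.m. boundaries come from Bremmer's account.

```python
from datetime import time

# Window boundaries from the article; everything else here is illustrative.
WINDOW_START = time(19, 0)  # 7 p.m.
WINDOW_END = time(7, 0)     # 7 a.m.


def in_grid_window(now: time,
                   start: time = WINDOW_START,
                   end: time = WINDOW_END) -> bool:
    """Return True if `now` falls in [start, end), allowing the window
    to wrap past midnight (as a 7 p.m. to 7 a.m. window does)."""
    if start <= end:
        # Ordinary same-day window, e.g. 09:00 to 17:00.
        return start <= now < end
    # Wrapping window: late evening OR early morning.
    return now >= start or now < end
```

A desktop agent could call something like `in_grid_window(datetime.now().time())` before accepting grid work, so that daytime users never notice the machine is moonlighting.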
Turning to a service provider is the only recourse for somebody like Bremmer, who has two research assistant professors and some post-docs, all of them medical types and none of them computer science types.
Another thing you need to worry about is whether you're in a regulated environment, he said. “We've tended to move toward commercial software because we need to have it FDA regulations-capable. And most open-source software is not because you can't lock it down, by nature.”
So you get some service providers in-house. But what sort of skills do you need to lead the project?
Wolfgang Gentzsch, a member of the GGF steering committee, a coordinator at D-Grid and a visiting scientist at the Renaissance Computing Institute at UNC/Chapel Hill, said you have to know enough to define the different steps of the project; to watch over the service provider, who will be bringing in various parts of the project; to measure the project's success; and to report back to the next level of management.
But, Bremmer said, the training you'll receive in the process is “remarkable.”
What are some of the pitfalls companies get into when they do grid themselves, without service providers?
DataSynapse's Director of Business Development Dave Maples said his company often gets called into projects where somebody has attempted to build some kind of clustering or load-balancing project that they then proceeded to outgrow.
That means the grid won't scale, it didn't have enough power, and/or it didn't do resource identification very well.
Bremmer pointed out another issue: In a research lab environment, the biggest problem is when a post-doc or somebody leaves. “To have a tool built by that person [means that] typically the knowledge leaves with that person,” he said.
Organizational and Security Aspects of Grid Computing
Another issue to watch out for is the social engineering aspect of grid.
Gentzsch said he's witnessed situations where there's no real preliminary agreement on how to share the resources that get strung together in the grid.
One day you've got your own resources that you can touch and control, and then the next day you've got to share them with heathens from other departments. Or worse, they move away to a place where you can't see or touch them anymore.
How do you handle that? Get policies up front that everybody can agree on, he said.
For example, when he was working with a grid that combined resources from the Army, Air Force and Navy research labs, department chiefs met every Monday afternoon and agreed on how to use resources over the coming week.
Then of course there's always security. As Gentzsch said, we've got more and better encryption and other types of security than ever, but there's still a concern from users that resources, data and applications aren't secure enough.
Especially in a regulated environment, when you're talking about patient data and patient records and about combining that with life science data or genome research, that's a highly sensitive area.
Beyond HIPAA (Health Insurance Portability and Accountability Act) issues, legal issues arise in countries such as Poland and France, where companies aren't allowed to share financial data with anyone from outside the country. What are the implications for grid in such a situation? Nobody on the panel even knew.
Bremmer, for his part, is looking at different types of grids to address the security and privacy issues. The institute has its own firewall, as does the hospital with which it's associated, and work is ongoing to figure out how to bridge the two.
That doesn't have to do with technology, however, as much as it does with figuring out who can access what.
“That's a real issue, and it's why, when you're [breaking down information silos], start with small types of projects,” he said. “It's important when you have the backing of IT that they can look at it, they can see how it's operating in this environment, and they can see that it's not causing security leaks. They're more comfortable, and they'll help you fight these battles.”
Another thing to keep in mind is that there's flexibility in how grid is architected. You can put information in areas that you would have secured anyway, even without grid.
If data is sensitive, you keep it behind firewalls, and you identify which machines have access to it. If it's that type of application, it might stay inside a firewall, or it may stay on a prescribed set of machines.
Beyond that, as Gentzsch pointed out, there are different applications that fit different grids.
“Grid can exist on people's laptops for very specific types of things,” he said. “Don't use proprietary or private data for only pre-research data. For more stuff that needs privacy, you build a very secure node in your grid for which special people are authenticated and authorized.
“There are these nodes today. There are very secure operating systems, like Trusted Solaris, where you can define different levels of security with different access keys for specific people. You put your data into one of those containers, and the whole thing is very safe.”
Of course, it always boils down to this: Start small. Define business performance issues and workload issues. Figure out which problem area to target. Profile workload patterns to establish cause and effect of pain points. Then, set up a prototype, and figure out how you'll measure success.
Is it a lot of work? Oh, yes. The dogs bark, but the caravan moves on.