IBM is working with the Coriell Institute for Medical Research to help the biobank operate the IT systems for its cryogenic freezers and better manage the 4.5 million samples of personalized genetic data.
Founded in 1953, Coriell is a nonprofit biomedical research institution and the largest biobank of living human cells. It runs the Coriell Personalized Medicine Collaborative research study, which aims to explore genome-informed personalized medicine.
"There's an awful lot of information we can glean from someone's genetic makeup to help us determine what drugs will work for that person or what complex conditions they might have a predilection for throughout the course of their life," Scott Megill, Coriell's CIO, told eWEEK.
The biobank sought IBM's help to manage and retain huge volumes of data at a cost a nonprofit could afford, Megill said.
DNA extracted from blood cultures is stored for many years. With one person's genome equal to 2 million points of data and about 1.5GB of information, storing this mass of data is a challenge.
"The sheer volume of data that's generated from genetic testing is unlike anything that we've seen before," Megill said. "It's almost a terabyte of information that's generated for one patient. It's really incumbent on us to put good tools and infrastructure in place to simply make that actually comprehensible."
Customizing treatments based on an individual's genes brings great potential for treatment of patients with conditions such as cancer, diabetes and heart disease. The data can better inform the decisions of physicians as they care for patients.
IBM monitoring software allows Coriell to protect its genetic samples from cryogenic freezer failures. By using the IBM XIV Storage System, Coriell has also reduced storage costs by 30 percent.
XIV is a thin-provisioning grid storage platform that allocates space on demand, Megill noted. Moving to thin provisioning and using XIV's grid mode allowed Coriell to significantly reduce the amount of storage space it consumes.
"Every file is divided into very small blocks and then spread evenly across the entire bank of drives, rather than files that go deep on a couple of drives in the array," Megill explained. "It means you can get the same sort of throughput out of the XIV using much slower spinning disks," he added, referring to standard 7,200-rpm drives.
"In the traditional storage area network array, you'd have to use much faster spinning disks to achieve the same throughput, and they're much more expensive," he said.
Meanwhile, Tivoli Omnibus and Netcool dashboard applications monitor the servers and databases tied to the cryogenic tanks and freezers to keep them operating at the right temperature.
"They basically collect data from different places and pull it together into dashboarding and alerting so that we know that a freezer is starting to get too hot or that a particular segment of our network isn't performing within acceptable thresholds," Megill said.
To manage the genetic data in laboratories, Coriell uses IBM WebSphere Lombardi Edition BPM (business process management) software, now integrated into IBM Business Process Manager.
"What Lombardi allows us to do is work directly with the process owners to determine what the application should do in a way that we've really not been able to do in traditional programming methods," Megill explained.
In Lombardi, Coriell uses flowcharts, or storyboards, to visualize the process of laboratory information management and inventory control.
"As advanced technologies have become affordable and available, Coriell is able to keep costs down and increase efficiency while also driving innovation in the area of personalized medicine," Andy Monshaw, general manager of IBM's global midmarket business, said in a statement. "Aligning the right technology infrastructure to meet its big data challenges, Coriell is well-positioned to promote tomorrow's medicines and treatments to help usher in a new era of medicine."
With one person's genome soon to equal 3 million points of data, Coriell will turn to IBM's technology to continue to manage the data overload.
"We've got a heck of a challenge ahead of us as we kind of enter into this world of full genome sequencing, so we're continuing to lean on IBM to help us take what's a very small organization and meet that challenge with tools that are appropriate to the scale of the effort," Megill said.