After 15 years of collecting DNA data on autism, Autism Speaks has amassed a huge trove of data on some 12,000 people affected by the condition. Now the group is uploading about 100 TB of that data to the Google Cloud Platform, where for the first time it can be stored in one place and accessed more easily by researchers around the world.
The transfer of the data to the Google Cloud Platform was announced June 10 by Robert Ring, the chief science officer of Autism Speaks, the world's largest autism science and advocacy organization, in a guest post on the Google Cloud Platform Blog.
The DNA is being sequenced through the group's AUT10K program in collaboration with the Centre for Applied Genomics at the University of Toronto's Hospital for Sick Children, with sequencing already completed for about 1,000 cases and an additional 2,000 samples nearing completion, wrote Ring.
The sheer volume of data and the ongoing research are the key reasons for the move to the Google Cloud Platform, he explained. "From the beginning, we realized that the amount of data collected by AUT10K would create many challenges. We needed to find a way to store and analyze massive data sets, while allowing remote access to this unprecedented resource for autism researchers around the world."
That's why the Google platform was chosen, he wrote. "In the beginning, we shared genomic information by shipping hard drives around the world. Downloading even one individual's whole genome in a conventional manner can take hours—the equivalent of downloading a hundred feature films. And by the time AUT10K achieves its milestone of 10,000 genomes, we knew we'd have a database on the petabyte scale."
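The scale Ring describes can be checked with rough arithmetic. The sketch below is illustrative only: it assumes "petabyte scale" means exactly 1 PB (decimal) spread evenly over 10,000 genomes, and it assumes a hypothetical 100 Mbit/s connection. Neither figure comes from Ring's post.

```python
# Back-of-the-envelope check of the scale figures quoted above.
# Assumptions (not from the article): "petabyte scale" is taken as exactly
# 1 PB in decimal units, and the 100 Mbit/s link speed is hypothetical.

PETABYTE = 10**15  # bytes
GIGABYTE = 10**9   # bytes


def bytes_per_genome(total_bytes: int, genomes: int) -> float:
    """Average storage per genome if the total is split evenly."""
    return total_bytes / genomes


def download_hours(size_bytes: float, link_mbit_per_s: float) -> float:
    """Hours needed to move size_bytes over a link of the given speed."""
    bits = size_bytes * 8
    seconds = bits / (link_mbit_per_s * 10**6)
    return seconds / 3600


per_genome = bytes_per_genome(PETABYTE, 10_000)
hours = download_hours(per_genome, 100)

print(f"{per_genome / GIGABYTE:.0f} GB per genome")  # 100 GB per genome
print(f"{hours:.1f} hours at 100 Mbit/s")            # 2.2 hours at 100 Mbit/s
```

Under those assumptions, each genome averages about 100 GB and takes a little over two hours to download, which is in line with Ring's remark that a single whole genome "can take hours."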
Using the Google Cloud Platform, Autism Speaks researchers can store data and enable real-time, collaborative access among researchers around the world, wrote Ring. "We are in the process of uploading 100 terabytes of data to Google Cloud Storage, and from there, we can import it into Google Genomics. Google Genomics will allow scientists to access the data via the Genomics API, explore it interactively using Google BigQuery, and perform custom analysis using Google Compute Engine."
The key benefit of the data transfers and central storage is efficiency, he wrote. "Researchers will spend less time moving data around and more time analyzing data and collaborating with colleagues. We hope this will enable us to make discoveries and drive innovation faster than ever."
About one in 68 children in the United States is on the autism spectrum, according to Ring. "Caused by a combination of genetic and environmental influences, autism is characterized, in varying degrees, by deficits in social communication and interaction, along with the presence of repetitive patterns of behavior, interests or activities. Many individuals with autism also face a lifetime of associated medical conditions (e.g. anxiety, sleep problems, seizures and/or GI symptoms) that frequently contribute to poor outcomes."
The use of the Google Cloud Platform by autism researchers can drastically improve the group's research and knowledge, he wrote. "Together, we hold the capability of accelerating breakthroughs in understanding the causes and subtypes of autism in ways that can advance diagnosis and treatment as never before."
Google's connections with health care are deep. In March 2014, Google expanded its involvement in medical science by joining the Global Alliance for Genomics and Health, part of an effort to advance genomics research that could keep humans healthier. Some 146 organizations from 21 countries are members of the group so far.
As part of its efforts to bring innovation to the genomics alliance, Google is proposing the use of a simple Web-based API to import, process, store and search genomic data at scale, as well as a collection of in-progress open-source sample projects built around the common API, according to an earlier eWEEK report.
In September 2013, Google launched a new health care company, called Calico, to find ways to improve the health and extend the lives of human beings. The startup is focusing on health and well-being, and in particular, the challenge of aging and its associated diseases, according to Google.
Calico wasn't the first health care-related push undertaken by Google. Back in 2008, Google launched its Google Health initiative, which aimed to help patients access their personal health records no matter where they were, from any computing device, through a secure portal hosted by Google and its partners, according to earlier eWEEK reports. Google retired the service in January 2012 and kept users' data available for download until January 2013.