With a growing number of users of an exploding collection of data, originating and residing in heterogeneous systems, NASA's Earth Science Data and Information System Project faces challenges that are recognizable to many enterprise IT builders.
Applying the principles of an SOA (service-oriented architecture) and taking advantage of the service discovery capabilities of this year's Version 3 update of the UDDI (Universal Description, Discovery and Integration) protocol, the National Aeronautics and Space Administration and its contractors have streamlined data access while also enabling a richer ecosystem of customized applications for analysis and interpretation.
As changes in climate, polar ice coverage and other environmental parameters become urgent global concerns, NASA's SOA modernization is improving earth science understanding and enabling better policy decisions.
In the beginning
Chartered in 1990, the Earth Science Data and Information System Project, or ESDIS, has a broad portfolio whose key tasks include project management, systems engineering and technical direction of systems that archive and distribute earth science data, along with the definition of high-level standard data products.
The roots of the system go back four decades to the United States' first Earth-observing satellites. Those resources now comprise multiple constellations of active spacecraft as well as many legacy data sets.
NASA defines 12 different domains in which earth science information has beneficial applications: agricultural efficiency, air quality, aviation, carbon management, coastal management, disaster management, ecological forecasting, energy management, homeland security, invasive species, public health and water management.
These varied interests entail radically different combinations of urgency, complexity and security. Space-based systems also yield petabytes of data, pushing the state of the art in data visualization and creating both technical and managerial challenges.
The bottlenecks of conventional database architectures and application development techniques don't suit the growing diversity of emerging needs for this information, said Robin Pfister, lead information management system engineer for the ESDIS effort at NASA's Goddard Space Flight Center, in Greenbelt, Md.
“In 2000, we had one human-machine interface for data search and access,” said Pfister. “Due to different standards and search approaches used in distributed archives, result sets were difficult for an end user to evaluate.”
Results were not formatted consistently, she added, and about 25 percent of the results that should have been returned from any given search were not seen by users because of inconsistent availability.
Growing commercial use of earth sciences data has yielded a growing variety of commercial off-the-shelf software, but Pfister's ESDIS team was concerned that any such commercial solution might not meet requirements for extensibility and evolution.
As a result, five years ago, long before SOA was a mainstream buzzword, Pfister's team began working with contractors to define an SOA as the foundation for ESDIS' future platform.
“The SOA-based approach gave us flexibility to support the community at the enterprise level beyond what we originally thought,” Pfister said.
“The fact that this solution can support a wide range of providers has attracted a lot of interest from other areas of NASA and other agencies, in both science and nonscience applications.”
ESDIS built a metadata clearinghouse and order broker dubbed ECHO, a somewhat strained acronym for Earth Observing System Clearinghouse. Operational development of ECHO began in late 2000, with availability to users beginning in November 2002.
The resulting freer flow of data, though, highlighted weakness on the applications side, according to Pfister.
“While trying to meet the general needs of all users, the old system left most users less than satisfied because they couldn't perform tasks specific to their science needs,” she said.
“As a result of stakeholder interviews during formulation of ECHO, we learned the data access needs and expectations of the NASA Earth Research community were changing,” Pfister continued.
“We extended the objectives of the system to accommodate these known changes and to be flexible to accommodate future, inevitable changes.”
The expansion of the project charter to a broader services perspective began in mid-2004. The registry became operational in the second quarter of this year.
Avoiding a path of ungovernance
Blueprint Technologies Inc., of Vienna, Va., worked as a subcontractor to Global Sciences & Technology Inc., also in Greenbelt, in developing NASA's ESDIS SOA solution.
Michael Burnett, principal architect for Blueprint, said the project's proponents knew that they had to prove the value and the capability of the basic ECHO platform before it would attract the commitment of data-providing partners.
“The first several operational releases of the system were focused on collecting data because without that we didn't think the services side would take off,” Burnett said.
“After we were confident in the ability and the richness of the data registry, and we'd built in security and management features, we moved into the services registry.”
Pfister, meanwhile, promoted the system to the users whose participation and support were crucial to its success.
“Our science community was already somewhat frustrated by being confined to the single-user interface and needing to find data from one location and services from another and having to pull that all down to their workstations,” Pfister explained.
“They were excited but not immediately believers in ECHO. I identified opportunities where potential service and data partners would be meeting, and we made sure that we'd be there and presenting posters and giving papers. When we found individuals who were interested, we made sure we kept in touch. ECHO is nothing without the partners.”
Blueprint's technology road map for the project avoided unnecessary risks: ECHO's normalized data model currently resides in an Oracle Corp. Oracle Database 9i database on Sun Microsystems Inc. servers running Solaris 9. Migration to Oracle Database 10g and Solaris 10 is planned for early next year, said Pfister, in what she referred to as the “Version 8” release of ECHO to its users.
Global and Blueprint this year adopted Systinet Registry from Systinet Corp. for publication and governance of ECHO's services. “We chose Systinet's registry technology for its strong standards-based support, including support of the UDDI V3 standard,” said Burnett. “We also found Systinet Registry provided the flexibility we needed for our diverse user community.”
Version 3 of UDDI, formalized in February, is at the core of the ESDIS modernization of NASA's earth resources data capability, dramatically improving the flexibility and responsiveness of this resource in meeting national needs.
As is often the case with a rapidly developing technology, UDDI may be on many enterprise IT architects' mental lists of things they've examined and found to fall short of their needs.
Version 3 of the protocol, however, made key advances in several areas that make it more than a mere look-up list of services—paving the way for the dynamic composition of new applications that the ESDIS team hoped to offer.
UDDI 3 makes UDDI keys more convenient to use, in much the way that DNS (Domain Name System) names are more convenient than simple IP addresses. UDDI 3 incorporates digital signature mechanisms for greater confidence in using UDDI with external parties.
It also offers policy management and a subscription interface, making it an effective platform for services that can reach out to each other in value-adding ways.
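To make the DNS comparison concrete, the sketch below contrasts the two key styles: an opaque UUID of the kind earlier UDDI versions assigned, and a readable, domain-based key of the kind UDDI 3 permits. It is a simplified illustration, not ECHO's or Systinet's code. The function name `describe_key` and the sample key under `example.org` are hypothetical, and the check covers only the basic `uddi:domain:name` shape rather than the full UDDI 3 key grammar.

```python
import uuid


def describe_key(key: str) -> str:
    """Classify a registry key as an opaque UUID or a readable domain-based key."""
    try:
        uuid.UUID(key)  # raises ValueError if the text is not a UUID
        return "uuid"
    except ValueError:
        pass
    parts = key.split(":")
    # Simplified test: "uddi", then something that looks like a DNS name,
    # then at least one publisher-chosen segment.
    if len(parts) >= 3 and parts[0] == "uddi" and "." in parts[1]:
        return "domain"
    return "unknown"


opaque_key = "4cd7e4bc-648b-426d-9936-443eaac8ae23"       # machine-assigned, hard to read
readable_key = "uddi:example.org:echo:granule-search"     # hypothetical publisher-chosen key
```

A publisher-chosen key tells a reader who owns the entry and roughly what it is, which is the same convenience a host name offers over a raw IP address.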
“ECHO provides several fundamental shifts for the Earth Observation community,” said Blueprint's Burnett. “First, as a metadata registry, ECHO allows for many different providers from different organizations to publish their data in … a normalized data model.”
Users can then discover data of interest by searching for key metadata characteristics without prior knowledge of the existence or source of particular data sets, he added.
The use of UDDI goes further, though, by establishing what Burnett and other UDDI proponents have called a marketplace environment—one in which data and applications, packaged in service interfaces, can be offered, discovered and used.
Crucially, services also can be chained together. “The ECHO infrastructure enables interoperability between these resources,” Burnett emphasized, so that “data from multiple providers may be combined as input to an algorithm—a service—from a different service provider.
ECHO allows a user to discover an interesting piece of data and ask this fundamental question: What can I do with this data?”
ECHO can then offer pointers to services that might be relevant. “We've allowed service brokering: one data publisher, a service from another, and we'll support and broker that. It's the first step toward service chaining, and that's all included in [ECHO] Release 8,” said Burnett.
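The brokering idea Burnett describes, in which a user finds data and then asks what can be done with it, can be sketched as a lookup that matches a data set's type against the input types each registered service declares. Everything here is an assumption for illustration: the class names, fields and sample entries are hypothetical and do not reflect ECHO's actual data model or API.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class DataSet:
    """A published data set, described only by metadata (hypothetical fields)."""
    name: str
    provider: str
    data_type: str


@dataclass(frozen=True)
class Service:
    """A registered service and the data types it declares it can consume."""
    name: str
    provider: str
    accepts: FrozenSet[str]


def services_for(dataset: DataSet, registry: List[Service]) -> List[Service]:
    """Answer 'What can I do with this data?' by matching on declared types."""
    return [s for s in registry if dataset.data_type in s.accepts]


# Hypothetical registry: the data comes from one provider, the services from others.
registry = [
    Service("reproject", "provider-b", frozenset({"sea-surface-temperature", "ice-extent"})),
    Service("subset-by-region", "provider-c", frozenset({"ice-extent"})),
]
sst = DataSet("sst-daily", "provider-a", "sea-surface-temperature")
matches = [s.name for s in services_for(sst, registry)]
```

Because the match depends only on published metadata, the data provider and the service provider never need to know about each other in advance; the registry does the pairing.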
What makes this a breakthrough in earth sciences data is also what makes ECHO a model for enterprise IT—that is, the ability to integrate raw data and abstract analytic services into front-line decision support applications.
In the case of ESDIS, this might mean more accurate and timely warnings of hurricane landfall locations; in the enterprise, a similar approach could knit raw information on warehouse stocks and shipping flows into valuable guidance for store-by-store promotions or other opportunities.
ECHO's benefits, NASA's Pfister said, have been both technical and organizational. A pleasant surprise to users, she said, has been “the ability to plug in specialized search modules that meet domain-specific needs, for example, one that improves accuracy of result sets beyond that provided in standard database search engines.”
ECHO users are devising custom data interfaces—18 at the most recent count, according to Pfister.
What were once separate data sets are now accessible through a common data model with 60 million registered items, Pfister added, and a 100 percent rate of returning all relevant results to user queries.
In the long run, though, the nontechnical benefits may be even more interesting: “It may be too early to tell,” Pfister cautioned, but “early activities show increased collaboration in the community where members are sharing solutions and services that may ultimately lead to more efficient solutions to science problems, applications and decision support.”
Technology Editor Peter Coffee can be reached at [email protected].