Tuple Data Model Faces Real World

By Peter Coffee  |  Posted 2001-08-27

Distributed-systems model takes on IT roles

A tuple is neither an exotic fungus nor an adults-only entertainment. Defined with misleading simplicity as "a series of typed values," the tuple can be to distributed computing what a base pair is to a molecule of DNA: Tuples carry information and provide their own form of organization—in a manner that may seem inefficient—but they enable adaptation to situations not foreseeable when a system was conceived.

A simple tuple might be something like "John Doe, 2/25/1980, 123-45-6789": a series comprising a character string, a date and a Social Security number. Pattern-matching algorithms readily match such "tuple signatures," with or without mechanisms for "don't care" values or for binding pieces of tuples to the variables in a computer program.
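To make the matching idea concrete, here is a minimal Python sketch. It is an illustration only, not the algorithm of any particular product: the names ANY, Var and match are invented for this example, and a template field is assumed to be a type, a literal value, a "don't care" marker or a variable to bind.

```python
from datetime import date

ANY = object()  # hypothetical "don't care" marker, invented for this sketch


class Var:
    """Hypothetical placeholder that binds a matched field to a name."""

    def __init__(self, name, type_=object):
        self.name = name
        self.type_ = type_


def match(template, tup):
    """Return a dict of bindings if tup fits template, otherwise None."""
    if len(template) != len(tup):
        return None
    bindings = {}
    for want, have in zip(template, tup):
        if want is ANY:
            continue                      # "don't care": accept anything
        if isinstance(want, Var):
            if not isinstance(have, want.type_):
                return None
            bindings[want.name] = have    # bind the field to a variable name
        elif isinstance(want, type):
            if not isinstance(have, want):
                return None               # wrong value type for this position
        elif want != have:
            return None                   # literal value did not match
    return bindings


record = ("John Doe", date(1980, 2, 25), "123-45-6789")

print(match((str, date, str), record))               # {}
print(match((Var("name", str), ANY, ANY), record))   # {'name': 'John Doe'}
print(match((str, str, str), record))                # None
```

A successful match with nothing to bind returns an empty dictionary, so callers should test the result against None instead of relying on truthiness.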

Tuples, combined with pattern-based retrieval, give IT systems the same flexibility as a large unsorted pile of papers—exploiting, to some degree, the decline of hardware costs compared with human costs during the first several decades of IT development. A costly data administrator or library scientist can try to anticipate the exact criteria for retrieving information at some future time, creating an administrative burden of classification and sorting; alternatively, a less expensive clerk can simply put everything in one pile and search, in response to ad hoc requests, for "bills due this week" or "bills with past-due amounts" or even "bills with green logos at the top." This is content-based retrieval, as opposed to the more rigid order- or address-based retrieval schemes traditionally imposed on IT architects.

Depending on the application, a process might generate a tuple for inspection by other processes (as when publishing a request for data or service); might inspect, without altering, a tuple from another process (as when accessing shared data); might locate, read and destroy a tuple (as when granting a request that will no longer need attention from any other process); or might search for a tuple with particular features, waiting until such an entry becomes available before proceeding (as when offering a service and awaiting requests from clients).
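The four behaviors just described (publish, inspect, remove, and wait for a match) can be sketched in a few lines of Python. This is a single-process, in-memory toy under stated assumptions: the class name TupleSpace, the method names write, read and take, and the ANY wildcard are this sketch's own choices and are not presented as the interface of any shipping product.

```python
import threading

ANY = object()  # hypothetical wildcard, invented for this sketch


def matches(template, tup):
    """A template field may be ANY, a type, or a literal value."""
    return len(template) == len(tup) and all(
        want is ANY
        or (isinstance(want, type) and isinstance(have, want))
        or (not isinstance(want, type) and want == have)
        for want, have in zip(template, tup)
    )


class TupleSpace:
    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def write(self, tup):
        """Publish a tuple for inspection by other processes."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()       # wake any waiting readers or takers

    def read(self, template, block=False, timeout=None):
        """Inspect a matching tuple without altering the space."""
        return self._get(template, False, block, timeout)

    def take(self, template, block=False, timeout=None):
        """Locate, read and remove a matching tuple."""
        return self._get(template, True, block, timeout)

    def _find(self, template):
        for tup in self._tuples:
            if matches(template, tup):
                return tup
        return None

    def _get(self, template, remove, block, timeout):
        with self._cond:
            if block:
                # Wait until some writer supplies a matching tuple.
                self._cond.wait_for(
                    lambda: self._find(template) is not None, timeout
                )
            tup = self._find(template)
            if tup is not None and remove:
                self._tuples.remove(tup)
            return tup


space = TupleSpace()
space.write(("request", "quote", "IBM"))
peek = space.read(("request", str, ANY))    # tuple stays in the space
taken = space.take(("request", str, ANY))   # tuple is removed
gone = space.read(("request", str, ANY))    # None: nothing left to match
```

Passing block=True to read or take gives the fourth behavior: the caller sleeps until a matching tuple is written or the optional timeout expires.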

If one process produces tuples with actual data values, while another process uses a template that will match any tuple with that signature of value types, then these processes can engage in procedure-call interaction without regard to where (or when) each process is running.
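A procedure call carried by tuples might look like the following Python sketch. Everything here is hypothetical scaffolding: MiniSpace is a deliberately tiny in-memory stand-in for a real, network-addressable space, and the "square" and "result" tags and the request identifier are conventions made up for the example. The request is written before the server thread even starts, to illustrate that the two sides need not be running at the same moment.

```python
import threading

ANY = object()  # hypothetical wildcard, invented for this sketch


class MiniSpace:
    """Tiny in-memory stand-in for a tuple space (not a real product API)."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def write(self, tup):
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def take(self, template, timeout=None):
        """Block until a matching tuple exists, then remove and return it."""

        def find():
            for tup in self._tuples:
                if len(tup) == len(template) and all(
                    t is ANY
                    or (isinstance(v, t) if isinstance(t, type) else t == v)
                    for t, v in zip(template, tup)
                ):
                    return tup
            return None

        with self._cond:
            if not self._cond.wait_for(lambda: find() is not None, timeout):
                return None               # timed out with no match
            tup = find()
            self._tuples.remove(tup)
            return tup


def square_server(space):
    """Serve one request; the template names only the signature of types."""
    _, request_id, n = space.take(("square", str, int))
    space.write(("result", request_id, n * n))


space = MiniSpace()
space.write(("square", "req-1", 12))      # client posts actual data values

server = threading.Thread(target=square_server, args=(space,))
server.start()                            # server starts later, finds request

reply = space.take(("result", "req-1", int), timeout=2)
server.join()
print(reply)                              # ('result', 'req-1', 144)
```

The request identifier is what lets a client pick its own reply out of a space shared with other callers; a production design would need unique identifiers and some policy for requests that are never answered.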

Tuple spaces (network-addressable repositories of tuples, shared by cooperating processes) thus provide a framework for distributed computing in environments, such as mobile and wireless networks, that don't fit the crucial assumptions (such as fast, persistent, synchronous links) that are integral to traditional IT models.

Crucially, the tuple-space model conceals underlying data representation and database architecture decisions from the applications that use the repository. An application might initially be supported by a simple data model that does not scale well with size or might use a single server that creates a single point of failure, but these initial choices could be replaced by more robust technology without affecting the application's flow of operation.
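The following Python sketch illustrates that kind of substitution in miniature. Both store classes are hypothetical and far simpler than anything a vendor would ship: ListStore is the naive first choice, while IndexedStore stands in for "more robust technology" by bucketing tuples on length and first field (which this sketch assumes is present and hashable). The application function pay_bills sees only write and take, so it runs unchanged against either one.

```python
from collections import defaultdict

ANY = object()  # hypothetical wildcard, invented for this sketch


def matches(template, tup):
    return len(template) == len(tup) and all(
        t is ANY or t == v for t, v in zip(template, tup)
    )


class ListStore:
    """Simplest possible backend: one unsorted list, linear scan."""

    def __init__(self):
        self._items = []

    def write(self, tup):
        self._items.append(tup)

    def take(self, template):
        for tup in self._items:
            if matches(template, tup):
                self._items.remove(tup)
                return tup
        return None


class IndexedStore:
    """Alternative backend: buckets keyed by tuple length and first field."""

    def __init__(self):
        self._buckets = defaultdict(list)

    def write(self, tup):
        self._buckets[(len(tup), tup[0])].append(tup)

    def take(self, template):
        if template[0] is ANY:
            # No usable key: fall back to every bucket of the right length.
            candidates = [
                b for (n, _), b in self._buckets.items() if n == len(template)
            ]
        else:
            candidates = [self._buckets.get((len(template), template[0]), [])]
        for bucket in candidates:
            for tup in bucket:
                if matches(template, tup):
                    bucket.remove(tup)
                    return tup
        return None


def pay_bills(store):
    """Application code: knows only write() and take(), not the backend."""
    store.write(("bill", "electric", 120))
    store.write(("bill", "water", 45))
    store.write(("memo", "lunch"))
    total = 0
    while (bill := store.take(("bill", ANY, ANY))) is not None:
        total += bill[2]
    return total


print(pay_bills(ListStore()))      # 165
print(pay_bills(IndexedStore()))   # 165
```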

However, flexibility invariably comes at a cost, and the tuple-space model should not be perceived as a practical alternative to conventional database tasks. Relational databases, for example, effectively encode design-time knowledge about the nature and meaning of records and tables to enable powerful and general query operations. Tuple spaces don't attempt to match this strength; what they provide is persistent storage of data with unpredictable structure and a relatively short useful lifetime.

Object databases offer their own distinctive advantages in representing objects with a complex structure, potentially incorporating hierarchical relationships (containers of multiple objects, some of which might themselves be containers) with fewer complex queries than might be needed by relational systems.

However, a tuple space, although likewise based on object-oriented matching of data types and encapsulation of behaviors as well as data, will not typically be designed for such complex relationships and will not replace an object database as a transparent, persistent extension of an application's transient object pool.

Sun Microsystems Inc.'s JavaSpaces (at java.sun.com/products/javaspaces) and IBM's TSpaces (at www.almaden.ibm.com/cs/TSpaces) are IT-oriented implementations of tuple-space communication, offering enterprise developers new freedom to explore distributed technologies without first devising cumbersome (and possibly dead-end) infrastructures.

TSpaces technology is already available for license from IBM and is featured in the company's mobile technology demonstrator: a fully wired Ford Explorer, dubbed the alphaWorks TechMobile, which debuted at IBM's Solutions conference in San Francisco earlier this month.

A moving vehicle presents a constantly changing resource environment, with varying quality of network service and with vehicle occupants entering and leaving the zone of its wireless connections. Tuple spaces enable dynamic matching of user needs with local vehicle systems and remote network services.

TSpaces serves as the TechMobile's "soft backbone" middleware, integrating voice recognition, Bluetooth wireless links and even eye-motion tracking input for vehicle control.

IBM's Enterprise TSpaces, scheduled to appear next month, will extend TSpaces with data replication, automatic state recovery and dynamic partitioning of tuple spaces to move this technology up the food chain from small-scale networks and pilot projects to larger production systems.

Peter Coffee is Director of Platform Research at salesforce.com, where he serves as a liaison with the developer community to define the opportunity and clarify developers' technical requirements on the company's evolving Apex Platform. Peter previously spent 18 years with eWEEK (formerly PC Week), the national news magazine of enterprise technology practice, where he reviewed software development tools and methods and wrote regular columns on emerging technologies and professional community issues. Before he began writing full-time in 1989, Peter spent eleven years in technical and management positions at Exxon and The Aerospace Corporation, including management of the latter company's first desktop computing planning team and applied research in applications of artificial intelligence techniques. He holds an engineering degree from MIT and an MBA from Pepperdine University, and he has held teaching appointments in computer science, business analytics and information systems management at Pepperdine, UCLA, and Chapman College.