The next project for this database pioneer takes shape in the form of StreamBase Systems Inc., a company that's churning out software designed to process, analyze and act on real-time data "within milliseconds of its arrival." Stonebraker is StreamBase's founder and chief technology officer.
StreamBase announced its Stream Processing Engine at the DEMO conference on Monday in Scottsdale, Ariz. eWEEK.com Database Editor Lisa Vaas recently got a chance to talk with Stonebraker about the issue of real-time data analysis, about how it leaves relational databases in its dust and, most importantly, how this cutting-edge technology is poised to transform our society. Financial services comes to mind, of course, but what really fires up Stonebraker are prospects like revolutionizing the care of emergency-room patients, the care of soldiers on the front lines or simply the ability to find your child when she's lost at Disney World.
You've said that streaming data on the fly is something that ordinary relational databases can't handle. Why?
Here's a quick, simple little problem. This was a pilot we were asked to do early on. [It was] a large mutual funds company. They subscribe to every feed on the planet, [including feeds such as Reuters]. They have a current application that watches each feed to determine if the data is late, so they can say, "Don't trust Reuters now, the feed is screwed up."
They defined "late" as [when the] inter-arrival time of ticks between the same stocks is greater than a certain number. You see an IBM tick, and if you don't see another IBM tick in x seconds, it's an indication of late data.
They wanted to issue an alarm if you saw a late tick. Then they wanted to say, "If you see 100 late ticks that are coming from the feed vendor, then ring the red telephone."
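The rule Stonebraker describes can be sketched in a few lines of Python. This is only an illustration of the logic, not StreamBase's engine or the fund company's C++ application. The class name `LateTickMonitor`, the method `on_tick`, the event strings and the default values are all hypothetical, and the sketch assumes one monitor per feed and that lateness is detected when the next tick for a symbol finally arrives.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LateTickMonitor:
    """Tracks inter-arrival gaps per symbol for a single feed (hypothetical sketch)."""

    max_gap_seconds: float        # the "x seconds" threshold; the real value is not given
    alarm_threshold: int = 100    # late ticks before escalating to the "red telephone"
    last_seen: Dict[str, float] = field(default_factory=dict)
    late_count: int = 0

    def on_tick(self, symbol: str, timestamp: float) -> List[str]:
        """Record one tick and return the events it triggers, if any."""
        events: List[str] = []
        previous = self.last_seen.get(symbol)
        self.last_seen[symbol] = timestamp

        # A tick is late when the gap since the previous tick for the
        # same symbol exceeds the threshold.
        if previous is not None and timestamp - previous > self.max_gap_seconds:
            self.late_count += 1
            events.append("late_tick")
            # Escalate once, at the moment the count reaches the threshold.
            if self.late_count == self.alarm_threshold:
                events.append("red_telephone")
        return events


# Example: a 5-second threshold on a hypothetical feed.
monitor = LateTickMonitor(max_gap_seconds=5.0)
monitor.on_tick("IBM", 0.0)     # first IBM tick, nothing to compare against
monitor.on_tick("IBM", 12.0)    # 12-second gap, reported as a late tick
```

A production version would also need a timer to flag a symbol that stops ticking altogether, since this sketch only notices a gap once the next tick shows up.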
The current application is written on top of bare metal in C++. They were unhappy with the performance of the current application, and it was hard to maintain. And expensive.
On this application, they said, "How fast can you go?" We processed about 150,000 messages per second on this, on a $1,500 PC, a commodity piece of hardware. Their current production application does about 3,000 messages per second. The best we could get out of one of the very popular relational databases was 900 messages per second.