Research Head Plots the Future

Under Richard Rashid, Microsoft Research works on advances in data mining, security and software development techniques.

While once-vaunted corporate research labs such as Xerox Corp.'s Palo Alto Research Center Inc. struggle, Microsoft Research continues to grow steadily in reputation and size under Richard Rashid, senior vice president of research at Microsoft Corp.

Rashid, who founded MSR 11 years ago, led the development at Carnegie Mellon University in Pittsburgh of the Mach operating system before joining Microsoft. The code from that software has become part of at least three Windows operating system rivals: the Free Software Foundation's GNU Hurd, Hewlett-Packard Co.'s Compaq Tru64 Unix and Apple Computer Inc.'s Mac OS X.

While MSR has yet to produce seminal innovations such as those that came out of PARC, its work is seen in many Microsoft products. Senior Writer Anne Chen and Technology Editor Peter Coffee met with Rashid earlier this month to discuss MSR and how its research will affect enterprise computing. The following is an excerpt. For the complete interview, go to

eWeek: Are there particular technologies you're developing right now that you expect will make a big impact on enterprises?

Rashid: There are a lot of things going on. In the data mining area, it's about the total issue of how to manage and federate large databases and do data mining against that. We've recently been working with the Sloan Digital Sky Survey and the National Virtual Observatory on the notion that you can use the emerging XML Web services infrastructure to create federated databases so that you can then do data mining against multiple organizations. ... The same kinds of technologies are applicable in corporate settings as well.

A lot of exciting work is going on in the areas of security and trust, especially notions of being able to build federated trust infrastructures. Some of that work is already starting to show up in things like the new security spec for the [Global XML Architecture]. I think the goal there is to say how can we build an infrastructure for exchanging information that allows people to have many different kinds of trust defined, to have different notions of what a license is for accessing information and where the trust for that particular license comes from.

I'm really jazzed about the notion that we can automate a lot of procedures that today cost corporations a huge amount of money in terms of maintaining their IT infrastructure. We've been putting a lot of research energy into technologies that automate database management tasks. A lot of that boils down to developing the fundamental algorithmic base to manage the state of a large number of systems, to recognize various kinds of issues or problems that occur and be able to take corrective actions.

And in the networking space, there's a lot of exciting stuff going on in terms of how to build ... extremely high-performance networks and isolated trust environments. ... Even though many things may be connected, you will have effectively private subnets within a larger network environment. There needs to be new ways of creating effectively trusted subnetworks within a corporate LAN, so it's not just one big open space.

eWeek: What's the time frame for seeing these types of features in products?

Rashid: Already, with .Net server, there will be some technologies that we delved into in research that will help to automate a lot of the management tasks that exist in a corporate environment. By the time you start to see the next consumer version of Windows that follows .Net server, you'll begin to see more and more trust technologies built directly into the system. Support for IP [Version] 6 and [IP Security] you see already in Windows XP will be much more tightly integrated in that time frame ... and [there will be] much better support for automating all sorts of standard management tasks.

eWeek: Do you watch people unpack a new Windows machine and really examine the trajectory of how they go from knowing nothing? What have you learned?

Rashid: What's becoming really exciting ... for us is [products that have] the ability, if the user chooses, to report back to us when there's a failure. So with Office XP, which really introduced this notion, and Windows XP, there's a feature that, when there's a crash, you can, at your choice, have the system report back information to Microsoft about what went wrong. Our research group works with the product groups to do data mining on that information, and that's hugely valuable. We're finding the opportunity to really be able to understand what's happening to people in the field and relate that back to the code to try to identify the main things that are causing people grief. We can then go back and fix our own code or work with device driver writers, independent hardware vendors or other software vendors.

eWeek: Is this relatively new technology starting to yield statistically useful amounts of information?

Rashid: Yes, it's actually very dramatic. If a new device driver gets released with a problem, we'll know about that within a few days because you'll immediately see it pop up in the statistics and then we can talk to the manufacturer. If the error was ours, then we can immediately try to fix it and make that fix available through Windows Update.
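[Editor's illustration: The kind of tally Rashid describes, in which a faulty driver "pops up in the statistics," can be sketched in a few lines of Python. This is only a minimal sketch. The report format, the module names and the threshold are assumptions made up for the example and do not describe Microsoft's actual error-reporting data or tools.]

```python
from collections import Counter

# Hypothetical crash reports as (day, faulting_module) pairs.
# The structure and the module names are invented for illustration.
reports = [
    (1, "video.sys"), (1, "net.sys"),
    (2, "video.sys"), (2, "newcam.sys"), (2, "newcam.sys"),
    (3, "newcam.sys"), (3, "newcam.sys"), (3, "newcam.sys"), (3, "net.sys"),
]

def crash_counts(reports):
    """Count crash reports per faulting module."""
    return Counter(module for _day, module in reports)

def flag_spikes(reports, threshold):
    """Return modules with at least `threshold` reports, most frequent first."""
    counts = crash_counts(reports)
    return [module for module, n in counts.most_common() if n >= threshold]

if __name__ == "__main__":
    # The threshold of 4 is an arbitrary assumption for this small sample.
    print(flag_spikes(reports, threshold=4))
```

[A real system would presumably compare counts against a baseline per driver version rather than a fixed threshold; the fixed cutoff here just keeps the sketch short.]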

One of the things I'm really excited about is a group in research called the Programmer Productivity Research Center. One of the things they've been working on is this notion that you can build a lifetime XML database of information about the applications and the systems we build that sort of go from the original specification, record that and try to turn it into a much more mathematical specification. All the way through the development of code, we can keep track of who's written what part of the code and keep track of all the test cases that have been run. When we get error reports from the field, if we've got that database, we can then relate it back to the original code and find the people who worked on it.

Being able to manage that whole software life cycle in a way that is more scientific has been a dream in the software universe for a while, and I think we're getting close to really being there. With the .Net server release, that'll be the first time a number of these technologies will have been used throughout the entire development cycle of a product, and we're certainly seeing a significant reduction in certain types of errors in our testing.

eWeek: How does the Trustworthy Computing initiative extend into the lab?

Rashid: One of the ways it extends in is, through our Programmer Productivity Research Center, we have developed a number of technologies for doing automated, statistical analyses of applications.

In an early form, just as Windows 2000 was about to hit the streets, was a tool that does path-sensitive static analysis of software and can look for a variety of errors, many of which are security errors such as buffer overruns. With Windows XP, that was refined. With .Net server, we've pushed that technology even further. Now it's a required part of the build process for developers to use those tools. It's part of this process to really try to make a more trustworthy development process internally where a lot of common errors can be removed before we even get into the testing phase.

Another area we've put a lot of energy into, which hasn't had quite as much impact on the product group yet, is this notion of creating executable specifications for software. It's been used in a couple of our product groups already. It's relatively esoteric, but it's something where we're able for various kinds of applications now to literally describe them in such a way that even though the spec looks like English, it's still executable. We can test the application against the spec once the application is written. We're trying to expand that use as time goes on. Technology transfer is a full contact sport, and, in this case, it's really a full contact sport.
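[Editor's illustration: The general idea of an executable specification, meaning readable rules that can also be run against a finished program, can be sketched in Python. This is a loose sketch under stated assumptions, not the technology Rashid refers to. The `clamp` function, the `SPEC` rules and the `check` helper are all hypothetical names invented for the example.]

```python
def clamp(value, low, high):
    """Hypothetical implementation under test: force value into [low, high]."""
    return max(low, min(value, high))

# Each rule pairs an English sentence with a predicate over the inputs
# (v, lo, hi) and the result r. Rule wording is invented for illustration.
SPEC = [
    ("result is never below low",
     lambda v, lo, hi, r: r >= lo),
    ("result is never above high",
     lambda v, lo, hi, r: r <= hi),
    ("in-range values are returned unchanged",
     lambda v, lo, hi, r: r == v if lo <= v <= hi else True),
]

def check(impl, spec, cases):
    """Run impl on every case and return (rule text, case) for each violation."""
    failures = []
    for v, lo, hi in cases:
        r = impl(v, lo, hi)
        for text, rule in spec:
            if not rule(v, lo, hi, r):
                failures.append((text, (v, lo, hi)))
    return failures

cases = [(-5, 0, 10), (3, 0, 10), (42, 0, 10)]

if __name__ == "__main__":
    # An empty list means the implementation satisfied every rule on these cases.
    print(check(clamp, SPEC, cases))
```

[Checking a handful of sample cases only tests the program against the spec; it does not prove the program correct for all inputs.]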