Many features that end up in Microsoft Corp. products start as projects in the company's research division. Rick Rashid, senior vice president for research, in Redmond, Wash., sat down with eWEEK Senior Editor Peter Galli at the Web 2.0 Conference in San Francisco recently to discuss search technologies; the "Longhorn" version of Windows; and what teams in Redmond, Beijing and Cambridge, England, are working on.
What can you tell us about Sapphire, the user-interface project designed to make the search for and storage of information more intuitive?
That work is now pretty much concluded as a research project, and we have made that technology available to the [Microsoft] product teams, so you'll probably see some of those ideas show up in Longhorn, the next major release that we have coming out on the system side. One of the things Sapphire looked at was the notion of keeping track of everything you are doing when you are using a computer, keeping that in a database, and then being able to cross-correlate and relate that.
So how exactly does it do that?
We looked at how users used their systems, and we found that some kinds of information were relevant and others weren't. So, like what windows were up on the screen when I was doing this? Well, that's data we can keep; we know what it was. But we didn't find that it was super-strongly correlated with anything users cared about. One of the things people did care a lot about was time and things like when they worked on a document.
One of the things we also found is that these sorts of timelined browsers, which let you see what you've been doing shown as a timeline and maybe correlated with your calendar, weather and other events that are strong in your memory, turn out to be a pretty good way for a lot of people to access information.
You have research teams in Beijing, Cambridge and Redmond working on search technologies. What are those teams focusing on, specifically?
There has been a fairly significant emphasis on information retrieval for a long time. More recently, we have been focusing a lot on working with the MSN [Microsoft Network] and Windows groups to transfer those technologies. What has been going on broadly within Redmond has been a greater emphasis on new ways of thinking about how to bring the user's personal information to the search process. In China, there has been a lot of work on media search (image, video and audio search), and they have also been looking at Asian languages, but they also have some base information retrieval people. In Cambridge, we have people working on theoretical information retrieval algorithms and on ways of representing search and browsing histories.
You mentioned that your research teams have been working closely with the MSN search teams. As search is the big issue today with competition growing between Google, Yahoo and MSN, how have you stepped up the focus of your team to work with MSN on a search engine?
Every day has a new big issue. We have had a lot of people working on information retrieval technologies for a long time, more than 10 years, and we have shipped a lot of stuff already in Windows and other products, like Encarta. What's new is the fact that for things like Web search, we now have a product group that we can talk to, whereas before, Microsoft wasn't focused in that area on the product side, and so most of our energy was focused on the Windows or SharePoint or SQL [Server] or Office teams. Now, MSN is really getting into that space on their own, and so there's more of an emphasis on that.
Real-time blogs are the new communication frontier, and there is a lot of interest in technology around the indexing and searching of these. What is Microsoft Research up to on that front?
We are doing something in that space, but I can't talk about it because it's with a product team now.
How about more personalized search?
We have been doing a lot of work around that for a long time, so this is not new work for us. We have long looked at what it means to personalize, how you think about that, the personalizing views, personalizing the access to information and being able to qualify information that is out there. One thing that the Web is very bad at is allowing us to specify which sources we trust and will accept information from. That is a different kind of personalization, one that says a lot of what's out there is junk and so allows you to select the set of data sources you trust and get that information first.
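As an illustration only, the source-trust idea Rashid describes could be sketched roughly as follows. This is not Microsoft's method; the `Result` type, the `rank_trusted_first` function and the example source names are all hypothetical, and the sketch simply assumes each result carries a source name and a relevance score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """A hypothetical search result: where it came from and how relevant it is."""
    source: str
    title: str
    score: float


def rank_trusted_first(results, trusted_sources):
    """Order results so those from user-trusted sources come first.

    Within each group (trusted, untrusted) results are ordered by
    descending relevance score. This is an assumed, simplified policy.
    """
    return sorted(
        results,
        # False sorts before True, so trusted sources lead the list.
        key=lambda r: (r.source not in trusted_sources, -r.score),
    )


results = [
    Result("random-blog.example", "A", 0.9),
    Result("trusted.example", "B", 0.5),
    Result("trusted.example", "C", 0.7),
    Result("other.example", "D", 0.8),
]
ranked = rank_trusted_first(results, {"trusted.example"})
print([r.title for r in ranked])
```

Here a lower-scoring result from a trusted source still appears ahead of a higher-scoring one from an unknown source, which is one simple reading of "get that information first."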