Microsoft jumps into the race to make print publications available to its search engine, following rival Google's Print Library effort. Unlike Google, Microsoft is vowing to make sure its effort is well-received.
Microsoft's MSN Internet search group on Tuesday detailed its new effort to catalog books, academic materials, periodicals and other print resources.
Dubbed MSN Book Search, the effort is part of the company's push to expand its search capabilities to find more of the information people seek when they come to its site and launch Web queries.
Microsoft officials estimated that users currently find the exact information they are seeking on MSN's search engine only 50 percent of the time, and said that adding the publications to its index should help the company improve on that figure.
The Redmond, Wash.-based software maker released few details regarding the timeframe for launching Book Search, stating only that it plans to launch a beta of the product sometime in 2006 that indexes roughly 200,000 documents and publications.
"We need to bring new materials online to help address those unanswered questions that people put into the search engine," said Danielle Tiedt, general manager for search content acquisition at MSN.
Microsoft's Book Search initiative follows a similar program launched last year by rival Google that has drawn heated criticism from authors and publishers for its inclusion of many copyrighted materials.
The Authors Guild, a nonprofit group representing authors, filed a lawsuit in September seeking to stop the Google Print Library Project, claiming the project violated their copyrights.
The Association of American University Presses and European officials such as President Jacques Chirac of France have also openly criticized the program.
For starters, Microsoft said that it has begun assembling the contents of many works considered to be in the public domain, or that are not covered by current copyrights. Search rival Yahoo Inc. has announced plans for a similar effort that collects entries from widely archived texts in order to avoid a confrontation with publishers.
Microsoft said it is going to great lengths to work with various organizations, including the Association of American University Presses, to ensure that people holding copyrights to the materials it indexes feel the program does not infringe on their rights.
Unlike Google, which has instituted a so-called "opt-out" program through which people can ask to have their works removed from its print effort, Microsoft said it would rely on an "opt-in" arrangement whereby it seeks permission to include many texts.
The company also said that it is working with the OCA (Open Content Alliance), a nonprofit group formed by the founders of the Internet Archive to create standards for digitizing publications, in order to avoid controversy over its plans. Yahoo is also working with the OCA to appease publishers.
MSN's Tiedt said that Microsoft is planning to build features into the search that let people bookmark certain pages, copy information for bibliographies, and create annotated quotes for reports.
The company may look to charge for some of the services down the road but has no concrete plans to that end, she said.
The company may also include technologies such as its ClearType software for making fonts easier to read in the Book Search offering.
On the topic of assuaging copyright issues, Tiedt said that Microsoft is proceeding with caution and seeking the advice of a number of parties to avoid the same type of scrutiny Google has attracted with its program.
"We know that there are still a lot of issues to consider around digitizing copyrighted content," Tiedt said. "In the long term, we want to get as many books as possible into the index, and we're going to consult with authors, publishers and libraries in getting the information online, before starting to look at business models, or even advertising, to grow the product."