Not content with organizing billions of Web documents, Google Inc. is leading the charge in turning library collections into searchable digital content.
In announcing Tuesday that it is working with five major libraries to scan millions of books for inclusion in its Web index, Google opened another battle in the intense competition among the leading search engines.
Its major search competitors will likely respond by further expanding their own indexes with sources outside of traditional Web pages, analysts said.
Meanwhile, Google's step into becoming a digital library drew enthusiasm as well as uncertainty from librarians. They were optimistic that the project would raise the profile of libraries in the age of the Internet but worried that book collections might get lost in the sea of searchable information on Google.
"This is valuable content," said Allen Weiner, a research director at Gartner Inc. "We've been focused on Web content, which has varying degrees of value, but this has a built-in marketplace and built-in demand."
Google's library project is part of the Google Print effort it started testing early this year and launched as a beta in October. Through Google Print, the Mountain View, Calif., company is working with publishers to include digital versions of books and periodicals in its search index of about 8 billion documents.
For the library project, Google is partnering with the New York Public Library and the libraries of Harvard University, Stanford University, the University of Oxford and the University of Michigan.
Scanning the libraries' collections will take years, but Google already has made a small percentage available through its search engine, said Susan Wojcicki, Google's director of product management.
Google has reached different arrangements with the libraries, whose collections each range from 7 million to 15 million books. It will scan the entire collections of the Stanford and Michigan libraries, while it will digitize works from 1900 and earlier at Oxford, Wojcicki said.
Harvard and the New York Public Library are starting with pilot projects of a subset of their collections.
"This is something we wanted to do when the company started, and it was the vision of founders before they even started Google," Wojcicki said, noting Google's origin as a library digitization project at Stanford. "This happened to be a time where Google had enough resources to take on such an endeavor."
Google earlier this year raised $1.7 billion in one of the year's most closely watched initial public offerings.
Google is classifying books into three categories to deal with copyright issues. For works in the public domain, Google plans to make the full text available as part of search results.
For those under copyright, Google will work with publishers to determine how much of the text will be shown, Wojcicki said. Where it has no publisher relationship, Google will show short excerpts or only bibliographical information.
Google also plans to display its sponsored links alongside the text of books where it has a publisher relationship and to share with publishers a portion of pay-per-click revenues, Wojcicki said. With public-domain works and the excerpts, no ads will be displayed.
In a preview page about the library results, Google also is displaying links for buying a book at an online bookseller or for borrowing it from a local library.