Google's Library Project Could Drive Content Contest

 
 
By Matthew Hicks  |  Posted 2004-12-14
 
 
 
 
 
 
 

Google's project to turn book collections from five major libraries into searchable digital content marks the latest shift toward moving search engines beyond the Web, experts say.

Not content with organizing billions of Web documents, Google Inc. is leading the charge in turning library collections into searchable digital content. In announcing Tuesday that it is working with five major libraries to scan millions of books for inclusion in its Web index, Google opened another battle in the intense competition among the leading search engines. Its major search competitors will likely respond by further expanding their own indexes with sources outside of traditional Web pages, analysts said.
Meanwhile, Google's step toward becoming a digital library drew enthusiasm as well as uncertainty from librarians. They were optimistic that the project would raise the profile of libraries in the age of the Internet but worried that book collections might get lost in the sea of searchable information on Google.
"This is valuable content," said Allen Weiner, a research director at Gartner Inc. "We've been focused on Web content, which has varying degrees of value, but this has a built-in marketplace and built-in demand."
Google's library project is part of the Google Print effort it started testing early this year and launched as a beta in October. Through Google Print, the Mountain View, Calif., company is working with publishers to include digital versions of books and periodicals in its search index of about 8 billion documents.
For the library project, Google is partnering with the New York Public Library and the libraries of Harvard University, Stanford University, the University of Oxford and the University of Michigan. Scanning the libraries' collections will take years, but Google already has made a small percentage available from its search engine, said Susan Wojcicki, Google's director of product management.
Google has reached different arrangements with the libraries, whose collections range from 7 million to 15 million books each. It will scan the entire collections of the Stanford and Michigan libraries, while it will digitize works from 1900 and earlier at Oxford, Wojcicki said. Harvard and the New York Public Library are starting with pilot projects covering a subset of their collections.
"This is something we wanted to do when the company started, and it was the vision of founders before they even started Google," Wojcicki said, noting Google's origin as a library digitization project at Stanford. "This happened to be a time where Google had enough resources to take on such an endeavor."
Google earlier this year raised $1.7 billion in one of the year's most closely watched initial public offerings.
Google is classifying books into three categories to deal with copyright issues. For works in the public domain, Google plans to make the full text available as part of search results. For those under copyright, Google will work with publishers to determine how much of the text will be shown, Wojcicki said. Where it has no publisher relationship, Google will show short excerpts or only bibliographical information.
Google also plans to display its sponsored links alongside the text of books where it has a publisher relationship and to share with publishers a portion of pay-per-click revenues, Wojcicki said. With public-domain works and the excerpts, no ads will be displayed.
In a preview page about the library results, Google also is displaying links for buying a book at an online bookseller or for borrowing it from a local library.



 
 
 
 
Matthew Hicks
As an online reporter for eWEEK.com, Matt Hicks covers the fast-changing developments in Internet technologies. His coverage includes the growing field of Web conferencing software and services. With eight years as a business and technology journalist, Matt has gained insight into the market strategies of IT vendors as well as the needs of enterprise IT managers. He joined Ziff Davis in 1999 as a staff writer for the former Strategies section of eWEEK, where he wrote in-depth features about corporate strategies for e-business and enterprise software. In 2002, he moved to the News department at the magazine as a senior writer specializing in coverage of database software and enterprise networking. Later that year Matt started a yearlong fellowship in Washington, DC, after being awarded an American Political Science Association Congressional Fellowship for Journalists. As a fellow, he spent nine months working on policy issues, including technology policy, for a Member of the U.S. House of Representatives. He rejoined Ziff Davis in August 2003 as a reporter dedicated to online coverage for eWEEK.com. Along with Web conferencing, he follows search engines, Web browsers, speech technology and the Internet domain-naming system.
 
 
 
 
 
 
 
