Publishers Dangle New Content Controls for Search

Publishers create a new content access control framework to help search engines and Web sites meet copyright rules.

Publishers are attempting to bring copyright control back into their own hands, with the launch Nov. 29 of a framework that will give them more power to dictate what content search engines such as Google, Yahoo and MSN may make available online.

The ACAP (Automated Content Access Protocol) framework, launched at The Associated Press headquarters in New York after a 12-month pilot program, is designed to help Web sites comply with publishers' content use policies.

ACAP, whose supporters include AP, Reuters and the Association of American Publishers, said the first iteration of the framework will allow any publisher to let search engine robots or spiders know that certain pages, directories or sites must not be indexed.

ACAP is an attempt to improve upon robots.txt, a voluntary measure for search engines that publishers claim is not sophisticated enough for today's content and publishing models because it provides only a choice between allowing or disallowing the indexing of content.
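That all-or-nothing model is easy to see with Python's standard-library robots.txt parser. The file below is a hypothetical example: a path is either open to a crawler or closed to it, with no finer-grained conditions on how the content may be used once fetched.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt illustrating the binary choice publishers
# object to: each path is either crawlable or it is not.
robots_txt = """\
User-agent: *
Disallow: /archive/
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# The answer is simply yes or no; there is no way to express terms
# such as "index the headline but not the photo".
print(parser.can_fetch("Googlebot", "https://example.com/news/today"))    # True
print(parser.can_fetch("Googlebot", "https://example.com/archive/2006"))  # False
```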

In the 13 years that robots.txt has been used, search engines have built in proprietary extensions, which are often not recognized by rival search engines. This means robots.txt does not enforce publishers' content rules for all searches, ACAP said on its Web site.

ACAP, which is also voluntary, comes amid high-profile disagreements between search engines and publishers; in particular, Google drew the ire of the AP and Agence France-Presse for posting news summaries, headlines and photos without permission. Google later settled a lawsuit with AFP and paid the AP to head off legal action.

Google faced similar heat from publishers two years ago over its Google Print Library project, which scanned book collections at several major universities and libraries and put the content online.


Ideally, ACAP, which will use the same robots.txt file that search engines now recognize, would head off such dogfights by providing a standard way for content owners to decide what gets accessed.
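In practice, that means ACAP directives sit alongside the standard rules in the same file, so crawlers that do not understand ACAP still see the familiar syntax. The fragment below is an illustrative sketch only; the directive names follow the ACAP v1 style, but the published specification should be consulted for exact syntax.

```text
# Standard robots.txt rules, which ACAP-unaware crawlers still obey
User-agent: *
Disallow: /premium/

# Illustrative ACAP v1 extensions (a sketch of the ACAP directive
# style, not authoritative syntax)
ACAP-crawler: *
ACAP-disallow-crawl: /premium/
ACAP-disallow-index: /archive/
```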

While Paris-based search engine Exalead has embraced the voluntary ACAP, major search engines are not yet on board. Microsoft did not respond to a request for comment on ACAP, but Yahoo and Google did.


"While Yahoo respects the efforts of ACAP, we have not thoroughly evaluated the initiative and are not members or committed to it," a Yahoo spokesperson told eWEEK.

"As a part of this effort, we have had informal discussions with the ACAP group," a Google spokesperson told eWEEK, but the spokesperson would not say whether Google has tested the protocol.

Search expert Danny Sullivan said on Search Engine Land that the lack of support by the top search providers is one reason why site owners needn't rush to adopt ACAP.

"If publishers were to use ACAP without ensuring that standard robots.txt or meta robots commands were also included, they would fail to properly block search engines," Sullivan wrote.
