WASHINGTON - By all accounts, the E-Government Act passed by Congress five years ago is a success. Last year, the government's USA.gov site received almost 100 million visitors and was hailed by Time magazine as one of the "25 Sites We Can't Live Without."
Nevertheless, approximately 2,000 government sites harbor public information unavailable through search engines like Google, Yahoo and MSN.
"We have found that many government agencies structure their Web sites in ways that prevent search engines from including their information in search results, often inadvertently," John Needham, Google's manager of public sector content partnerships, told the Senate Committee on Homeland Security and Governmental Affairs Dec. 11.
Noting that Google's research shows that as many as four out of every five Internet users reach federal government sites through commercial search engines, Needham added, "If the information on a particular public Web site is not part of the index underlying a search engine, citizens are bound to miss out on information or services that the agency offers."
Needham said the most common inadvertent structural barrier on many government sites involves dynamic databases behind search forms that require users to fill in several fields of information to conduct a search.
"Our crawlers cannot effectively follow the links to reach behind the search form," Needham said. Other barriers, he said, include robots.txt files and outdated and inaccessible links.
Needham urged government sites to implement the Sitemap protocol, which provides a mechanism for site operators to produce a list of all pages on a site and systematically communicate the list to search engines. Last year, Google, Microsoft and Yahoo announced their joint support for the standard.
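For readers unfamiliar with the protocol, a Sitemap is simply an XML file that lists a site's URLs. The Python sketch below is illustrative only and is not drawn from the testimony; the function name build_sitemap and the agency.example.gov addresses are hypothetical. It shows how a site operator might list database-backed pages that a crawler could not otherwise reach through a search form.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the public Sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal Sitemap XML document listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in urls:
        url_element = ET.SubElement(urlset, "url")
        # <loc> is the only required child of each <url> entry.
        ET.SubElement(url_element, "loc").text = page_url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical record pages that normally sit behind a search form.
pages = [
    "https://agency.example.gov/records/1001",
    "https://agency.example.gov/records/1002",
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

An operator would typically publish the resulting file on the site and point search engines to it; the details of submission vary by search engine.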
"What this means is that, in implementing Sitemaps, a government agency can be sure its better serving the American people, no matter which search engine individual citizens are using," Needham said.
The hearing came one day after two public watchdog groups, the Center for Democracy and Technology and OMB Watch, issued a report that said vital government information appears "invisible" to the millions of Americans using search engines to find government data and information.
Ari Schwartz, the CDT's deputy director, told lawmakers attending the hearing that commercial search engines are the most effective and efficient way to find information online.
"Government agencies must recognize that taxpayers will not find the information that is made available unless the information can be found on commercial search engines," Schwartz said. "Two easy ways to ensure that government information is indexed are to adopt the Sitemaps protocol and to limit the use of robots.txt files."
Schwartz also said that even the government's own search engine is hamstrung by the lack of the Sitemaps protocol, since the government uses Microsoft Live Search. "Therefore, it is subject to exactly the same inability to search these important sites as other commercial search engines."
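To illustrate how a robots.txt file can hide public pages, the short Python sketch below uses the standard library's urllib.robotparser module to test whether a crawler may fetch a given address. The rules, the is_crawlable helper, the ExampleBot user agent and the agency.example.gov URLs are all hypothetical and are not taken from any agency's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks every crawler from a reports directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /reports/
"""


def is_crawlable(robots_txt, url, user_agent="ExampleBot"):
    """Return True if the robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for url in (
    "https://agency.example.gov/reports/2007.html",
    "https://agency.example.gov/about.html",
):
    status = "allowed" if is_crawlable(ROBOTS_TXT, url) else "blocked"
    print(url, status)
```

A single overly broad Disallow line is enough to keep a whole directory of public documents out of a search index, which is why the witnesses urged agencies to use such rules sparingly.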
Sen. Joe Lieberman, I-Conn., the chairman of the panel and one of the original authors of the E-Gov Act, said that while the hearing was focused on the executive branch's efforts to implement the initiative, Congress also has a responsibility to share its information more openly with citizens.
To that end, Lieberman joined fellow senators Susan Collins, R-Maine, and John McCain, R-Ariz., to introduce legislation Dec. 10 that would require the Congressional Research Service to make its reports public. Currently, the reports are not easily available, which has led to a small cottage industry that sells them.
"Our bill would allow members and committees to easily post all CRS reports on their sites to help keep all their constituents informed," Lieberman said.