The Web has improved in many ways: Web pages and applications do a much better job now of handling live data; rich applications and streaming media are now of much higher quality; and modern browsers such as Mozilla and Opera provide users with a significant amount of control over their browsing experience.
However, there is one thing about the Web that remains poor: site search capabilities.
The search capabilities on most company and content-oriented Web sites are as bad now as they were several years ago. In fact, eWEEK Labs was dismayed to find that we could have easily rerun an article we wrote back in June 1997 on how to improve site searches. Every problem we cited is still in evidence today, and every recommendation we made would still be well-taken. Sites should be using indexes, prebuilt searches and good metatags, for example, but surprisingly few do.
Part of the problem is that search technology itself has changed very little in the past few years. Most improvements in search engines during that time have focused on faster indexing and processing, rather than on changes in the underlying technology. In addition, the general perception that "search stinks" has most likely led to apathy when it comes to improving site searches.
As we said in the 1997 article, if visitors or customers can't find what they want on your site, they will often simply leave. And why take this risk when the steps needed to improve site searches are often very simple?
One of the most effective ways to improve searches is to provide indexes and links to the content you know people are looking for. And you do know what people are looking for: Pretty much any search engine will report what site visitors are searching for, and this data makes it relatively simple to provide links to commonly searched items directly from your search page.
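As a rough illustration of how that report data could be put to work, the sketch below tallies the most frequent queries so they can be turned into links on a search page. The log format (one raw query per line) and the names top_queries and sample_log are assumptions made for this example; real search engines export their reports in their own formats.

```python
from collections import Counter


def top_queries(log_lines, n=5):
    """Return the n most frequent queries as (query, count) pairs.

    Queries are lowercased and runs of whitespace are collapsed, so that
    "Return  Policy" and "return policy" are counted together.
    """
    counts = Counter()
    for line in log_lines:
        query = " ".join(line.lower().split())
        if query:  # skip blank lines
            counts[query] += 1
    return counts.most_common(n)


# Hypothetical search log: one visitor query per line.
sample_log = [
    "return policy",
    "Return  Policy",
    "store hours",
    "return policy",
    "",
    "store hours",
    "careers",
]

for query, hits in top_queries(sample_log, n=2):
    print(f"{query}: {hits} searches")
```

The queries at the top of this list are the natural candidates for prebuilt links or index entries.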
In addition, while auto-categorization tools such as Vivisimo Inc.'s Clustering Engine 4.0 can help with large, diverse collections of content, many sites deal with a relatively small and focused amount of content. For sites such as these, a good index can be built in a matter of days.
While search capabilities have not improved by leaps and bounds in the past few years, there have been some advances. One of the bigger developments is the increased use of XML data in content. XML provides structure to content, so it can make for the kind of effective search results that one typically expects from databases.
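To make the database comparison concrete, here is a minimal sketch of a search restricted to one field of XML-tagged content. The element names (articles, article, title, author, body) and the sample records are invented for illustration; a real site would use whatever schema its content system produces.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML-tagged content; the element names are assumptions.
CATALOG = """
<articles>
  <article>
    <title>Improving site search</title>
    <author>Lee</author>
    <body>Indexes and metatags help visitors find content.</body>
  </article>
  <article>
    <title>Streaming media review</title>
    <author>Kim</author>
    <body>This review never mentions search by Lee.</body>
  </article>
</articles>
"""


def search_field(xml_text, field, term):
    """Return titles of articles whose given field contains term."""
    root = ET.fromstring(xml_text)
    term = term.lower()
    hits = []
    for article in root.iter("article"):
        value = article.findtext(field, default="")
        if term in value.lower():
            hits.append(article.findtext("title", default=""))
    return hits


# A plain full-text search for "Lee" would match both articles.
# Searching only the author field returns the one Lee actually wrote.
print(search_field(CATALOG, "author", "lee"))
```

The point of the structure is precision: the query can ask who wrote an article rather than merely which pages contain a name.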
In addition, many enterprise products, such as portals and content management systems, add XML to content as part of the content creation process. This makes it a lot easier to leverage information from a wide array of sources during a search. At this time, however, few sites seem to be exploiting this possibility.
We've been waiting for a while now for one technology to greatly improve searches: RDF (Resource Description Framework). The key technology in the ongoing Semantic Web project, RDF makes it possible to add descriptive metadata to Web content. Doing so allows search tools to understand not only the words in content but also the context, meaning and relationships among the content.
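The idea behind RDF can be sketched as statements of the form subject, property, value. The example below is a simplification: it stores such statements as plain Python tuples rather than in any actual RDF syntax, and the example.com addresses and sample statements are hypothetical. The property names borrow from the Dublin Core (dc:) and FOAF (foaf:) vocabularies.

```python
# Each statement is (subject, property, value). This is a simplified,
# in-memory stand-in for RDF data, not RDF syntax; all URLs are hypothetical.
TRIPLES = [
    ("http://example.com/articles/site-search", "dc:title", "Improving site search"),
    ("http://example.com/articles/site-search", "dc:creator", "http://example.com/staff/lee"),
    ("http://example.com/articles/site-search", "dc:subject", "search"),
    ("http://example.com/articles/rss-intro", "dc:title", "Getting started with RSS"),
    ("http://example.com/articles/rss-intro", "dc:creator", "http://example.com/staff/lee"),
    ("http://example.com/articles/rss-intro", "dc:subject", "syndication"),
    ("http://example.com/staff/lee", "foaf:name", "Lee"),
]


def subjects_with(triples, predicate, obj):
    """Return every subject that has the given property set to the given value."""
    return [s for s, p, o in triples if p == predicate and o == obj]


def titles_by_author_name(triples, name):
    """Follow name -> person -> documents -> titles across the statements."""
    titles = []
    for author in subjects_with(triples, "foaf:name", name):
        for doc in subjects_with(triples, "dc:creator", author):
            titles.extend(o for s, p, o in triples if s == doc and p == "dc:title")
    return titles


print(titles_by_author_name(TRIPLES, "Lee"))
```

Because the metadata records relationships rather than just words, the query finds both articles even though neither title nor subject mentions the author's name.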
While RDF has great potential, its adoption has been slow. However, a technology originally based on RDF could also help users find information: RSS (RDF Site Summary).
RSS feeds are typically used by blogs and news sites to provide channels of information that can be syndicated or that users can subscribe to for regular updates. However, RSS could also be used on sites to create channels for commonly searched categories of content. Users could then subscribe to these channels, or open them occasionally, to see what has changed on a site.
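A channel for a commonly searched category might be generated along the following lines. This is a sketch under stated assumptions: it uses the simpler RSS 2.0 element layout rather than the RDF-based RSS 1.0 format, and the function name build_channel, the example.com addresses and the sample items are all hypothetical.

```python
import xml.etree.ElementTree as ET


def build_channel(title, link, description, items):
    """Build a minimal RSS 2.0 document as a string.

    items is a list of (item_title, item_link) pairs, newest first.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item_title, item_link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
    return ET.tostring(rss, encoding="unicode")


# Hypothetical channel for a frequently searched category.
feed = build_channel(
    title="Return policy updates",
    link="http://example.com/returns/",
    description="Changes to pages in the returns category",
    items=[
        ("Holiday return window extended", "http://example.com/returns/holiday"),
        ("How to print a return label", "http://example.com/returns/label"),
    ],
)
print(feed)
```

Regenerating such a feed whenever content in the category changes would give subscribers the updates without their having to run the same search again.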