How to Improve the Efficiency of Enterprise Search

 
 
By Yves Schabes  |  Posted 2009-07-14

There is an efficiency gap between enterprise and Internet search today. Enterprise users are accustomed to Googling a query and getting fast, accurate results, but at work these same users often struggle to find internal documents with the same speed and precision. For better search and retrieval, Knowledge Center contributor Yves Schabes explains why enterprises with large amounts of data should invest in an enterprise search solution that automatically tags and categorizes enterprise content.

Searching for a document in the workplace usually involves sifting through multiple pages of search results, which wastes time and money. And because enterprise users are searching for specific information, not just the most popular answer, they expect more precise search results than they would get from the Internet. Simply put, the techniques that work well on the Web are not as well suited to enterprise search.

A recent study noted that "49 percent of survey respondents agreed or strongly agreed that it is a difficult and time-consuming process to find the information they need to do their job." All search, whether on the Internet or in an enterprise, is powered by metadata, or data about data. Metadata is traditionally understood as recorded information that describes the attributes of data, such as names, sizes and lengths.

Finding information on the Internet is faster and more accurate than on an enterprise site because hyperlinks provide the Internet with naturally occurring, high-quality metadata. Each time someone links text to a Web page, Internet search engines interpret the linked text as metadata about that page, which in turn affects the page's ranking in Web search results.
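To make the idea of link text acting as metadata concrete, here is a minimal sketch in Python. It is not how any particular search engine works; the sample pages, the example.com URL and the names AnchorTextCollector and build_anchor_index are all hypothetical, chosen only to illustrate how the words inside a link can be collected as a description of the page the link points to.

```python
from collections import defaultdict
from html.parser import HTMLParser


class AnchorTextCollector(HTMLParser):
    """Collects (target URL, link text) pairs from one HTML page."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._href = None   # target of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.pairs.append((self._href, "".join(self._text).strip()))
            self._href = None


def build_anchor_index(pages):
    """Map each link target to the link texts that other pages use for it."""
    index = defaultdict(list)
    for html in pages:
        collector = AnchorTextCollector()
        collector.feed(html)
        collector.close()
        for href, text in collector.pairs:
            if text:
                index[href].append(text.lower())
    return dict(index)


# Hypothetical pages that both link to the same document.
pages = [
    '<p>See the <a href="http://example.com/policy">travel expense policy</a>.</p>',
    '<p>Read our <a href="http://example.com/policy">Expense Policy</a> first.</p>',
]
anchor_index = build_anchor_index(pages)
```

In this sketch, the target page ends up described by the phrases "travel expense policy" and "expense policy" even though neither phrase has to appear in the page itself. That is the kind of implicitly created metadata that unlinked office documents lack.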

However, searching on an enterprise site is a more difficult and laborious process because of the lack of metadata. Unlike on the Internet, there are no textual links between documents in an enterprise, and no implicitly created metadata that a search engine can use. Office documents are not naturally linked together, and there are too few corporate librarians to manually tag each office document with the appropriate metadata.

Also, an average enterprise has more content types, formats and security measures than the Internet, which makes enterprise search more time-consuming. The result is that workers spend too much time sorting through pages and pages of irrelevant results instead of doing their jobs.

The bottom line for enterprises with large amounts of data is that it's worth investing in the right search technology. The solution should automatically categorize enterprise content, thereby eliminating time-intensive manual categorization. In an organization with hundreds or even thousands of workers producing knowledge, it could take years to tag each employee's electronic documents by hand.
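As a rough illustration of what automatic categorization means, the sketch below tags a document by matching its words against a small keyword taxonomy. It is a deliberately simple stand-in and not the method used by any commercial product; the TAXONOMY contents, the categorize function and the min_hits parameter are all assumptions made for this example.

```python
import re

# Hypothetical taxonomy: category name -> keywords that signal it.
TAXONOMY = {
    "finance": {"invoice", "budget", "expense"},
    "legal": {"contract", "liability", "nda"},
    "hr": {"benefits", "payroll", "onboarding"},
}


def categorize(text, taxonomy=TAXONOMY, min_hits=1):
    """Return the sorted category tags whose keywords appear in the text."""
    # Lowercase the text and keep only alphabetic words.
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(
        category
        for category, keywords in taxonomy.items()
        if len(words & keywords) >= min_hits
    )


doc = "Please attach the signed contract to the Q3 budget invoice."
tags = categorize(doc)  # tags that could be stored as metadata for the document
```

Run over a document repository, a function like this would attach category tags to every file without anyone tagging by hand, and the search engine could then use those tags as metadata. Real systems would need far richer rules or statistical models than this keyword match.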




 
 
 
 
Dr. Yves Schabes is President of Teragram. Yves co-founded Teragram with Dr. Emmanuel Roche in 1997. Yves has spent the past fifteen years working on issues relating to natural language processing and computer science. He is the author or editor of more than fifty international scientific publications, and is co-editor, with Emmanuel Roche, of Finite-State Language Processing (1997, MIT Press, Cambridge MA). Yves is also an Associate to the Division of Applied Science at Harvard University. Prior to founding Teragram, Yves was a Senior Scientist at Mitsubishi Electric Research Laboratories in Cambridge, MA. He also held a position as a Research Associate at the University of Pennsylvania. Yves has been a program committee member of many international scientific conferences and journals. He received a Ph.D. in Computer Science from the University of Pennsylvania in 1990 and an M.S. in Electrical Engineering from l'Ecole Supérieure D'Electricité (France) in 1985. He can be reached at http://www.teragram.com/cgi-bin/contactys.pl.
 
 
 
 
 
 
 
