    How to Improve Enterprise Search

    Written by

    eweekdev
    Published January 10, 2008


      Many companies believe they can implement an enterprise search platform without putting much work into it. But according to Yves Schabes, president of Teragram, an effective enterprise search system requires an initial investment of time, which pays dividends once your content management system is integrated and automated within your enterprise search system.

      As consumers, all we see when surfing the Web is the final layer of search, so we assume that enterprise search should just be that easy.

      However, the effectiveness of Internet search ranking relies heavily on the availability of naturally occurring metadata, which is generated through Internet hyperlinking. Each time someone links text to a Web page, Internet search engines interpret the linked text as metadata about that page, which affects the page's ranking in Web search results.
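
      The idea of link text acting as metadata can be illustrated with a minimal sketch. It assumes link data is already available as (source, target, anchor text) tuples; the function name and the example URLs are hypothetical, and real search engines weigh many more signals than this.

```python
from collections import defaultdict


def collect_anchor_text(links):
    """Group link text by the page it points to.

    `links` is an iterable of (source_url, target_url, anchor_text) tuples.
    The anchor text written by other authors becomes metadata for the target.
    """
    metadata = defaultdict(list)
    for _source, target, text in links:
        metadata[target].append(text.lower())
    return dict(metadata)


# Hypothetical link data, for illustration only.
links = [
    ("http://a.example/", "http://c.example/", "Enterprise Search Guide"),
    ("http://b.example/", "http://c.example/", "search tips"),
    ("http://a.example/", "http://d.example/", "pricing"),
]
anchor_metadata = collect_anchor_text(links)
```

      A document collection with no hyperlinks would leave this mapping empty, which is the gap that automatic metadata generation is meant to fill.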

      Unlike on the Internet, there are no textual links between documents in an enterprise, and no implicitly created metadata that a search engine can use.

      The success of an enterprise search deployment relies heavily on the automatic creation of metadata. To achieve accurate page ranking in enterprise search, three things must occur: the assignment of metatags to content, the creation of taxonomies, and occasional checks and balances from the IT professional or information architect.

      1) Install a system to automate the creation of metadata for existing content, and for new content as it's added to the server. Once a set of metadata tags has been defined, newly created documents go through an automatic metadata creation step. A metadata generation server program can be accessed through various programming interfaces, such as a Java API or SOAP APIs.

      Several kinds of metadata can be generated automatically through the use of metadata management tools (e.g., people, locations, dates, product types and relationships between entities). The metadata generation server API pulls documents from the document source (disk share, document management system or content management system) and produces the metadata automatically for each document. The document metadata can be stored in a metadata repository (a database associating a document identifier with the metadata), alongside the document in a document management system, or within a structured document (i.e.,
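
      As a rough sketch of step one, the following shows documents being pulled from a source, tagged automatically and stored in a repository keyed by document identifier. Everything here is an assumption for illustration: the vocabularies, the year-only date pattern and the function names are hypothetical, and a production system would call a dedicated metadata generation server rather than simple string matching.

```python
import re

# Hypothetical controlled vocabularies; a real tool would detect entities itself.
KNOWN_PEOPLE = {"Yves Schabes", "Emmanuel Roche"}
KNOWN_LOCATIONS = {"Cambridge", "Philadelphia"}
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")  # matches four-digit years only


def generate_metadata(text):
    """Produce a metadata record for one document's text."""
    return {
        "people": sorted(p for p in KNOWN_PEOPLE if p in text),
        "locations": sorted(loc for loc in KNOWN_LOCATIONS if loc in text),
        "dates": sorted(set(YEAR_PATTERN.findall(text))),
    }


def build_repository(documents):
    """Associate each document identifier with its generated metadata."""
    return {doc_id: generate_metadata(text) for doc_id, text in documents.items()}


# Stand-in for a disk share or content management system.
documents = {
    "doc-1": "Yves Schabes co-founded Teragram in Cambridge in 1997.",
    "doc-2": "The 2008 report covers storage.",
}
repository = build_repository(documents)
```

      Keeping the repository keyed by document identifier means new content can be tagged as it arrives without reprocessing documents that are already stored.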

      2) Categorize information into logical groups based on folksonomies, taxonomies and ontologies. Automatic categorization software can help with this process by “reading” the metatags assigned in step one and grouping words, phrases, entities and events into proper bins, depending on relationships.

      The software can then extract keywords, entities and concepts automatically from documents, and instantaneously associate topics with each document. The ability to cross-reference documents in enterprise search is invaluable, especially when attempting to find related documents. Index these topics and categories with associated documents to create an efficient data structure that allows for fast retrieval. By giving users an alternative, faceted search (taxonomy) interface to supplement the standard keyword search they’re used to, you’re more apt to achieve greater recall.
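
      The indexing and faceted search described above might look something like this in miniature. It assumes topics have already been assigned to each document; the function names and sample data are hypothetical.

```python
from collections import defaultdict


def build_facet_index(doc_topics):
    """Invert a document -> topics mapping into topic -> set of documents."""
    index = defaultdict(set)
    for doc_id, topics in doc_topics.items():
        for topic in topics:
            index[topic].add(doc_id)
    return index


def faceted_search(index, *topics):
    """Return the sorted documents tagged with every requested topic."""
    if not topics:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in topics))
    return sorted(matches)


# Hypothetical topic assignments produced by automatic categorization.
doc_topics = {
    "doc-1": ["drugs", "symptoms"],
    "doc-2": ["drugs", "earnings"],
    "doc-3": ["earnings"],
}
index = build_facet_index(doc_topics)
drug_docs = faceted_search(index, "drugs")
drug_earnings_docs = faceted_search(index, "drugs", "earnings")
```

      Because each topic maps directly to a set of documents, narrowing by an additional facet is a set intersection rather than a new scan of the content.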

      3) Define in advance what you want to understand from the documents, and check that the automated systems line up with these goals. For example, a pharmaceutical company needs to define types of drugs, symptoms, etc., while a financial services company needs to define quarterly earnings statements, market capitalization, stock tickers, etc.

      As terms within an organization evolve, and new terms enter a company’s vernacular, automatic metadata generation systems and taxonomies may need to be re-evaluated. Terms may need to be added, and new cross-references assigned. Perform a systematic human check of your automated search and content management tools at least once per quarter.
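
      One way to support that periodic human check is to flag words that appear repeatedly in recent documents but are missing from the taxonomy. The sketch below is only an aid for the reviewer, not a replacement for one; the function name, the threshold and the sample data are assumptions, and a real check would also filter out common stop words.

```python
import re
from collections import Counter


def find_candidate_terms(documents, taxonomy_terms, min_count=2):
    """List words seen at least `min_count` times that the taxonomy lacks."""
    known = {term.lower() for term in taxonomy_terms}
    counts = Counter()
    for text in documents:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return sorted(
        word for word, n in counts.items() if n >= min_count and word not in known
    )


# Hypothetical recent documents and current taxonomy.
recent_documents = [
    "Biosimilar approvals rose.",
    "New biosimilar drug trials.",
    "Drug symptoms reported.",
]
taxonomy = {"drug", "symptoms"}
candidates = find_candidate_terms(recent_documents, taxonomy)
```

      The reviewer then decides which flagged terms deserve a place in the taxonomy and which cross-references they need.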

      Steps one and two are the most time-consuming at the startup of the search project, but will not likely have to be revisited unless the system is achieving poor recall. Automatic categorization, metadata generation and taxonomy management will allow you to build a semantic search system in your organization, where relevant documents replace nebulous keywords.

      Dr. Yves Schabes co-founded multilingual natural language technology company Teragram Corporation with Dr. Emmanuel Roche in 1997. Dr. Schabes has spent the past fifteen years working on issues relating to natural language processing and computer science. He is the author, or editor, of more than fifty international scientific publications, including co-editor, with Emmanuel Roche, of Finite-State Language Processing (1997, MIT Press, Cambridge MA).

      Dr. Schabes also is an Associate to the Division of Applied Science, Harvard University, Cambridge, MA. Prior to founding Teragram, Dr. Schabes was a Senior Scientist at Mitsubishi Electric Research Laboratories in Cambridge, MA. He received a Ph.D. in Computer Science from the University of Pennsylvania, Philadelphia, PA, in 1990 and a Master of Science in Electrical Engineering from l’École Supérieure d’Électricité (France) in 1985.
