Texis Categorizer 4.1

By Jim Rapoza  |  Posted 2002-07-15

Traditionally, categorization applications have come from search engine vendors such as Thunderstone, and these applications still make up the largest share of the categorization market.

Texis Categorizer is an excellent example of this type of categorization application. It offers a great deal of flexibility in implementation and integrates easily with other systems, especially Web-based applications.

Texis Categorizer runs on almost anything, from Windows servers to most flavors of Unix. We ran our test system on a Linux box.

The main components underlying Texis Categorizer are the Texis SQL database and the Vortex scripting engine, which uses standard CGI (Common Gateway Interface) scripting. Almost any Web developer will be able to jump into this system very quickly.

Once the initial scripts are set up, a step that includes defining the category taxonomy, much of the remaining work is done in an easy-to-use, browser-based interface.

For each category in the taxonomy, Thunderstone recommends using about 20 training sets. For example, in the eWeek Labs taxonomy, we would use 20 reviews of storage products to train the application on how to categorize content on storage.

However, even if the training proves incomplete, Texis Categorizer makes it easy to fine-tune categories. During tests, for example, we could load uncategorized content into the interface and adjust categories as needed. With each new piece of content, we could see the accuracy of the categorization improve (see screen).

In addition, if we needed to change categorization information, we could uncategorize content that had already been processed by the system, then re-enter it.
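
To make the cycle of training, correcting and re-entering content concrete, here is a minimal Python sketch. It is not Thunderstone's algorithm and does not use the Texis or Vortex APIs; the `ToyCategorizer` class, its `train`, `untrain` and `categorize` methods, and the sample text are all hypothetical, and simple word counting stands in for whatever scoring the product actually performs.

```python
from collections import Counter, defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class ToyCategorizer:
    """Hypothetical word-count categorizer; an assumption for illustration only."""

    def __init__(self):
        # category name -> Counter of word frequencies seen in training
        self.word_counts = defaultdict(Counter)

    def train(self, category, text):
        """Add one labeled document to a category's word counts."""
        self.word_counts[category].update(tokenize(text))

    def untrain(self, category, text):
        """Remove a previously trained document from a category."""
        self.word_counts[category].subtract(tokenize(text))
        # Unary plus keeps only the words whose count is still positive.
        self.word_counts[category] = +self.word_counts[category]

    def categorize(self, text):
        """Return the best-scoring category, or None if no known word matches."""
        tokens = tokenize(text)
        scores = {}
        for category, counts in self.word_counts.items():
            total = sum(counts.values())
            if total == 0:
                continue
            # Share of the category's training words matched by this text.
            scores[category] = sum(counts[t] for t in tokens) / total
        if not scores or max(scores.values()) == 0:
            return None
        return max(scores, key=scores.get)


categorizer = ToyCategorizer()
categorizer.train("storage", "disk array backup tape storage capacity")
categorizer.train("security", "firewall intrusion virus encryption password")
print(categorizer.categorize("new tape backup library"))
```

In this sketch, calling `untrain` on a document and then `train` with a different category plays the role of uncategorizing content and re-entering it. A real deployment would rely on far more training content per category than the single sample lines used here.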

The price of Texis Categorizer is well below that of many competing products: $10,000 for the Texis engine and $10,000 for Categorizer.


Jim Rapoza, Chief Technology Analyst, eWEEK. For nearly fifteen years, Jim Rapoza has evaluated products and technologies in almost every technology category for eWEEK. Mr. Rapoza's current technology focus is on all categories of emerging information technology, though he continues to focus on core technology areas that include content management systems, portal applications, Web publishing tools and security. Mr. Rapoza has coordinated several evaluations at enterprise organizations, including USA Today and The Prudential, to measure the capability of products and services under real-world conditions and against real-world criteria. Jim Rapoza's award-winning weekly column, Tech Directions, delves into all areas of technologies and the challenges of managing and deploying technology today.
