By Jim Rapoza  |  Posted 2004-01-26

Clustering Engine runs on Linux and Windows servers, and we found that installation on both was simple.

Clustering Engine now features a Web-based administration tool, and it includes helpful tutorials as well as hundreds of predefined search sources.

However, while the Web-based administration tool was appreciated, it wasn't user-friendly. In many ways, it felt like a browser-based substitute for manually editing XML files. (In fact, users can choose to do exactly that instead of using the Web-based interface.) Still, it did centralize the management of the application and helped with certain tasks, such as testing content sources that we created.

In general, administration of the application itself could be extremely complex and time-consuming. However, this is because pretty much every aspect of the product is exposed and can be customized in almost any way possible. Once mastered, Clustering Engine can be managed in a way that best suits a company and its customers.

The Web-based administration tool does provide access to the tutorials and documentation, both of which are a must for connecting Clustering Engine to your content sources and for integrating it with your applications.

We ran into trouble when we tried to quickly run through the documentation and add content. When we went through everything step by step, however, we were able to connect to any kind of search form and retrieve results from multiple sources.

Creating a source, whether from our own search engines or from external search sites that we wished to leverage, ranged in difficulty from very simple to highly complex.

On the simple side of the scale, if our search site used a standard HTTP GET request, we could simply copy the URL from our browser window to get most of the parameters. On the complex side, where a search engine used customized, under-the-covers scripting, getting all the right information meant digging through the code of the search application. In those cases, the many sample sources included with Clustering Engine proved invaluable.
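For readers who want to see why a GET-based search form is the easy case, here is a minimal Python sketch of pulling the endpoint and parameters out of a URL copied from the browser. The URL, the field name `q` and the helper names are hypothetical, and this is not how Clustering Engine itself defines a source; it only illustrates that everything needed is visible in the address bar.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def describe_get_search(url):
    """Split a copied search URL into its endpoint and query parameters."""
    parts = urlsplit(url)
    # Rebuild the URL without the query string or fragment.
    endpoint = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    # parse_qsl decodes "+" and percent-escapes; keep empty fields too.
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    return endpoint, params


def build_query_url(endpoint, params, query_field, terms):
    """Reuse the copied parameters, substituting new search terms."""
    updated = dict(params)
    updated[query_field] = terms
    return endpoint + "?" + urlencode(updated)


# Hypothetical URL, as it might be copied after running a search.
copied = "http://search.example.com/find?q=linux+clustering&num=20&lang=en"
endpoint, params = describe_get_search(copied)
new_url = build_query_url(endpoint, params, "q", "xml search")
```

A form that posts its data, or one that assembles the request in script, exposes none of this in the URL, which is why those sources took more digging.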

An extremely useful feature when setting up search sources is the ability to create knowledge bases. The knowledge bases essentially made it possible to tweak categorization to avoid common words or phrases that could skew results (for example, removing "eWEEK" as a relevant phrase on searches of eWEEK-only content). Using the knowledge base feature, we could also define things such as acronyms and synonyms to make sure all related content was properly categorized.
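The idea behind a knowledge base can be sketched in a few lines of Python. This is an illustration of the concept only, under our own assumptions: the data structures, the sample entries and the function name are hypothetical and do not reflect Vivisimo's actual knowledge base format.

```python
import re

# Hypothetical knowledge base entries, for illustration only.
STOP_PHRASES = {"eweek"}  # terms too common in this content to be useful
SYNONYMS = {              # acronym -> expanded form to group related content
    "cgi": "common gateway interface",
    "xml": "extensible markup language",
}


def normalize_terms(text, stop_phrases=STOP_PHRASES, synonyms=SYNONYMS):
    """Lowercase the text, drop stop phrases and expand known acronyms."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    kept = [t for t in tokens if t not in stop_phrases]
    return [synonyms.get(t, t) for t in kept]


terms = normalize_terms("eWEEK reviews XML search")
```

Here "eWEEK" is dropped before categorization and "XML" is mapped to its expanded form, so documents using either spelling would land under the same heading.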

Clustering Engine consists mainly of XML files and Common Gateway Interface scripts, so it can be integrated easily into almost any system or application. Vivisimo includes excellent API information for integrating Clustering Engine with a variety of programming languages and systems. Vivisimo also includes detailed information on the XML input and output of the product, which allows for some very advanced customizations.
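Because the product speaks XML, any language with an XML parser can consume its results. The Python sketch below shows the general pattern using the standard library. The element and attribute names (`results`, `cluster`, `document`, `label`, `title`) and the sample data are assumptions made up for this example; the real schema is described in Vivisimo's documentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical clustered output; NOT the product's real schema.
SAMPLE_OUTPUT = """<results>
  <cluster label="Linux servers">
    <document url="http://example.com/a" title="Installing on Linux"/>
    <document url="http://example.com/b" title="Linux tuning"/>
  </cluster>
  <cluster label="Windows servers">
    <document url="http://example.com/c" title="Installing on Windows"/>
  </cluster>
</results>"""


def clusters_from_xml(xml_text):
    """Map each cluster label to the titles of the documents it contains."""
    root = ET.fromstring(xml_text)
    return {
        cluster.get("label"): [doc.get("title") for doc in cluster.findall("document")]
        for cluster in root.findall("cluster")
    }


clusters = clusters_from_xml(SAMPLE_OUTPUT)
```

In a real integration the XML would come from a request to the product's CGI scripts rather than from a string in the program.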

Labs Director Jim Rapoza can be reached at jim_rapoza@ziffdavis.com.

Jim Rapoza, Chief Technology Analyst, eWEEK. For nearly fifteen years, Jim Rapoza has evaluated products and technologies in almost every technology category for eWEEK. Mr. Rapoza's current technology focus is on all categories of emerging information technology, though he continues to focus on core technology areas that include content management systems, portal applications, Web publishing tools and security. Mr. Rapoza has coordinated several evaluations at enterprise organizations, including USA Today and The Prudential, to measure the capability of products and services under real-world conditions and against real-world criteria. Jim Rapoza's award-winning weekly column, Tech Directions, delves into all areas of technologies and the challenges of managing and deploying technology today.
