Kognitio Cloud Delivers Affordable Analytics Platform on Amazon EC2

By Jeff Cogswell | Posted 2013-01-01


Kognitio has managed to create an affordable cloud-based analytical platform that customers can deploy within a preconfigured Amazon Elastic Compute Cloud (EC2) image that contains everything they need to get started without having to run post-installation shell scripts.

Kognitio Cloud is a specially built version of the Kognitio Analytical Platform that runs as a node within Amazon EC2 and is capable of handling the massive servers available in that environment. It is built from the ground up to scale both vertically, on a single server as RAM increases, and horizontally, across multiple server instances in a cloud.

In the next few years, we're going to see a change in the way businesses do their online data processing. So-called business intelligence, and online analytical processing (OLAP) in particular, is moving to cloud technology to take advantage of the scalability offered by cloud vendors such as Amazon Web Services.

But to scale properly, software must be designed to support growth. While existing software can sometimes be re-engineered to scale in a system such as Amazon EC2, the results are likely to be clunky and to require significant work on the part of the IT team trying to implement the solution.

I know because I tried to get an SAP installation up and running on a cloud server and it was a disaster. So you can imagine my delight when I installed Kognitio on an Amazon EC2 server and it just worked. I didn't have to do anything. Better still, it worked perfectly in conjunction with the Amazon EC2 environment.

Here's why it worked so well: For starters, unlike traditional OLAP tools, Kognitio operates in-memory. In the past, when servers were limited to maybe a gigabyte or two of RAM, this would have been a ridiculous idea. But today, computers with 8GB of RAM are commonplace. (The notebook on which I'm typing this review has 8GB of RAM.)

And with cloud vendors, where you can allocate servers on the fly, you can build even bigger ones. Even 16GB of RAM isn't a problem.

Amazon lets you go as high as 68GB. (I'm not sure why 68, as opposed to the 2-power-friendly 64GB.) So now, suddenly, in-memory OLAP is a reality, because there's more than enough memory to hold most data sets and to process them as well.

Further, Kognitio is written mostly in C, with some parts even in assembler. It makes use of advanced x86 CPU features such as vectorization and parallelization. In one sense, coding it in C and assembler might seem like a step backward, as today we have some cool, powerful languages.

But the fact is the cool, powerful languages don't offer nearly the vectorized performance and parallelization across cores you can get out of C and assembler using the native Intel instructions. (However, that is changing. High-level languages are starting to implement multi-core parallel processing and vectorization.) Finally, Kognitio has native support for Amazon EC2's scaling.

Now we've reached the part that makes me the happiest: They're directly supporting the modern, scalable approach to writing software. In addition to scaling vertically on a large server by increasing the amount of RAM, Kognitio supports distributing multiple Kognitio nodes horizontally across multiple instances on Amazon EC2. And to complete the package, the developers at Kognitio went to great lengths to build an Amazon EC2 image that contains everything you need, already configured for you. No post-installation shell scripts to run. It's already there.

If you're not familiar with Amazon EC2 images, here's a quick summary. An image is essentially a snapshot of a running operating system and hard drive. Once you have the image, you can use it to create additional servers identical to the first one. Software that was already installed will then be available on the new servers, automatically.

 



That's what Kognitio has done with its analytics platform. The company created a server on Amazon, installed the software, got it running, tuned it and then took a snapshot of it. So when you install the image, you get an exact copy of the system the company built. But the cool thing is that when you install the image, you still get to choose how big a server you want in terms of RAM and other physical features. You're not stuck with the same configuration the image was built with.
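To make the image idea concrete, here is a minimal sketch of launching servers from an image with today's AWS SDK for Python (boto3). It is an illustration under stated assumptions, not Kognitio's documented procedure: the image ID "ami-00000000" is a placeholder, the instance type is only an example of choosing a size at launch time, and the helper names launch_request and launch are hypothetical.

```python
def launch_request(image_id, instance_type, count=1):
    """Build the parameters for an EC2 launch: which image, what size, how many."""
    return {
        "ImageId": image_id,            # the snapshot to copy
        "InstanceType": instance_type,  # server size is chosen at launch time
        "MinCount": count,
        "MaxCount": count,
    }


def launch(image_id, instance_type, count=1):
    """Start `count` servers from the image. Needs boto3 and AWS credentials."""
    import boto3  # third-party AWS SDK, imported here so the sketch loads without it

    ec2 = boto3.client("ec2")
    return ec2.run_instances(**launch_request(image_id, instance_type, count))


# Placeholder values: substitute a real image ID and an instance type you can use.
params = launch_request("ami-00000000", "m2.2xlarge", count=3)
print(params)
```

The same image ID can be paired with a different instance type on each launch, which is the point made above: the image fixes the software, not the hardware.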

The idea, then, is that as you need additional horsepower, you can simply allocate more servers on Amazon EC2 without the usual capital expenditure of purchasing the hardware yourself. When you're done with them, you can shut them down or even delete them.

Because of the hourly pricing strategy, you can allocate a large number of servers for a small amount of money. Some medium-sized servers run for about a dollar per hour. Some of the more expensive ones are almost $4 an hour. So if you run, say, three of the higher-powered servers, you're talking $12 per hour. Over a year, that can be costly, but if you only need them for a day or two, it won't bust your budget.
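The arithmetic is easy to check. The short sketch below uses the rounded figures from the paragraph above (three servers at roughly $4 per hour each) and assumes the servers run around the clock; the function name ec2_cost is just for illustration, and actual Amazon pricing varies by instance type and region.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365


def ec2_cost(hourly_rate_usd, servers, hours):
    """Total hourly-billed cost in US dollars for `servers` machines over `hours`."""
    return hourly_rate_usd * servers * hours


# Three higher-powered servers at about $4 per hour each.
two_days = ec2_cost(4.00, 3, 2 * HOURS_PER_DAY)
full_year = ec2_cost(4.00, 3, DAYS_PER_YEAR * HOURS_PER_DAY)

print(f"Two days:  ${two_days:,.2f}")   # Two days:  $576.00
print(f"Full year: ${full_year:,.2f}")  # Full year: $105,120.00
```

A two-day run costs a few hundred dollars, while leaving the same three servers on all year passes $100,000.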

For my tests, I allocated a server on Amazon EC2 with (no kidding) 32GB of RAM. The processor in this case is a quad-core. Kognitio offers two images: one pre-populated with demo data and one empty. I chose the pre-populated one so that I wouldn't have to generate my own data.

After creating the server from the image, I simply started it up. There was no further work needed on that end. The next step was to install the Windows-based client tools on my own computer. The client computer is where I run my queries. The queries are sent via Open Database Connectivity (ODBC) to the server; the server then does the actual heavy-duty processing.

I mentioned that Kognitio operates in-memory. But it saves the data, of course, so that when you shut down the system, your data will still be there when you reboot. Then it quickly reads the data back into memory. Kognitio can read in 650 million rows of data per second per server.
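To put that load rate in perspective, the sketch below estimates how long it would take to read a table back into memory at the quoted 650 million rows per second per server. The table sizes are hypothetical, and the assumption that throughput adds up linearly across servers is mine, not a figure from Kognitio.

```python
ROWS_PER_SECOND_PER_SERVER = 650_000_000  # rate quoted in the text


def load_seconds(total_rows, servers=1):
    """Estimated seconds to reload `total_rows`, assuming linear scaling across servers."""
    if servers < 1:
        raise ValueError("servers must be at least 1")
    return total_rows / (ROWS_PER_SECOND_PER_SERVER * servers)


# Hypothetical table sizes chosen for illustration.
one_server = load_seconds(6_500_000_000)                # 6.5 billion rows, one server
four_servers = load_seconds(26_000_000_000, servers=4)  # 26 billion rows, four servers

print(one_server)    # 10.0
print(four_servers)  # 10.0
```

Under those assumptions, even a multi-billion-row table is back in memory within seconds of a restart.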

The client tools include the ODBC driver that Windows needs to connect to the Kognitio server running on Amazon EC2. They also include a console program and command-line tools. There are both 32-bit and 64-bit versions of the tools.

The console is a full-featured administration tool for managing the database, including an SQL panel for running manual SQL queries and tools for managing users and security. Since the data on the server is in-memory, there are also tools for managing the RAM usage and even for printing reports. There's also a full scripting language for writing scripts that execute SQL statements. The scripting language includes variables, loops, if statements and statements for including other script files.

The console proved very easy to use as it works much like similar tools (for example, Microsoft's SQL Management Studio or the popular Toad for Oracle). I was able to easily execute queries, the results of which came back nearly instantly.

Note that Kognitio says it is going to have another demo version ready on Amazon that can spin up 24 nodes and handle 1.5 terabytes of data.
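A quick back-of-the-envelope check shows what that planned demo implies per node. The sketch assumes the data is spread evenly across the 24 nodes, which the announcement does not state, and it works the number out for both decimal (1,000GB) and binary (1,024GB) terabytes; the function name is illustrative.

```python
def data_per_node_gb(total_tb, nodes, gb_per_tb=1000):
    """Gigabytes each node would hold if the data were split evenly."""
    return total_tb * gb_per_tb / nodes


decimal_share = data_per_node_gb(1.5, 24)                 # 1TB counted as 1,000GB
binary_share = data_per_node_gb(1.5, 24, gb_per_tb=1024)  # 1TB counted as 1,024GB

print(decimal_share)  # 62.5
print(binary_share)   # 64.0
```

Either way, each node would hold a little over 60GB, which is close to the 68GB ceiling Amazon offers on a single server.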

Kognitio might not be the biggest player in the field next to such giants as Oracle, but rest assured, as word spreads about the company, it's going to become one of the biggest players. Finally, somebody is doing it right.
