Cloudera Impala 1.0 Brings SQL to Hadoop for Real-Time Queries

By Darryl K. Taft  |  Posted 2013-05-02

Erickson noted that Cloudera Impala has been widely embraced by Cloudera's partner ecosystem, with numerous companies certifying their solutions for integration with the platform, including Alteryx, Capgemini, IBM Cognos, Karmasphere, MicroStrategy, Pentaho, QlikView, SAP, Splunk and Tableau.

"Our successful collaboration with Cloudera empowers organizations to unlock valuable business insights hidden in large, complex data sets in compelling new ways," said Paul Zolfaghari, president at MicroStrategy, in a statement.

"We are very excited about Cloudera's continuing innovation in the SQL-on-Hadoop market. In our independent testing of Cloudera Impala, we experienced a massive performance increase in the accessibility of data stored in Hadoop," said Zolfaghari. "Through our platform integration with Impala, customers can now perform sophisticated point and click analytics on data stored in Hadoop directly from MicroStrategy applications."

Another partner, Tableau Software, has noticed great improvements in query performance when using Impala, making Hadoop "more valuable to our customers as they adopt it broadly and give more people interactive access," said Dan Jewett, vice president of product management at Tableau, in a statement.

Impala 1.0 offers significant performance improvements over MapReduce/Hive for a wide range of business intelligence (BI) and analytic queries, making BI over Hadoop feasible, Erickson said.

Kornacker and Erickson's post mentioned the following Impala 1.0 features:

  • Support for a subset of ANSI-92 SQL (compatible with Hive SQL), including CREATE, ALTER, SELECT, INSERT, JOIN and subqueries

  • Support for partitioned joins, fully distributed aggregations and fully distributed top-n queries

  • Support for a variety of data formats: Hadoop native (Apache Avro, SequenceFile, RCFile with Snappy, GZIP, BZIP or uncompressed); text (uncompressed or LZO-compressed); and Parquet (Snappy or uncompressed), the new state-of-the-art columnar storage format

  • Support for all CDH4 64-bit packages: RHEL 6.2/5.7, Ubuntu, Debian, SLES

  • Connectivity via JDBC, ODBC, Hue GUI or command-line shell

  • Kerberos authentication and MR/Impala resource isolation
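
To give a sense of the query shapes the feature list describes (CREATE, INSERT, SELECT with a JOIN, an aggregation, a top-n and a subquery), here is a minimal sketch. It is an illustration only: it runs against Python's built-in SQLite engine as a stand-in rather than against Impala, and the table names, column names and data are all hypothetical. Impala's actual dialect and data types may differ in detail.

```python
import sqlite3

# Stand-in engine: in-memory SQLite, NOT Impala. All names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE and INSERT
cur.execute("CREATE TABLE customers (id INTEGER, region TEXT)")
cur.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
cur.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "east"), (2, "west"), (3, "east"), (4, "north")],
)
cur.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (1, 15.0), (2, 40.0), (3, 5.0), (4, 20.0), (4, 1.0)],
)

# JOIN + aggregation + top-n: the two regions with the highest order totals.
TOP_N_QUERY = """
SELECT c.region, SUM(o.amount) AS total
FROM orders o
JOIN customers c ON o.customer_id = c.id
GROUP BY c.region
ORDER BY total DESC
LIMIT 2
"""
top_regions = cur.execute(TOP_N_QUERY).fetchall()

# Subquery in the FROM clause: count customers whose total spend exceeds 20.
SUBQUERY = """
SELECT COUNT(*)
FROM (
    SELECT customer_id, SUM(amount) AS spent
    FROM orders
    GROUP BY customer_id
) t
WHERE t.spent > 20
"""
big_spenders = cur.execute(SUBQUERY).fetchone()[0]

print(top_regions)
print(big_spenders)
conn.close()
```

In an Impala deployment, statements of this kind would be submitted through one of the connectivity options listed above (JDBC, ODBC, Hue or the command-line shell) rather than through SQLite.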

Cloudera Impala is an Apache-licensed open-source project. The platform is open to community contributions, and the source code is available for free download on GitHub. For more information, or to join Cloudera and other open-source contributors in the development of the Impala platform, visit:


