Cloudera Impala 1.0 Brings SQL to Hadoop for Real-Time Queries

Cloudera announced the release of Impala 1.0, its SQL-on-Hadoop solution that lets users run real-time queries on data stored in Hadoop clusters.

Cloudera, a provider of Apache Hadoop solutions for the enterprise, recently announced the general availability of Cloudera Impala, its open-source, interactive SQL query engine for analyzing data stored in Hadoop clusters in real time.

Cloudera claims to have been first to market with a SQL-on-Hadoop offering, having released Impala as an open-source public beta in October 2012. Since then, the company has worked closely with customers and open-source users, testing and refining the platform in real-world applications to deliver a production-hardened, customer-validated release designed from the ground up for enterprise workloads, said Mike Olson, CEO of Cloudera.

In an interview with eWEEK, Justin Erickson, senior product manager for Impala at Cloudera, said adoption of the platform has been strong: more than 40 enterprise customers and open-source users run Impala today, including 37signals, Expedia, Six3 Systems, Stripe and Trion Worlds. With its 1.0 release, Impala extends Cloudera's unified Platform for Big Data, which is designed to bring different computation frameworks and applications to a single pool of data, using a common set of system resources. Cloudera Impala 1.0 is available for download.

"At Ovum, we believe that for Hadoop to cross over to the enterprise, it must become a first class citizen with IT, the business and the data center," said Tony Baer, principal analyst of software and enterprise solutions at market research firm Ovum, in a statement. "A large part of making Hadoop a first-class citizen in the enterprise is making it accessible to the large base of SQL developers and applications that already exist. With Impala, Cloudera has decisively planted the stake in bringing the worlds of Hadoop and enterprise SQL together. And it has done so in a way that addresses the expectations for performance that are taken for granted in the enterprise SQL world."

"Cloudera's Impala is perhaps the most widely known SQL-on-Hadoop solution," said Joseph Turian, Ph.D., a research analyst at GigaOm Research. "Cloudera has chosen to build its system from the ground up. This will allow it to optimize every part of the solution. It believes that by avoiding legacy, it can actually make a better architecture that is superior, both for end users and the ops staff."

Olson said Cloudera invested more than two years of intensive research and development to build Impala from the ground up, delivering a massively parallel processing (MPP) query engine that is native to Hadoop.

"Impala represents a major advance for Cloudera and the Hadoop ecosystem as a whole," Olson said in a statement. "We've invested years of research and development and devoted a team comprised of the world's top engineering talent to execute it. We are immensely proud to be releasing a fully tested and production-hardened Impala to general availability, and to be shattering industry forecasts for its delivery timetable.

"Cloudera was first to recognize that Apache Hadoop would be a catalyst for business transformation in the 21st century," he continued. "We have worked tirelessly to support the rapid development of the platform to form a viable and open enterprise solution, with a rich and vibrant ecosystem to support it. We will continue to be a primary driver behind the evolution of a 100-percent open source Hadoop platform by setting a high bar that pushes the boundaries of what's possible to exceed the high expectations of our enterprise customers."