Apache MetaModel, Drill Graduate From Incubator

The Apache Software Foundation graduated its MetaModel and Drill projects from the organization's incubator to become Top-Level Projects.


The Apache Software Foundation (ASF) has made its Apache MetaModel data access framework a Top-Level Project (TLP) at the organization.

Apache MetaModel graduated from the Apache Incubator to become a TLP, signifying a level of maturity of the project. MetaModel gets its name from being a model for interacting with data based on metadata, enabling developers to go above the physical data layer and apply their application to just about any data.

Apache MetaModel is a data access framework that provides a common interface for the discovery, exploration, and querying of different types of data sources. Unlike traditional mapping frameworks, MetaModel emphasizes metadata of the data source itself and the ability to add more data sources at runtime. MetaModel's schema model and SQL-like query API are applicable to databases, CSV files, Excel spreadsheets, NoSQL databases, cloud-based business applications, and even regular Java objects. This level of abstraction makes MetaModel well suited to dynamic data processing applications, but less so to applications modeled strictly around a particular domain, ASF officials said.
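To make the idea of modeling on metadata concrete, here is a minimal Python sketch. It is not MetaModel's API (MetaModel is a Java library); the `CsvSource`, `ObjectSource`, and `select` names are hypothetical and exist only for illustration. The sketch assumes each source can report its own column names at runtime, so one query helper can work against both a CSV file and plain in-memory objects.

```python
import csv
import io


class CsvSource:
    """Hypothetical source: column names come from the CSV header row."""

    def __init__(self, text):
        self._text = text

    def columns(self):
        # Discover the columns from the data itself rather than hard-coding them.
        return next(csv.reader(io.StringIO(self._text)))

    def rows(self):
        return list(csv.DictReader(io.StringIO(self._text)))


class ObjectSource:
    """Hypothetical source: column names come from the keys of plain dicts."""

    def __init__(self, records):
        self._records = records

    def columns(self):
        # Union of all keys, kept in first-seen order.
        seen = {}
        for record in self._records:
            for key in record:
                seen.setdefault(key, None)
        return list(seen)

    def rows(self):
        return [dict(record) for record in self._records]


def select(source, wanted, where=None):
    """Project the wanted columns from any source, checked against its discovered metadata."""
    available = source.columns()
    missing = [column for column in wanted if column not in available]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    return [
        {column: row.get(column) for column in wanted}
        for row in source.rows()
        if where is None or where(row)
    ]


csv_source = CsvSource("name,city\nAda,London\nLinus,Helsinki\n")
obj_source = ObjectSource([{"name": "Grace", "city": "New York"}])

# The same query code runs against both kinds of source.
for source in (csv_source, obj_source):
    print(source.columns(), select(source, ["name"]))
```

Because `select` validates against whatever `columns()` reports, a new kind of source can be added at runtime without changing the query code, which is the property the ASF description emphasizes.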

"MetaModel enables you to consolidate code and consolidate data a lot quicker than any other library out there," said Kasper Sorensen, vice president of Apache MetaModel, in a statement. "In these 'big data days' there's a lot of focus on performance and scalability, and surely these topics also surround Apache MetaModel. The big data challenge is not always about massive loads of data, but instead massive variation and feeding a lot of different sources into a single application. Now to make such an application you both need a lot of connectivity capabilities and a lot of modeling flexibility. Those are the two aspects where Apache MetaModel shines. We make it possible for you to build applications that retain the complexity of your data – even if that complexity may change over time. The trick to achieve this is to model on the metadata and not on your assumptions."

David Morales, a big data architect at Stratio, said Apache MetaModel is a key technology in Stratio Datavis, allowing Stratio developers to manage metadata and create SQL-based connectors for a variety of data stores. "Thanks to Apache MetaModel, Datavis users can create beautiful dashboards using their SQL skills, instead of knowing several query languages," Morales said.

Ankit Kumar, technical lead at Human Inference and a member of the Apache MetaModel Project Management Committee, said Apache MetaModel is the core technology underneath Human Inference’s Master Data Management (MDM) offering. It provides an abstraction layer above the different data stores the company supports, including Postgres, DB2, Oracle, SQL Server, and Elasticsearch.

The ASF also recently announced that Apache Drill has graduated from the Apache Incubator to become a TLP. Apache Drill is a schema-free SQL query engine that delivers real-time insights by removing the constraint of building and maintaining schemas before data can be analyzed. Drill users can run interactive ANSI SQL queries on complex or constantly evolving data including JSON, Parquet, and HBase without worrying about schema definitions. As a result, Drill enables rapid application development on Apache Hadoop and also allows enterprise BI analysts to access Hadoop in a self-service fashion.
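The schema-free idea can be illustrated with a short Python sketch. This is not Drill and does not use Drill's SQL interface; the sample records and the `get_path` and `query` helpers are hypothetical. It assumes newline-delimited JSON whose shape changes over time, and shows how fields can be resolved as each record is read instead of being declared up front.

```python
import json

# Hypothetical records whose shape evolves: the third one adds a nested field.
LINES = """\
{"user": "ada", "clicks": 3}
{"user": "linus", "clicks": 5}
{"user": "grace", "clicks": 2, "device": {"os": "linux"}}
"""


def get_path(record, path):
    """Follow a dotted path such as 'device.os'; return None when any part is absent."""
    current = record
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def query(lines, fields, where=None):
    """Parse each line on read and project the requested fields; no schema is declared."""
    results = []
    for line in lines.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if where is None or where(record):
            results.append(tuple(get_path(record, field) for field in fields))
    return results


# Older records simply yield None for the field they never had.
rows = query(LINES, ["user", "device.os"], where=lambda r: r["clicks"] < 5)
print(rows)
```

Records written before the `device` field existed need no migration in this sketch; the missing value surfaces as `None` at query time.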

"Apache Drill's graduation is a testament to the maturity of the technology and a strong indicator of the active community that develops and supports it," said Jacques Nadeau, vice president of Apache Drill, in a statement. "Drill's vibrant community ensures that it will continue to evolve to meet the demands of self-service data exploration use cases."

While providing faster time to value from data stored in Hadoop, Drill also reduces the burden on IT developers and administrators who prepare and maintain datasets for analysis, the ASF said. Analysts can explore data in real time, pull in new datasets on the fly, and also use traditional BI tools to visualize the data easily.

Inspired by Google's Dremel -- a system described in an academic paper on interactive analysis of Web-scale datasets -- and a vision to support modern big data applications, Drill entered the Apache Incubator in August 2012. The project currently has code contributions from individual committers representing MapR, Hortonworks, Pentaho, and Cisco, among others.

"We see the Apache Top-Level Project status as a major milestone for Drill. With a growing user base and diverse community interest, we are excited that Drill will indeed be a game changer for Hadoop application developers and BI analysts alike," said Tomer Shiran, member of the Apache Drill Project Management Committee, in a statement.