
    Pentaho Open-Sources Kettle Big Data Analytics Tools

    Written by

    Darryl K. Taft
    Published January 31, 2012

      eWEEK content and product recommendations are editorially independent. We may make money when you click on links to our partners.

      Pentaho announced that it has open-sourced its big data analytics tools, known as the Pentaho Kettle project, under the Apache 2.0 license.

      Pentaho said it has made all of its big data capabilities freely available as open source in the new Pentaho Kettle 4.3 release, and has moved the entire Pentaho Kettle project to the Apache License, version 2.0.

      Because Apache is the license under which Hadoop and several of the leading NoSQL databases are published, this move will further accelerate the rapid adoption of Pentaho Kettle for Big Data by developers, analysts and data scientists as the go-to tool for taking advantage of big data, the company said.

      Big data capabilities available under open-source Pentaho Kettle 4.3 include the ability to input, output, manipulate and report on data using the following Hadoop and NoSQL stores: Cassandra, Hadoop HDFS, Hadoop MapReduce, Hadapt, HBase, Hive, HPCC Systems and MongoDB.

      “In order to obtain broader market adoption of big data technology including Hadoop and NoSQL, Pentaho is open sourcing its data integration product under the free Apache license,” said Matt Casters, founder and chief architect of the Pentaho Kettle Project, in a statement. “This will foster success and productivity for developers, analysts and data scientists giving them one tool for data integration and access to discovery and visualization.”

      With regard to Hadoop, Pentaho Kettle makes available job orchestration steps for Hadoop, Amazon Elastic MapReduce, Pentaho MapReduce, HDFS File Operations and Pig scripts. All major Hadoop distributions are supported, including Amazon Elastic MapReduce, Apache Hadoop, Cloudera’s Distribution including Apache Hadoop (CDH), Cloudera Enterprise, EMC Greenplum HD, Hortonworks Data Platform powered by Apache Hadoop, and MapR’s M3 Free and M5 Edition. Pentaho Kettle can execute ETL transforms outside the Hadoop cluster or within the nodes of the cluster, taking advantage of Hadoop’s distributed processing and reliability.

      Pentaho officials said Pentaho Kettle for Big Data delivers at least a tenfold boost in productivity for developers through visual tools that eliminate the need to write code such as Hadoop MapReduce Java programs, Pig scripts, Hive queries, or NoSQL database queries and scripts.

      In addition, Pentaho officials said Pentaho Kettle also:

      • Makes big data platforms usable for a huge breadth of developers, whereas previously big data platforms were usable only by the geekiest of geeks with deep developer skills such as the ability to write Java MapReduce jobs and Pig scripts;
      • Enables easy visual orchestration of big data tasks such as Hadoop MapReduce jobs, Pentaho MapReduce jobs, Pig scripts, Hive queries and HBase queries, as well as traditional IT tasks such as data mart/warehouse loads and operational data extract-transform-load jobs;
      • Leverages the full capabilities of each big data platform through Pentaho Kettle’s native integration with each one, while enabling easy co-existence and migration between big data platforms and traditional relational databases; and
      • Provides an easy on-ramp to the full data discovery and visualization capabilities of Pentaho Business Analytics, including reporting, dashboards, interactive data analysis, data mining and predictive analysis.

      “Pentaho Kettle’s powerful ETL [extraction, transformation and loading capabilities] enables developers and analysts to more quickly integrate MongoDB into their enterprise environments by allowing them to transform and report on data they have stored in MongoDB,” said Erik Frieberg, vice president of marketing and alliances at 10gen, in a statement. “Pentaho Kettle for Big Data is a great addition to the MongoDB ecosystem, and 10gen looks forward to continuing to work with Pentaho to further develop this open-source tool with the MongoDB community.”

      “The Pentaho and Cloudera partnership allows our joint customers to more quickly integrate Hadoop within their enterprise data environments while also providing exceptional analytical capabilities to a wider set of business users,” said Ed Albanese, head of business development at Cloudera, also in a statement. “We applaud Pentaho’s decision to open source its big data capabilities under the Apache License; the technology they are contributing is substantial and is a big step forward in helping to accelerate adoption and make it easier to use Hadoop for data transformation.”

      “Hadapt allows customers to analyze their structured and unstructured data together in a single platform without ever having to move data outside of Hadoop,” said Justin Borgman, CEO of Hadapt, in a statement. “Hadapt’s SQL-compliant query interface together with Pentaho Kettle for ETL allows analysts to leverage their existing SQL skills for big data analytics on Hadoop.”

      Darryl K. Taft covers the development tools and developer-related issues beat from his office in Baltimore. He has more than 10 years of experience in the business and is always looking for the next scoop. Taft is a member of the Association for Computing Machinery (ACM) and was named 'one of the most active middleware reporters in the world' by The Middleware Co. He also has his own card in the 'Who's Who in Enterprise Java' deck.
