Hadoop's Stock in Trade
Apache Hadoop, open-source software, has proved to be the data prospector with the most market traction in the last five years. Originally created by current Cloudera Architect and Apache Foundation Chairman Doug Cutting while he worked at Yahoo, Hadoop got its name from a stuffed elephant (an appropriate image for so-called big data) belonging to Cutting's son. Hadoop processes large caches of data by breaking them into smaller, more accessible batches and distributing them to multiple servers for analysis. (Agility is a vital attribute: It's like cutting your food into smaller pieces for easier consumption.) Hadoop then processes queries and delivers the requested results in far less time than old-school analytics software, most often in minutes instead of hours or days.
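For readers who want to see the split-and-distribute idea in miniature, here is a rough sketch in Python. It is not Hadoop code and does not use any Hadoop API. The function names (split_into_batches, count_words, merge_counts, run_job) are made up for illustration, and a local thread pool stands in for the multiple servers, which is an assumption made purely to keep the example on one machine.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_batches(records, batch_size):
    """Break a large list of records into smaller batches."""
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


def count_words(batch):
    """Work done independently on one batch: count the words it contains."""
    counts = Counter()
    for line in batch:
        counts.update(line.lower().split())
    return counts


def merge_counts(partials):
    """Combine the partial results from every batch into one answer."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_job(records, batch_size=2, workers=3):
    """Split the data, process the batches in parallel, then merge.

    The thread pool is a stand-in for separate servers (an assumption
    for this sketch); a real cluster would ship batches over a network.
    """
    batches = split_into_batches(records, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, batches))
    return merge_counts(partials)


if __name__ == "__main__":
    lines = ["big data", "big elephant", "data data"]
    print(run_job(lines).most_common(2))
```

The point of the design is that each batch can be counted without knowing anything about the others, so adding more workers shortens the wait; only the final merge needs to see every partial result.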
"The analysts at Gartner and IDC have described big data as being about the volume, velocity and variety of data, and those are the things that draw people to Hadoop as a system," said Cloudera Vice-President of Products Charles Zedlewski.
Giving Away the Code
Why give away the code? Because when Cutting and Yahoo developed, tested and ran the base code in-house, they learned how complicated it was to use. They immediately saw that the money-earning future of the software would come from surrounding services: an intuitive user interface, customized deployments and additional features.
In March 2009, startup Cloudera was the first independent company to take the open-source code and productize the Hadoop analytics engine with its CDH (Cloudera's Distribution, including Apache Hadoop) and Cloudera Enterprise. An impressive group of investors and advisors teamed up to launch the company, including VMware founder and former CEO Diane Greene, Flickr co-founder Caterina Fake, former MySQL CEO Marten Mickos, LinkedIn President Jeff Weiner and Facebook CFO Gideon Yu.
Since Cloudera's debut, a handful of top-tier companies and startups have crafted their own versions of Hadoop based on the freely available open-source architecture. This is truly a new-generation enterprise IT competition. It's similar to a relay race in that all the contestants have the same type of baton (Hadoop code) and have to compete based strictly on their own speed, agility and creativity. Currently, the race is on among a new set of competitors attempting to market big data analytics to the most enterprises in the most effective way.