Hadoop Data Analytics: 10 Reasons Why It's Important for Business

 
 
By Chris Preimesberger  |  Posted 2011-07-01
 
 
 

Yahoo Creates Hortonworks to Lead Hadoop

On June 29, Hortonworks (named after the Dr. Seuss elephant) was created as an independent, privately held, VC-funded company to lead the Hadoop community and market the open-source product into the future. Its parent, Yahoo, is now one of its customers.

Hadoop Is No Longer a Science Experiment

Yahoo has taken Hadoop from creator Doug Cutting's science project to a world-class platform in just five years, contributing more than 70 percent of the code and helping to establish it as the IT industry's pre-eminent Big Data platform.

Hadoop Was a Key Part of IBM's Watson

Hadoop's analytics and data discovery capabilities were a big reason that IBM's Watson computer was able to win a widely publicized "Jeopardy!" showdown earlier this year against two highly successful former champions.

Largest Deployment: 200-Petabyte Data Farm

In the technology's largest deployment (at Yahoo, of course), Hadoop is used daily to analyze more than 200PB of data to make Yahoo more personal and relevant to its users and customers. It works with all aspects of Yahoo's IT system, including search, advertising, user experience and fraud detection.

Big Software for Big Data

Yahoo's Hadoop system includes more than 42,000 servers, made up of clusters of up to 4,000 machines, allowing it to process over 5 million jobs per month. Fourteen million new files are put into Hadoop every day, so scale is not exactly a problem.

Hadoop Will Sell Services Around Its Platform

The Hadoop software is freely available as an open-source project, but a set of premium services is now being built around the technology for enterprises that want more than a single level of service.

Now THAT'S a Lot of Email

Hadoop protects Yahoo's 289 million mailboxes from spam worldwide. Hadoop also plays a key role in customizing 13 million personal Web pages used each day by Web browsers.

Used for More Than Just Web Traffic

Hadoop use has evolved beyond Web traffic and scientific research (pictured: CERN Supercollider, Switzerland). It's now in production across search engines, advertising optimization, machine learning and content feeds. It loads 10 terabytes of data per day onto research clusters.

New Companies Quickly Growing Up Around Hadoop

MapR, Zettaset, Cloudera, HStreaming, Hadapt, DataStax, Datameer: a whole new group of Hadoop-related companies has been funded and is now out of stealth to help bring the best of the new technology to various markets.

Hadoop Knows It Still Needs to Improve

Yahoo and Hortonworks leaders have acknowledged that Hadoop still needs time to mature and become more user-friendly. It is not a simple technology to deploy and use, and the user interface needs some work. But the teams at both Yahoo and Hortonworks are convinced they will have these issues solved in the months to come.
