Yahoo's CTO acknowledges that more improvements to Hadoop's core technology need to be made before it moves into the rarefied air of becoming a de facto industry standard.
SANTA CLARA, Calif. - Apache Hadoop, the data analytics science
project that has been ensconced in Yahoo's nurturing cocoon for the last
half-dozen years, has grown up and broken through its shell into the
From a business standpoint, Hadoop officially made its break June 29
from Yahoo to be shepherded by a new venture capital-funded company called
Hortonworks, named after the Dr. Seuss elephant character.
Apache Hadoop, open-source software built in Java that works with
distributed data-intensive applications, enables applications to scale
securely in order to handle thousands of nodes and petabytes of data. A
number of companies now are using Hadoop daily to predict
business patterns, find tendencies in scientific data and predict the
weather, among many other functions.
More and more businesses are finding out that they need to analyze their
stored data and use those metrics to help them make better business
decisions. Hadoop has certainly created the most buzz among
new-generation big-data analytics packages.
Hadoop has gone from creator Doug Cutting's science project to
mainstream business in a short time. Hortonworks was created during the
last four months as an independent, privately held, VC-funded company to
lead the Hadoop community and market the open-source product into the
future. Officially, Mothership Yahoo is now one of its customers.
Hortonworks is an appropriate name for the new company because it is
congruent with Hadoop itself, which is named after the stuffed toy
elephant that belongs to Cutting's young son.
Hortonworks Not a 'Spinout'
Yahoo CTO Raymie Stata, a key figure in all of this,
is responsible for all IT development at Yahoo. Even though Hadoop has
ventured out of the Yahoo development garage to a new home, Stata said
that Yahoo doesn't consider the new company a "spinout."
"Yahoo will have more people within Yahoo working on Hadoop and related
technologies than there will be at Hortonworks," Stata said. "We see
this as increasing the investment that's being made in Hadoop.
"Certainly, we're taking some of our key talent and using it to seed the
new company, so in that regard there are some employees who will be
moving from Yahoo to the new company. But this is not downsizing, it's
not a spinout; it's increasing the investment in Hadoop. Yahoo will
continue to be a major contributor into all aspects of Hadoop going
As far as the breakout is concerned, Stata said, Yahoo has always had a
vision of Hadoop becoming the industry standard in big-data analytics
software but also knew it would one day have to establish its own
"Because of the nature of our [Yahoo's] business, we were kind of living
in the future, so to speak. We could sort of see what everybody else
would need at some point in the future," Stata said, "so we were sort of
forced to build it. But it's what we do on Hadoop that ultimately
creates value for our shareholders.
"So if Hadoop becomes the de facto industry standard for big-data
processing, that's goodness for us, and that's been our mission here in
being so open in the development of Hadoop. We're getting to the last
mile on that; it's a stretch to say that it is a de facto industry
standard at this point. If it fails, it's kind of bad on the community.
It's all set up to get to that stature," he said.
Ongoing investment in the core of Hadoop is needed, Stata said. There
are improvements that need to be made before the technology can become a
standard, he said.
"It is necessary to have a company that is going to take that last mile
as its core business, one that is not Yahoo's core business," Stata
said. "Having Hortonworks out there with a zeal and a mission to see
that last mile can take it."
Is There a 'Tether' to Yahoo?
Because of its familial relationship, will Hortonworks remain tethered to Yahoo, so to speak?
"Hortonworks is independent, but I don't know [about being tethered],"
Stata said. "The valley's a small valley. You can look at almost any
company in Silicon Valley and find a core of Yahoo alums there. We're
"Relationships are valuable. Yahoo is an investor, but we're a minority
investor in an independent company. We are a development partner
committed to continuing to develop that core tech. Because of the
relationships, we have the ability to work very deeply in terms of
driving that tech forward."
One of the reasons for creating the new company, Stata said, is that
Yahoo already has seen what the future holds for enterprise analytics
(thanks to its six-year-long Hadoop development stage) and knows what
will work. It saw that the big-data analytics need would soon become so
widespread that a dedicated company would be necessary to focus solely
on it, and not the advertising and Web services businesses that are
Yahoo's meal ticket.
"We have been running a truly enterprise deployment of Hadoop, and I
don't think anybody does that. It's a departmental solution today,"
Stata said. "But it's not going to be six years before other people are
doing enterprise [analytics as Yahoo does]. That gap between Yahoo and
the rest of the user base is shrinking.
"On one hand, it's great to have an independent company that can have
this relationship with Yahoo and see pain points that are on the road
for a couple of years ahead. We now need to look at other customers and
bring that input in and to synthesize it with Yahoo's more futuristic
view. Obviously, an independent company with a commercial mandate is
going to do it a lot better than an open-source team inside Yahoo."