Microsoft announced that HDInsight, its cloud-based distribution of Hadoop, is now generally available after spending over half a year in preview. Quentin Clark, vice president of Microsoft’s Data Platform Group, stated in an Oct. 28 blog post that reaching this stage “is an important milestone for Microsoft, as it’s part of our broader strategy to bring big data to a billion people.”
Describing Hadoop as the “cornerstone of how we will realize value from big data,” Clark said that his company “engineered HDInsight as 100 percent Apache Hadoop offered as an Azure cloud service.” The response “has been tremendous,” he added.
Hadoop is an open-source batch-processing software foundation that has emerged as the dominant big data analytics platform. Its popularity is owed, in part, to its SQL-friendly hooks, its enablement of real-time interactive analytics, an expanding ecosystem of products and a growing number of big-name supporters.
Hadoop can be found toiling away in the data centers that power Amazon Web Services, Apple, Facebook, Netflix and Twitter. On Feb. 26, chipmaker Intel announced a Hadoop distribution (its third) that leverages the company’s expertise in processors, on-chip security and solid-state drives (SSDs).
“Microsoft recognizes Hadoop as a standard and is investing to ensure that it’s an integral part of our enterprise offerings,” said Clark. In collaboration with Hortonworks, an enterprise Apache Hadoop specialist and solutions provider, the company plans to issue an HDInsight update that includes support for Hadoop 2.0, he informed.
HDInsight is already delivering big data insights for some organizations, boasted Clark. The city of Barcelona, Spain, is employing the technology to comb through data on traffic patterns and social media mentions and other sources to drive decision making in transportation, security and spending. And computer scientists at Virginia Tech are using HDInsight to provide “cost-effective access to DNA sequencing tools and resources,” Clark said.
Envisioning a big data-driven future for businesses, Microsoft has also worked to integrate HDInsight into its Office ecosystem.
“We have built it to integrate with Excel and Power BI—our business intelligence offering that is part of Office 365—allowing people to easily connect to data through HDInsight, then refine and do business analytics in a turnkey fashion,” noted Clark. Developers, too, are welcome. HDInsight supports “.NET, Java and more,” he said.
Part of imparting big data’s benefits on a billion people is to make it trivial to spin up Hadoop clusters, relatively speaking. In March, when Microsoft launched the Windows Azure HDInsight preview, Redmond’s resident cloud guru, Scott Guthrie, stated that a cluster takes “a few minutes to create (as part of creating it will configure the necessary Virtual Machines that together make up your Hadoop cluster).”
“Once the cluster is created,” Guthrie added, “you can drill into the dashboard view to see the cluster quick glance screen.” That screen, in turn, allows users to “to see the basic information about your cluster and gives you a simple method to connect to the cluster.”