Though the issue of big data may appear to be overexposed to some, the message coming from one tech analyst firm is "Do Believe the Hype."
Sentiment surrounding big data vendors remained positive last year, despite skeptics' suggestions that the subject had been "overhyped," according to Ovum. The big data buzzword even managed to transcend the enterprise IT world and become a hot topic for business publications and journals in 2012, with MongoDB claiming considerable mindshare among Web developers who traditionally relied on MySQL.
In a new research note, Ovum analyst Tony Baer reported on the analysis of data gathered by DataSift, which ranked Twitter mentions and sentiment of vendors and open source organizations associated with the big data market in 2012. According to Baer, Ovum analysts were surprised to find that while Hadoop garners much of the spotlight as a big data platform, the vendor 10Gen, which develops MongoDB, came in second in mentions to Apache, which hosts the Hadoop project. DataSift conducted a retrospective analysis of vendor mentions on Twitter during 2012 for Ovum.
"Given the level of buildup and suggested hype, it surprised us that sentiment expressed about big data vendors still remained so positive in 2012," Baer said in a statement. "What's also interesting is the degree to which big data became a business, not just a technology story in 2012," Baer said, noting that some business portals, such as Forbes and Harvard Business Review, edged out popular IT news portals in coverage of this traditionally "techie" subject.
While positive mentions of big data vendors outnumbered negative mentions by a margin of 3-to-1, negative sentiment spiked in November with headlines over Hewlett-Packard's troubled acquisition of Autonomy. Not surprisingly, given that vendors accelerated the pace of product announcements during 2012, 60 percent of Twitter activity occurred in the second half of the year. In all, the analysis reflected 2.2 million Twitter interactions from more than 981,000 authors.
The Twitter data provided a good glimpse into vendor brand recognition in big data. 10Gen, which develops the popular MongoDB document-oriented NoSQL database, scored high in mentions, trailing only the Apache Foundation. Others such as IBM and Teradata were also well represented in the Twitter stream, trailing only Apache and 10Gen in positive mentions. Splunk, which is associated with machine data and, like 10Gen, is popular among developers, scored high as well, showing that there is growing awareness about harnessing "the Internet of things" to generate business insights. Splunk has also been mentioned as a possible acquisition candidate for IBM and Oracle as the two companies vie to expand their big data and business analytics capabilities.
Ovum believes that the popularity of 10Gen is more indicative of the future of Web development than of big data, per se. "We view 10Gen as becoming the nontransactional database successor to MySQL in the world of Web developers."
Moreover, "While Twitter streams are not a scientific focus group for detecting brand awareness, they provide a valuable window on market thinking," Baer said in a statement. "The data showed that while some players, such as IBM and Teradata, successfully scored high recognition in Twitter mentions, other enterprise players need to better focus their message to get big data recognition."
More and more companies are relying on social media such as Twitter to tap into sentiment about their products and services as well as those of competitors. IBM has made this a core part of its social business push, which will be on display at the upcoming IBM Connect 2013 conference, beginning Jan. 27 in Orlando, Fla.
Meanwhile, Ovum's Baer notes that while the "hype" around big data should not be overdone, the trend itself is genuine. "Big data is very real," Baer told eWEEK. "The reality of all that data being there and the ability to get to it."
Baer spoke about a firm called ClearStory he came across at a New York big data meetup group on Jan. 22. ClearStory is working on curating publicly available APIs to data out in the ether, he said.
"The data is there, it's becoming more accessible, and increasingly, there is a tooling ecosystem to civilize it," said Baer. "But we can too easily get ahead of ourselves here."
He described how one data scientist at the meetup said something like, "If you run a bar, all you need to do is see if you're running low on gin. You don't need all those fancy algorithms."