I attended GigaOm's Structure Data conference last year, when the event was all about the promise of Hadoop, big data and unstructured data, which is somewhat ironic for an event with "structure" in its title.
This year's event, March 19-20, was more about the deliverables in the form of case studies, user experiences and the limits of what to realistically expect from your big plans for big data. The change from promise to reality is welcome. Here are my five top takeaways from the first day of the event.
1. Big data and the deployment of Hadoop-type infrastructures are as much about process as they are about technology. "Hadumping" was one bit of slang making the rounds, referring to customers filling their big data lakes with a bunch of data and then trying to figure out what to do with it. The term came from Colin Coleman, a former rocket scientist and now an analytics executive at Turner Broadcasting.
He was referring to the temptation to set up a Hadoop-based data infrastructure and then proceed to dump all forms of data into the system without a whole lot of planning on whether you need the data, how you are going to extract what you need and how you are going to analyze the data after it's extracted.
Old-time business intelligence techies take note: Your talents are much needed, just not on the platforms where you learned your trade. The better case studies—including a remarkable privacy initiative from MetLife and Ford's plans to use open source to allow developers to create new applications based on masses of car data—were built around developers getting top-down approval to rethink how their companies use data and then getting the freedom to operate outside the usual new product strictures.
2. Hadoop is still not that easy to implement. In discussions with MetaScale and in a presentation by Alpine Data Labs, the focus was on taking out—or at least masking—the complexity, and on making unstructured data easier to accumulate, integrate and query for the business executives who need answers.
Hadoop has—somewhat unfortunately—acquired the aura of a magical term that can fulfill all your data needs. This year it's clear that Hadoop and its associated modules are quickly evolving into a platform that holds a lot of appeal for customers but still lacks some of the attributes platforms need to work successfully in the enterprise.
Security, easy-to-learn tools and hooks into existing corporate systems are all evolving, but are not totally baked at this point. "[Hadoop is] going to break away from the realm of science projects, and start producing valuable insights and analytics that are actually operational," said Steve Hillion, vice president of product at Alpine Data Labs.