Myth 1: We Must Hire a Hadoop Expert
Hadoop is built on intricate components such as MapReduce, YARN, Spark and the Hadoop Distributed File System (HDFS), and the constant churn of subsystem-level technology announcements further complicates the picture. But plenty of products and tools reduce this complexity and shield users from it. Open-source application frameworks and commercial products significantly improve productivity and accessibility when working with Hadoop, to the point where companies can execute their big data strategy with internal resources: enterprise Java developers, data warehouse developers and data analysts can quickly and easily leverage Hadoop.
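To see why the raw programming model intimidates newcomers, and why higher-level tools help, here is a toy sketch of the MapReduce word-count pattern in plain Python. This is a local illustration of the concept only, not actual Hadoop API code; the function names are invented for this sketch:

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    """Shuffle: group values by key, as the framework does between cluster nodes."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["Big data is big", "data tools hide complexity"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
# counts maps each word to its total, e.g. "big" -> 2, "data" -> 2
```

Even this toy version forces the developer to think in map, shuffle and reduce terms. By contrast, a SQL-on-Hadoop tool such as Hive expresses the same job as a single GROUP BY query, which is exactly the kind of abstraction that lets data warehouse developers and analysts be productive without MapReduce expertise.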
Myth 2: Buying a Big Data Solution Means I’m Using Big Data
You’ve just convinced your organization to adopt a big data strategy, and you’ve purchased a solution. What’s next? Enterprises often get stuck at a point where they have the hardware and Hadoop software in place but don’t have the skill set to take advantage of it. Using big data means that you are using your data, executing a data strategy and helping your business with cost savings, revenue opportunities or additional insights. The key is lowering the bar for your organization to execute and deliver data products as quickly as possible. Delivering and running these production applications reliably and on time is the next set of challenges. When you achieve this level, you will know because your users will want more.
Myth 3: Big Data Is a Fad That Will Go Away in a Few Years
Myth 4: Businesses Need One Data Scientist for All Big Data Needs
For too long, businesses have been upholding the myth of the data science hero: the virtuoso who slays dragons and emerges with the treasure of an amazing app built on big data insights. The truth is that businesses can’t afford to rely on a single data scientist or developer, because employees can leave an organization at any time. By building a “big data app factory” of processes and teams, companies can ensure that great work gets done over and over again, regardless of personnel changes.
Myth 5: Traditional Enterprise Data Warehouses Will Go Away
It’s unlikely that the technology of the past will completely go away. Enterprises will continue to rely on traditional enterprise data warehouses (EDWs). However, with the rapid evolution of Hadoop and accompanying products and technologies, the role of the EDW in the enterprise will significantly diminish. The flow of data will change, and it’s likely that Hadoop will be its first stop.
Myth 6: Apache Spark Is the Future of Hadoop
As usual, the newest, shiniest technology is the most alluring. Apache Spark is currently one of those: a fast, general-purpose engine for large-scale, clustered data processing. But rest assured, another will come along and take its place as the hottest thing on the market. What people often forget is that old reliable is old and reliable for a reason; it usually has the breadth and depth needed to move your big data project forward. Resist the urge to chase the latest thing: if it ain’t broke, don’t fix it. Stick with what you know.
Myth 7: Big Data Is Only for the Largest of Enterprises
The “big” in big data is misleading. Everyone—including organizations large and small—is in the business of data. Sure, large enterprises collect massive amounts of data, but the abundance of data that small enterprises can collect and leverage for competitive advantage also can be immense. Just because your data may be small in volume does not mean you shouldn’t have a data strategy in place.
Myth 8: Big Data Is for Hadoop Experts
Enterprises today are rapidly adopting Hadoop to process, manage and make sense of growing volumes of data, and they are leveraging existing internal resources to drive their data strategies forward. Mature, reliable tools are now readily available that let any software engineer unlock the full potential of big data and Hadoop. As a result, no dedicated Hadoop expertise is required.