Big Data Programs: How to Get Started
Getting Started: Define the Big Data Starting Point and Compelling Business Drivers
Structured and unstructured data are growing at an annual rate of 60 percent, according to IDC. Enterprises are awash with data, easily amassing terabytes and even petabytes of information in all forms: sensors, photos, videos, online transactions, and cell phone signals, to name a few. It is easy to feel overwhelmed about where to start, so have a clear idea and established goals for return on investment (ROI) on your first project, and make sure you have line of sight to your next one, so you can use the same tool to handle both. Defining business-driven opportunities and establishing the end goal for the larger organization (whether it is cost savings, increased ROI, or reduced risk) is the key first step in establishing a big data program.
Discover the Data That Needs to be Analyzed and Where It Is Located
Depending on the industry, different kinds of data will need to be located and analyzed in order to reap the full benefits of a big data program. Using data that is already on hand or in your control is a good first step. Look for the obvious hunches that can now be validated, because the data types, formats, and sizes (volume and variety) that used to be blocking issues can now be handled. One example is correlating behavioral data (such as Web interactions) with transactional history and trends.
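As a rough illustration of correlating behavioral data with transactional history, the sketch below totals page views and purchase amounts per customer and computes a Pearson correlation between the two. The sample records, field layout, and function names are all hypothetical, and a real program would run this kind of join on a platform sized for the data rather than in memory.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical sample data: (customer_id, page_views_in_session)
web_interactions = [
    ("c1", 12), ("c1", 8), ("c2", 3), ("c3", 20), ("c3", 15), ("c4", 1),
]

# Hypothetical sample data: (customer_id, purchase_amount)
transactions = [
    ("c1", 40.0), ("c2", 5.0), ("c3", 90.0), ("c3", 30.0), ("c4", 0.0),
]


def total_by_customer(rows):
    """Sum the numeric value of each row per customer_id."""
    totals = defaultdict(float)
    for customer_id, value in rows:
        totals[customer_id] += value
    return dict(totals)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


views = total_by_customer(web_interactions)
spend = total_by_customer(transactions)

# Only customers present in both data sets can be compared.
shared = sorted(views.keys() & spend.keys())
r = pearson([views[c] for c in shared], [spend[c] for c in shared])

print(f"customers compared: {len(shared)}, correlation: {r:.2f}")
```

A strong positive value here would support the hunch that heavier Web engagement goes along with higher spend, although correlation alone does not show which one drives the other.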
Expect to Iterate, and Occasionally Fail
Simply stated, you need to plan for variability, and that may require "unlearning" some of your best practices from the traditional data warehouse world. When working with both new technologies and new or under-utilized data sources, it shouldn't be a surprise that you will be learning as you go. Expect to experiment, and take a try/learn/refine approach to reaching new insights. Make sure to staff the project with people who understand, and ideally thrive in, dynamic environments.
Change Who Is Data-Associated
Where database administrators (DBAs) were once the only ones who managed data, today more hands are in the data management pot. From students taking data management and analytics courses, to data scientists asking the right questions about the data, to CMOs making decisions based on data insights, data is touching every aspect of an organization. You need to plan to make access broadly available, including through visually oriented tools. Don't set the bar for participating in the initiative at the Java programmer level.
Take Advantage of New Technology
The relational database was once the sole solution for data management, but now there are many tools that provide businesses with data management options for their unique needs. There are hardware and software solutions for big data management, and there are also technologies under the category of NoSQL being explored due to their ease of use and relatively modest requirements. A variety of analytics options, including streams, time series, and operational analytics, are also at users' fingertips, depending on what type of data computing is desired.
Plan for Success
Once people in your organization see the insight that is possible from a big data program, they are going to demand it right away. It is important to plan how to move from the experimentation phase to ongoing "production" so that you can handle the program's success. Even if you don't apply the same rigor as you would to a normal production system, have an idea of how you'll support it on an ongoing basis. Make sure you have additional hardware and rack space available and know where you'll get support.
Establish Your Integration and Data Movement Needs
Once you move past your initial use cases, you'll be moving data to and from these environments. One-time loading won't cut it, so you'll need to establish how data can be published and subscribed to. Furthermore, you'll need to understand how jobs that cross systems are implemented and managed. Now is the time to start thinking hard about data quality and lineage, since you will need them.
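To make the publish/subscribe idea concrete, here is a minimal in-memory sketch in which each record carries a lineage trail and a subscriber applies a basic quality check before loading. The `Broker` and `Record` classes, the topic name, and the system names are hypothetical stand-ins for illustration, not the API of any particular messaging or integration product.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Record:
    payload: dict
    # Names of the systems this record has passed through, in order.
    lineage: list = field(default_factory=list)


class Broker:
    """Toy in-memory broker: routes published records to topic subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, record, source):
        # Stamp the originating system before delivery so lineage is preserved.
        record.lineage.append(source)
        for handler in self._subscribers[topic]:
            handler(record)


warehouse_rows = []


def load_into_warehouse(record):
    # Basic quality check (an assumed rule): skip records with no customer_id.
    if record.payload.get("customer_id") is None:
        return
    record.lineage.append("warehouse_loader")
    warehouse_rows.append(record)


broker = Broker()
broker.subscribe("orders", load_into_warehouse)

broker.publish("orders", Record({"customer_id": "c1", "amount": 40.0}), source="order_system")
broker.publish("orders", Record({"amount": 5.0}), source="order_system")

for row in warehouse_rows:
    print(row.payload, "via", " -> ".join(row.lineage))
```

Because the subscription stays in place, new records keep flowing after the initial load, and the lineage list records where each loaded row came from.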
Ensure the Right Team and Skills Are in Place
When implementing a big data program, having the right people with the right skills is just as important as having the right technologies. Building out a data scientist role or data science team that works across the company looking for actionable trends in the data and fostering collaboration is becoming increasingly important. This team would work directly with the CIO, advising them on how to derive maximum business value from big data and how to integrate new information.
Monitor Progress and Results on an Ongoing Basis
Once the plan is in motion, the organization can take the results of these first steps and apply what it has learned to a new opportunity. Monitor progress on a consistent basis so the program can adapt to organizational and industry changes and the business can stay competitive.