Big Data Project Planning and Deployment: 10 Suggested Best Practices

 
 
By Chris Preimesberger  |  Posted 2012-11-21
Recognize the Opportunity

Big data is everywhere, so find your opportunity to put it to work. Even if your organization does not produce "big data" itself, there are valuable external data sets that, if you can access and analyze them, can provide better operational insight and deliver a new competitive advantage. For example, professionals in the financial services sector can pay a small fee to access huge volumes of intra-day trading information (from equity exchanges, commodity exchanges and so on) and mine it for data that helps them make more strategic business decisions. Likewise, a wide array of census and population data is now made available by many government organizations. The world of data has become much more open. How can you participate?

Don't Forget Aggregate Data

Many organizations are already aggregating data automatically in the course of doing business, and potentially valuable insights are locked in that data, waiting to be discovered. For example, if you run an application that combines data from many organizations within an industry, that aggregate data will likely yield new macro-level insight about the industry. Software-as-a-service (SaaS) providers are powerful examples: every minute, a SaaS provider gathers potentially valuable information about the health of an industry or the trends in a business process. Occasionally, this aggregate data is even more valuable than the underlying SaaS application itself.
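As a minimal sketch of the idea (the tenant names, sectors and figures here are hypothetical, not from the article), a SaaS provider could roll the per-tenant records it already collects up into industry-level aggregates:

```python
from collections import defaultdict

# Hypothetical per-tenant activity records a SaaS provider might
# accumulate as a side effect of normal operation.
records = [
    {"tenant": "acme_mfg",  "sector": "manufacturing", "orders": 120},
    {"tenant": "beta_mfg",  "sector": "manufacturing", "orders": 95},
    {"tenant": "gamma_ret", "sector": "retail",        "orders": 310},
]

def sector_totals(rows):
    """Roll per-tenant activity up into sector-level aggregates."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["sector"]] += row["orders"]
    return dict(totals)

print(sector_totals(records))
# {'manufacturing': 215, 'retail': 310}
```

Tracked over time, sector-level totals like these become the macro trend data the article describes, separate from any single customer's view.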

Big Data Is Not Just About Volume

Volume is just one element in defining big data; interestingly, it may be the least important of the three, the other two being variety and velocity. Taking advantage of a big data opportunity means creating an agile architecture that can work as easily with high-velocity, semi-structured data as it does with batches of traditional (relational) data. Combining any variety of data types to enable new correlations and insights is most often where big value is created.

Combine Big and Traditional Data Types

This is the Holy Grail within every big data opportunity and where the most rewarding insights can be discovered. Examples of this value can be found in any industry: Combining all known weather data, soil conditions and crop-planting schedules can improve farm yields and help manage crop insurance premiums. Another example is mapping social media feedback to CRM buyer profiles, which can help improve buyer loyalty and target premium offers. Today, it seems we are limited here only by the breadth of our imagination. Toward this goal, the big data analytic architecture must be properly designed.
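As an illustrative sketch of the CRM example (the customer IDs, sentiment scores and threshold are all hypothetical, not from the article), joining buyer profiles with social media sentiment to target premium offers might look like:

```python
# Hypothetical data: CRM buyer profiles keyed by customer ID, plus
# sentiment scores mined from social media mentions of each customer.
crm_profiles = {
    "c001": {"name": "Dana", "segment": "premium"},
    "c002": {"name": "Lee",  "segment": "standard"},
}
social_sentiment = {
    "c001": 0.8,   # strongly positive mentions
    "c002": -0.4,  # negative mentions
}

def target_premium_offers(profiles, sentiment, threshold=0.5):
    """Join the two sources and flag satisfied customers for offers."""
    return [
        cid for cid in profiles
        if sentiment.get(cid, 0.0) >= threshold
    ]

print(target_premium_offers(crm_profiles, social_sentiment))
# ['c001']
```

The join key (a shared customer ID) is the critical design point: without a way to correlate the traditional CRM records with the unstructured social data, neither source yields the combined insight.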

Choose an Architecture That Can Scale

Most big data projects start small and expand as needed. In Jaspersoft's survey, 63 percent of respondents estimated their projects' daily data volume in gigabytes, not terabytes or petabytes. That said, with project success comes growth: data volumes swell as more is collected. New, low-cost storage and compute infrastructures (often delivered via cloud services) make even very large volumes of data readily available, and creating an analytic architecture that can take advantage of them is vital. The axiom: Be prepared for expansion when your big data project succeeds.

It Is Not a Popularity Contest

Most people think of Apache Hadoop as the de facto analysis tool for big data. While it is certainly a powerful and useful tool, Hadoop itself does not address every big data need. There are many purpose-built products and services for specific big data use cases; choosing the ones best suited to your task is crucial.

The Latency Factor

Your big data business opportunity is best served by understanding both how quickly the data accumulates and how quickly it must be put to work before it becomes stale. Understanding this acceptable latency is necessary for choosing the right big data architecture. Three useful approaches are data exploration, operational reporting and analytics.

Why Data Scientists Are Sought

There is a reason data scientists have become so sought after in the last few years: Their deep domain knowledge is essential to the success of big data projects. This is the human intelligence component that combines business expertise with genuine knowledge of the data. Drawing on existing expertise and adding staff who can make sense of the data is essential.

Your Current Tools Might Not Be the Best Ones

Don't choose a data store, integration software or a reporting and analysis tool just because you've used it before. It's a bold new big data world, and many legacy business intelligence (BI) tools are not well suited to storing, processing or analyzing modern data types. Choosing the wrong tool for the job can lead to failure.

Equip Yourself With the Right Tools

The right reporting and analysis tool should drive analysis of both big and traditional-sized data sets, whether structured or unstructured. It should also scale to reach everyone in the organization who should be empowered to make data-based decisions, delivering contextually relevant data and infusing real intelligence into the processes that let the organization compete on speed.
