11 Things IT Managers Considering Hadoop Deployment Should Know

 
 
By Chris Preimesberger  |  Posted 2016-01-15

  • 1 - 11 Things IT Managers Considering Hadoop Deployment Should Know

    At the center of the big data movement, Hadoop offers a platform for managing a variety of workloads, from data storage to real-time stream processing.
  • 2 - Why Consider Using Hadoop in the First Place?

    Apache Hadoop enables big data apps for both operations and analytics and is one of the fastest-growing technologies providing competitive advantage for businesses across industries. Hadoop is a key component of the next-generation data architecture, providing a massively scalable distributed storage and processing platform. Hadoop enables organizations to build new data-driven applications, while freeing up resources from existing systems.
  • 3 - Is Your Organization Ready for Hadoop?

    Several considerations are critical to the success of a big data/Hadoop project: upper management's commitment and vision to use data to generate new sources of revenue; the presence of data governance programs and other enterprise programs that can guide Hadoop (and other data-driven projects); and whether business-driven use cases for Hadoop have been identified and agreed upon.
  • 4 - Are Unstructured Data Sets Increasing in Your Organization?

    The largest driver of big data solutions is the growth of unstructured data, which doubles about every two years. Most organizations have so much unstructured data that they are unlikely to analyze all of it. Traditional systems struggle to capture, store and process this proliferating data, yet Hadoop can do so easily and cost-effectively.
  • 5 - Is There an Increase in Data Sources?

    We're talking about data from Websites, sensors, non-traditional compute devices, social networks and so on. IDC estimates that by 2020 there will be 32 billion connected devices because of the growth in the Internet of Things (IoT). By capturing vast amounts of information from these new and different data sources and applying analytics to it, enterprises can obtain critical insights into the strengths and weaknesses of their business, identify growth opportunities for new product lines and act on the data.
  • 6 - Do You Have an Identified Use Case for Hadoop?

    For organizations that need to quickly gain insights and create opportunities from big data, the first step is to choose the best business solutions, as well as infrastructure technologies that will support fast data and big data at scale and, in turn, enhance operational applications. It is important to start with a small project. For example, develop and run an initial test before adding more data. Taking on too much can lead to higher-than-expected costs.
  • 7 - Would Combining Operational/Analytic Data Sets for New Apps Be Beneficial?

    It takes a certain critical mass of big data volume before exploring and profiling that data produces an accurate assessment of big data's unique value to an organization. If an organization is already committed to capturing, governing, and analyzing big data and already has solid competencies in data management and analytics, proceeding to a Hadoop project can be very positive.
  • 8 - Is Management Interested in Saving Money Using All Its Data?

    The power of big data solutions continues to grow significantly every year, and the cost of collecting, managing and storing data also is increasing. As a result, organizations are rethinking their enterprise architecture to find ways to reduce cost and increase the flexibility of their data management/storage processing solution. As Hadoop deployments grow within an organization, the architectural differences between Hadoop distributions begin to show dramatic cost differences across capital and operational expenses. These differences can reduce total cost of ownership by 20 to 50 percent.
  • 9 - Do You Have a Culture Willing to Adopt New Technologies?

    Before taking on a big data project, most organizations will need to assess and upgrade their analytical skills. That means the organization must view analytics as central to solving problems and must be able to identify opportunities. Projects tend to be more successful when adult learners take part in real-world, analytics-based decisions that let them learn by doing.
  • 10 - Do You Have Terabyte-Plus Data Sets?

    Of all the "Vs" for big data (volume, variety, velocity and veracity), volume is the most relative. One organization's big data might be a few hundred gigabytes, while another might not consider Hadoop or big data until it reaches a petabyte of data. Some organizations might want to use all their data (structured, unstructured, poly-structured) in a Hadoop environment, and some may just want to use the unstructured data sets. Practical advice suggests setting some limit on the amount of data for a first-time technology deployment; Hadoop is no different in this regard.
  • 11 - Are Your Existing Tools Delivering a Full View?

    One of the main benefits of Hadoop is the ability to combine legacy operational data stores with new analytic data sets in a data lake. These data sets could be queried for insight within the data lake and/or be prepared for further analysis within visualization tools. If existing tools aren't able to dip into the data lake and deliver a full view of all the combined data sets and sources, it might be time to look to the Hadoop ecosystem or to adopt a converged data platform.
  • 12 - Is Price per TB of Conventional DB/Data Warehouse Tools Too High?

    Data that was previously too expensive to store is now available for analysis to improve business insights, at 1/10th to 1/50th of the cost on a per-terabyte basis. With corporate data growing at around 40 percent per year, traditional systems are unable to cope and scale affordably. Hadoop enables the economical capturing and storing of data from every touch point in an organization, while eliminating separate silos to process that data.
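
To make the per-terabyte claim concrete, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the starting volume, the baseline price per terabyte and the function name `project_storage_cost` are hypothetical assumptions, not figures or tools from the slide show. Only the 40 percent growth rate and the 1/10th-to-1/50th cost range come from the text above.

```python
from typing import List, Tuple


def project_storage_cost(
    start_tb: float,
    years: int,
    price_per_tb: float,
    annual_growth: float = 0.40,  # the "around 40 percent per year" cited above
) -> List[Tuple[int, float, float]]:
    """Return (year, volume_tb, cost) rows under compound annual growth."""
    rows = []
    volume = start_tb
    for year in range(years + 1):
        rows.append((year, volume, volume * price_per_tb))
        volume *= 1 + annual_growth
    return rows


# Hypothetical inputs chosen for illustration only.
START_TB = 100.0
BASELINE_PRICE_PER_TB = 10_000.0  # assumed conventional-warehouse price

# divisor 1 is the conventional baseline; 10 and 50 bracket the quoted range
for divisor in (1, 10, 50):
    price = BASELINE_PRICE_PER_TB / divisor
    year, volume, cost = project_storage_cost(START_TB, 5, price)[-1]
    print(f"1/{divisor} of baseline: year {year}, {volume:,.0f} TB, ${cost:,.0f}")
```

At 40 percent annual growth, volume nearly doubles every two years (1.4 x 1.4 = 1.96), so even a large per-terabyte saving is partly absorbed by growth over time. Real prices vary widely and should replace the assumed baseline before drawing conclusions.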
 

Apache's Hadoop batch analytics software platform isn't new; it has been used in IT production for the better part of the last decade. The business world, however, is often intentionally slow and careful about moving to new IT tools and services. Even so, businesses increasingly are discovering the value of Hadoop in putting to good use all of the diverse business data (from customers, partners, contractors, social networks, employees and so on) that gets stored up and often sits latent.

Hadoop is at the center of the big data movement and provides a general-purpose data management platform for a variety of workloads, including data storage, integration from multiple sources, database operations, analytics, search and real-time stream processing. It uses multiple computing nodes to capture valuable insight from big data sets in order to facilitate analytics and affect business as it happens. But how does one determine whether Hadoop will be used effectively within a system? This eWEEK slide show, which includes input from William Peterson, director of product marketing at Apache Hadoop specialist MapR Technologies, examines factors IT managers should keep in mind when considering Hadoop deployment.

 
 
 
 
 
 
 
 
 
 
 
