    New Big Data and Hadoop Projects: 10 Tips for Keeping on Track

    By Chris Preimesberger - December 9, 2013

      2 - Figure Out What You're Trying to Solve

      You can’t put your data to use until you know what you want to do with it. Once you do, you can steer your company in the right direction. Figure this out early and stick to the plan.

      3 - Define Your Business Questions

      These include questions about who the target audience is, how best to market to it, how to expand market reach, how to control costs, and how to engage with customers in the most positive way possible. These categories involve varying amounts of data, and all of them are crucial to discovering which problems exist so that they can be understood and solved for the betterment of the company.

      4 - Stay Focused on the Most Important Questions First

      This is not easy because all questions are important in their own right. Prioritize and stay focused. Questions will evolve, and new ones will be added.

      5 - Get Help From People Who Know What They're Doing

      You’ll need a technical expert who knows the ins and outs of the project and how the solution is to be built. If your technical expert isn’t well versed in the business side, bring in someone who is: someone who knows every aspect of the business model, the finances, the products or services, and how everything ties together.

      6 - Know Where Your Data Emanates

      If you’re using the data for suggestive selling, you’re probably drawing on user events, products viewed, click-throughs and site referrals. If you’re looking to streamline your supply chain, you almost certainly have data pertaining to raw materials, supplier key performance indicators, bills of lading, warehousing and even driver performance. Knowing this will help you figure out how much data you’ll have.

      7 - Invest in Understanding the Data

      Where is the data, and which data comes from which source? The best way to answer this is data profiling. Also, expect schema changes and design your system to handle them. Problem areas identified at the beginning are easier and quicker to deal with than those discovered once the system is built.
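
As a rough illustration of what data profiling can look like, here is a minimal sketch in Python. It assumes the records have already been loaded as dictionaries; the field names and sample rows are hypothetical and are not taken from the article. It counts, per field, how often the field is present, how often it is empty, and which value types appear.

```python
from collections import Counter


def profile(records):
    """Summarize each field: times present, empty values, and observed types."""
    fields = {}
    for record in records:
        for name, value in record.items():
            stats = fields.setdefault(
                name, {"present": 0, "nulls": 0, "types": Counter()}
            )
            stats["present"] += 1
            if value is None or value == "":
                stats["nulls"] += 1
            else:
                stats["types"][type(value).__name__] += 1
    return fields


# Hypothetical sample rows, for illustration only.
rows = [
    {"order_id": 1, "amount": 19.99, "country": "US"},
    {"order_id": 2, "amount": "24.50", "country": ""},
    {"order_id": 3, "amount": None, "country": "DE", "coupon": "FALL13"},
]

report = profile(rows)
for name, stats in report.items():
    missing = len(rows) - stats["present"]
    print(name, "missing:", missing, "nulls:", stats["nulls"],
          "types:", dict(stats["types"]))
```

A field with more than one observed type (here `amount`) or one that appears in only some records (here `coupon`) is the kind of early warning that may point to a schema change or an inconsistent source.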

      8 - Storing the Data

      Once you know where data is coming from and how much you’ll potentially have, you’ll have a good idea of how it should be stored. Maybe the data isn’t expected to grow all that much, so you don’t need something scalable. Perhaps you collect massive amounts of data on a daily basis, so going with something cloud-based for maximum scalability is the way to go.
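
A back-of-the-envelope estimate is often enough to tell which of these situations you are in. The sketch below is only an assumption-laden example: the event rate, average record size, and retention period are made-up inputs, and the replication factor of 3 reflects the HDFS default rather than anything stated in the article.

```python
def estimate_storage_bytes(events_per_day, avg_event_bytes,
                           retention_days, replication=1):
    """Estimate total bytes stored over the retention window."""
    return events_per_day * avg_event_bytes * retention_days * replication


# Hypothetical inputs: 5 million events a day, about 1 KB each, kept one year.
raw = estimate_storage_bytes(5_000_000, 1_000, 365)
replicated = estimate_storage_bytes(5_000_000, 1_000, 365, replication=3)

print(f"raw: {raw / 1e12:.3f} TB, with 3x replication: {replicated / 1e12:.3f} TB")
```

With these inputs the raw volume is under 2 TB a year, which a single well-provisioned server could plausibly hold; multiply the event rate by a hundred and the answer changes.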

      9 - Processing the Data

      What’s being analyzed? Is it structured data, such as log files; semi-structured data, such as emails or tweets; unstructured data, such as satellite feeds; or all of the above? If it is only the first, good old SQL Server might be what the doctor ordered, but if you need to process at least one other variety of data, Hadoop might be the most effective solution.

      10 - Expect Data Corruption and Bad Data in General

      Whether it’s due to human error or bugs, you will have bad data. Plan for this up-front; it will save headaches in the long run. Look closely at de-duplication, data-combers and other quality-assurance software.
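
To make the idea concrete, here is a minimal sketch of a validation and de-duplication pass in Python. It is not a substitute for dedicated data-quality software; the `clean` function, the field names, and the validation rules are all hypothetical choices made for this example.

```python
def clean(records, key_fields, validators):
    """Split records into valid, de-duplicated rows and rejected rows."""
    seen = set()
    good, rejected = [], []
    for record in records:
        # Collect the names of fields that fail their validation rule.
        problems = [name for name, check in validators.items()
                    if not check(record.get(name))]
        if problems:
            rejected.append((record, problems))
            continue
        # Normalize the key fields so trivially different copies match.
        key = tuple(str(record.get(f, "")).strip().lower() for f in key_fields)
        if key in seen:
            continue  # duplicate of a row already kept; drop it
        seen.add(key)
        good.append(record)
    return good, rejected


# Hypothetical rules and rows, for illustration only.
validators = {
    "email": lambda v: isinstance(v, str) and "@" in v,
    "amount": lambda v: isinstance(v, (int, float)) and v >= 0,
}
rows = [
    {"email": "ann@example.com", "amount": 10.0},
    {"email": "ANN@example.com ", "amount": 10.0},  # duplicate once normalized
    {"email": "bob.example.com", "amount": 5.0},    # malformed email
    {"email": "cy@example.com", "amount": -3},      # negative amount
]

good, rejected = clean(rows, ["email"], validators)
print(len(good), "kept;", len(rejected), "rejected")
```

Keeping the rejected rows together with the reason for rejection, rather than silently discarding them, makes it easier to trace bad data back to the human error or bug that produced it.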

      11 - Design and Implementation

      This is often a major stumbling block. Personnel or financial decisions will have to be made. With Hadoop, for example, if you have the trained manpower to spare, it will cost less than if you have to contract with someone to build it. If no one on staff possesses the required skill set, someone will need to learn it. But if pulling programmers away from their current tasks and spending a lot on training or a contractor is not an option, a software-as-a-service (SaaS) subscription platform may be the best alternative.
