
    Using Big Data to Discover the Truth Isn’t as Easy as It Looks

    Written by

    Wayne Rash
    Published September 20, 2016

      NEW ORLEANS—It was clear when I stood up to speak at the session on big data at the Society of Professional Journalists’ Excellence in Journalism 2016 conference here that I wasn’t addressing your average trade show audience.

      The hundred or so people in front of me were all professional journalists, which meant that they were expecting a no-nonsense, practical look at how they could use vast data archives to find the truth on a wide variety of topics, ranging from political corruption to the spread of the Zika virus.

      With me in the front of the room were Pam Baker, the highly respected author of Data Divination: Big Data Strategies, and Louis Lyons, chief operating officer of ICG Solutions, the company that created the LUX2016 data analysis engine and helped with our examination of viewer reaction to last year’s Democratic and Republican primary debates.

      Our discussion started out with my description of how we used data analysis to figure out who won last year’s debates well before the major news organizations had polling data to release. But as valuable as our first effort to use big data to support a feature article was, the fact is that data analysis goes far beyond what I was able to do in eWEEK’s first attempt.

      This is no surprise because the analysis of large data sets is still in its infancy, and while data analysis can be used by most news organizations to gain important insights, finding the way is still hard.

      Fortunately, I had Pam Baker at my side in this effort, and she was able to part the seas of confusion and teach what big data can do and what it can’t do. The first lesson was that big data analysis isn’t magic, and just because you have a lot of data doesn’t mean it’s useful data.

      What really matters, Baker said, is that your big data archive contains accurate data that is most relevant to the information you hope to discover.

      In addition, Baker pointed out that it’s critical that you know the origin of the data you’re planning to analyze and that you’re comfortable with the way that the data was collected so that you can be more confident that you are working with valid information. One example that Baker cited was the difference in the influenza infection rates reported by the Centers for Disease Control and by Google Flu Trends.

      Google Flu Trends was an effort to determine infection rates using only information obtained on social media. The CDC, on the other hand, compiled influenza infection rates using a variety of sources, including social media, but also including government sources, health care providers and more.

      After a few years of testing, Google Flu Trends was withdrawn because it wasn’t accurate. Social media alone, it seems, is not a good way to track disease.

      Baker said it’s also important to question the output once you’ve performed an analysis. She pointed out that many sources provide data that has already been analyzed and that there can be mistakes in that analysis. She added that you need to check your own analysis as well, starting with basics such as checking the math.
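As a rough illustration of what "checking the math" on pre-analyzed data can look like, here is a minimal Python sketch. The region names, the counts, and the helper `check_reported_total` are all hypothetical and are not drawn from any source Baker discussed; the idea is only to confirm that a published total matches the sum of its published parts.

```python
from math import isclose

def check_reported_total(parts, reported_total, tolerance=0.5):
    """Compare a published total with the sum of its published parts.

    Returns (ok, computed_total). The tolerance absorbs rounding in the source.
    """
    computed = sum(parts.values())
    return isclose(computed, reported_total, abs_tol=tolerance), computed

# Hypothetical figures, invented for illustration only.
published_counts = {"Region A": 1250, "Region B": 980, "Region C": 1415}
published_total = 3745

ok, computed = check_reported_total(published_counts, published_total)
if not ok:
    print(f"Published total {published_total} does not match the sum of parts {computed}")
```

A mismatch like this does not say which number is wrong, only that the source deserves a follow-up question before its figures are used.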

      One way to help make sure your data analysis is more accurate is to diversify your sources, which is what the CDC did with its flu reporting. She added that you have to assume that your analysis will fail and you have to be prepared to figure out what to do next.
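One simple way to act on that advice is to compare the same measurement across several independent sources and look closely at any source that disagrees with the rest. The Python sketch below assumes hypothetical source names, made-up rate estimates and an arbitrary 25 percent threshold; it is not how the CDC combines its data.

```python
from statistics import median

def flag_outlier_sources(estimates, max_relative_gap=0.25):
    """Return the names of sources whose estimate differs from the median
    of all sources by more than max_relative_gap (a fraction of the median)."""
    mid = median(estimates.values())
    return sorted(
        name
        for name, value in estimates.items()
        if abs(value - mid) > max_relative_gap * abs(mid)
    )

# Hypothetical weekly infection-rate estimates (percent), for illustration only.
weekly_rate_estimates = {
    "clinic_reports": 2.1,
    "lab_confirmations": 1.9,
    "web_signal": 4.0,
}

suspects = flag_outlier_sources(weekly_rate_estimates)
print("Sources to re-examine:", suspects)
```

A flagged source is not necessarily wrong, but a large gap is a cue to find out how that data was collected before relying on it.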

      “Data is essential and analytics certainly speed results, but don’t assume results to be infallible,” Baker said. She recommends running any analysis at least three more times to make sure the analysis is correct.

      Fortunately, much of the data you’re likely to need for analysis is readily available. The government has a wealth of information and provides a site, Data.gov, where vast amounts of government data can be found. Much of it is very useful.

      Many government agencies, ranging from the CDC to the Federal Communications Commission, have stores of data that are available for analysis, much of the time simply on request. But depending on the data, it’s important to try to confirm the data, just as you would from any other source. Just because it’s from the government doesn’t mean it’s accurate, current or relevant.

      It’s also worth noting that much of the data you may need in business, or in my case in journalism, isn’t in a useful form. It may be in files that need to be converted to a format that can be readily analyzed with the available tools, or the data may be in printed reports where it must be entered manually or scanned to be useful.

      If it looks like using big data may be a lot of trouble, you’re right. There’s nothing magical about big data, including the fact that it’s big. As Baker told me, it’s more important to have the right data than it is to have a lot of data.

      The old adage of “garbage in, garbage out” holds true when it comes to data analysis. A big pile of useless data is still useless; there’s just more of it. But once you do find the right data, and analyze it properly, it can show you things that you can’t find any other way, and that is what makes this technology so valuable to business managers and journalists.

      Wayne Rash
      Wayne Rash is a content writer and editor with a 35-year history covering technology. He’s a frequent speaker on business, technology issues and enterprise computing. He is the author of five books, including his most recent, "Politics on the Nets." Rash is a former Executive Editor of eWEEK and a former analyst in the eWEEK Test Center. He was also an analyst in the InfoWorld Test Center and editor of InternetWeek. He's a retired naval officer, a former principal at American Management Systems and a long-time columnist for Byte Magazine.