Hadoop Poses a Big Data Security Risk: 10 Reasons Why | eWeek

Apr 23, 2013

by Chris Preimesberger


Hadoop Wasn’t Designed for Enterprise Data

Like much groundbreaking IT (such as TCP/IP or Unix), Hadoop wasn’t originally built with the enterprise in mind, let alone enterprise security. Hadoop’s original purpose was to manage publicly available information such as Web links, and it was designed to process large amounts of unstructured data within a distributed computing environment, following designs Google had published for its own infrastructure (MapReduce and the Google File System). It was not written to support hardened security, compliance, encryption, policy enablement and risk management.


Hadoop’s Security Relies Entirely on Kerberos

Hadoop uses Kerberos for authentication. However, this protocol can be difficult to implement, and it doesn’t cover a number of other enterprise security requirements, such as role-based access control or integration with LDAP and Active Directory for policy enablement. Hadoop also doesn’t support encryption on nodes or on data in transit between nodes.
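
As a rough illustration of where Kerberos fits in, the sketch below checks whether a cluster's core-site.xml turns it on. It assumes the standard `hadoop.security.authentication` property, which defaults to `simple` (no real authentication) when unset; the function name and the sample configuration are hypothetical and not taken from any particular deployment.

```python
import xml.etree.ElementTree as ET


def kerberos_enabled(core_site_xml: str) -> bool:
    """Return True if the given core-site.xml text selects Kerberos authentication."""
    root = ET.fromstring(core_site_xml)
    # Collect every <property> as a name -> value mapping.
    props = {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }
    # An unset property is treated as "simple", i.e. Kerberos is off.
    return props.get("hadoop.security.authentication", "simple").lower() == "kerberos"


# Hypothetical sample configuration used only for this illustration.
SAMPLE = """<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>"""

print(kerberos_enabled(SAMPLE))
```

A check like this only confirms that authentication is switched on; it says nothing about the access control and encryption gaps described above.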


Hadoop Clusters Consist of Many Nodes

Traditional data security technologies have been built on the concept of protecting a single physical entity (like a database or server), not the uniquely distributed big data computing environments characterized by Hadoop clusters. Traditional security technologies are not effective in this type of distributed, large-scale environment.


Traditional Backup/Disaster Recovery Isn’t the Same in Hadoop

The distributed nature of Hadoop clusters also renders many traditional backup and recovery methods and policies ineffective. Companies using Hadoop need to replicate, back up and store data in a separate, secured environment.
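
One common way to copy data to a separate cluster is Hadoop's own DistCp tool. The sketch below only assembles such a command and does not run it. The function name, hosts and paths are hypothetical placeholders, and a real backup policy would also need scheduling, verification and access restrictions on the target.

```python
import shlex
from typing import List


def build_distcp_command(source_uri: str, backup_uri: str) -> List[str]:
    """Assemble a DistCp invocation that copies source_uri to backup_uri."""
    # -update copies only files that are missing or differ at the target.
    return ["hadoop", "distcp", "-update", source_uri, backup_uri]


# Hypothetical NameNode hosts and paths, for illustration only.
cmd = build_distcp_command(
    "hdfs://prod-nn:8020/data/events",
    "hdfs://backup-nn:8020/data/events",
)
print(shlex.join(cmd))
```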


Hadoop Is Rarely Used Alone

To reap the benefits of big data, Hadoop is used in conjunction with other technologies such as Hive, HBase or Pig. While these tools make big data accessible and usable, most also lack any real enterprise-grade security. Hardening Hadoop itself is only one part of the big data security challenge.


Compliance Mandates Still Apply in Big Data Workloads

Big data doesn’t come with a separate set of regulations and mandates. Regardless of the IT used to store and manage data, enterprise organizations must still comply with regulatory requirements for data privacy and security such as HIPAA (health care), PCI DSS (payment cards) and SOX (financial reporting), even though approved traditional security technologies fail to fully address the challenges of big data environments.


Cost of a Breach Undetermined

So far, no one has been able to put an accurate number on how much a security breach can cost an organization. Without thoroughly evaluating its security coverage, an enterprise can neither assess its security weaknesses nor determine how much to spend on security coverage.


Big Data Users on Their Own With Security

With no enterprise-grade protection built in, companies running Hadoop clusters have to supply their own. Best practices include implementing additional access controls and limiting the number of personnel allowed to access the cluster.
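
To make the idea of limiting access concrete, here is a minimal sketch of an allowlist check. It assumes the access-control-list string format used by Hadoop's service-level authorization (a comma-separated list of users, a space, then a comma-separated list of groups, with `*` meaning everyone). The function name and all user and group names are hypothetical.

```python
def acl_allows(acl: str, user: str, groups: set) -> bool:
    """Return True if the user, or any of the user's groups, appears in the ACL string."""
    if acl.strip() == "*":
        return True  # wildcard: everyone is allowed
    # The first space separates the user list from the group list.
    users_part, _, groups_part = acl.partition(" ")
    allowed_users = {u for u in users_part.split(",") if u}
    allowed_groups = {g for g in groups_part.strip().split(",") if g}
    return user in allowed_users or bool(allowed_groups & set(groups))


# Hypothetical ACL: two named users plus one administrators group.
ACL = "alice,bob hadoop-admins"
print(acl_allows(ACL, "alice", set()))
print(acl_allows(ACL, "carol", {"analysts"}))
```

Keeping the list short and explicit, rather than relying on the wildcard, is what limits the number of people who can reach the cluster.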


Additional Steps Needed to Protect Data Cluster

Users will remain on their own until IT that addresses the vulnerabilities of a Hadoop environment is made available. In the meantime, organizations must regularly scan their cluster environment for vulnerabilities. They also must make it a best practice to replicate and back up data while storing it in a separate secured environment.


Hadoop Users Must Keep Up to Date

As large-scale batch processing becomes mainstream in the enterprise, new IT designed to make big data workloads more useful for businesses is coming out all the time, from established companies as well as startups. Best practices for IT managers should always include regular visits to sites such as eWEEK, which covers all the relevant sectors of big data IT: security, storage, servers and data center systems as a whole.
