Organizations concerned about which employees can access data within their Hadoop clusters can now use SHadoop from Zettaset to create baseline policies.
Zettaset unveiled SHadoop, a new security initiative that would help organizations
manage and control employee access to data in Hadoop environments.
The initiative will integrate security functions that would allow administrators to implement
role-based access control within the company's Hadoop Orchestrator platform,
Zettaset said Feb. 22. With the new tools, administrators can create policies
that specify what users can or cannot do with data on a Hadoop platform.
Policies are defined using the user's group and role. Users can be restricted from
executing certain jobs if they aren't classified in a specific category. Users
can also be prevented from importing or exporting certain types of data,
Zettaset said. With SHadoop, administrators can track, log and audit all user
and group activity.
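The policy model described above can be illustrated with a minimal sketch. This is not Zettaset's API, which the article does not describe; the `User` and `Policy` classes, the action names and the example groups are all hypothetical, chosen only to show how permissions keyed on group and role, combined with an audit log, might fit together.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    # Hypothetical user record: a name plus the group and role used for policy lookup.
    name: str
    group: str
    role: str


@dataclass
class Policy:
    # Maps a (group, role) pair to the set of actions that pair may perform.
    allowed: dict = field(default_factory=dict)
    # Every access decision is recorded here so activity can be audited later.
    audit_log: list = field(default_factory=list)

    def allow(self, group, role, *actions):
        """Grant one or more actions to everyone with this group and role."""
        self.allowed.setdefault((group, role), set()).update(actions)

    def check(self, user, action):
        """Return True if the user may perform the action; log the decision either way."""
        permitted = action in self.allowed.get((user.group, user.role), set())
        self.audit_log.append((user.name, user.group, action, permitted))
        return permitted


# Example policy (assumed names): analysts may run jobs, admins may also move data.
policy = Policy()
policy.allow("analytics", "analyst", "run_job")
policy.allow("analytics", "admin", "run_job", "import_data", "export_data")

alice = User("alice", "analytics", "analyst")
bob = User("bob", "analytics", "admin")
```

In this sketch, `policy.check(alice, "export_data")` is denied because the analyst role was never granted that action, and the denial is still appended to `audit_log`, mirroring the article's point that all user and group activity can be tracked.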
Hadoop is the
best solution for handling big data and is constantly improving, but it hasn't
focused on security, said Brian Christian, Zettaset's CTO.
Organizations use the Apache Hadoop Distributed File System to store and manage petabytes and
petabytes of data from disparate data sources. Hadoop is far more efficient at
aggregating and organizing structured and unstructured data than traditional
relational database management systems. Many organizations rely on Hadoop to
aggregate and analyze data collected from Websites, social media, emails, audio
and video, and data gathered from sensors. Big data is used for a broad range
of purposes, including fraud detection and security analysis.
Social networking giant Facebook is among the many companies that use Hadoop.
At the moment,
Hadoop does not have many built-in controls beyond access control lists and
Kerberos-based authentication. The SHadoop layer would "mitigate known
architectural and input validation issues" and improve user role audit
tracking and user-level security, Zettaset said.
Future versions of SHadoop will include a way for organizations to encrypt the data stored in a
Hadoop cluster or as it is being transmitted between Hadoop nodes, according
to Zettaset. The tools are designed to make it easier and more affordable for small and midsized
businesses to use big data, the company said. Even though open-source versions
of Hadoop are freely available, many organizations shy away from using them because
the task of managing the clusters is still challenging.
In fact, a
recent study sponsored by LogLogic found that many IT professionals are still
very confused about what big data is, and how to work with it. About 38 percent
of survey respondents said they do not have a clear understanding of what big
data is, and nearly half of those respondents were somewhat or very concerned
about managing big data.
More than half
the survey participants, about 59 percent, said they lacked the tools required to
effectively manage data from their IT systems, LogLogic said.
is power, and big data if managed properly can provide a ton of insight to help
deal with security, operational and compliance issues, said Mandeep Khera,
chief marketing officer of LogLogic.