2. Promote Collaboration Around Performance
When there’s a performance problem on Hadoop, there can be several culprits: your code, your data, your hardware or the way you’re sharing resources (your cluster configuration). At a startup, a single person—data scientist, developer and operator all rolled into one—might be responsible for all that. But at a large enterprise, multiple teams have to cooperate to figure out what went wrong and how to fix it. If you’re managing a big data operation at a large, distributed organization, nurture collaboration by giving your team tools that let developers, operators and managers work together to address performance issues.
3. Make It Easy to Share Application Context Around Errors
4. Monitor the Fleet, Not the Vehicle
To an operator running hundreds or thousands of applications on a Hadoop cluster, all of them look the same—until there’s a problem. So you need tools that let you look at performance over groups of applications. Ideally, you should be able to segment performance tracking by application types, departments, teams and data-sensitivity levels.
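Segmenting fleet metrics by tags can be sketched in a few lines. This is a minimal illustration, not a real monitoring API: the record fields (`team`, `type`, `runtime_s`) and sample values are assumptions chosen to show the grouping idea.

```python
from collections import defaultdict

# Hypothetical per-application metric records; field names are illustrative.
apps = [
    {"app_id": "a1", "team": "fraud",  "type": "etl",   "runtime_s": 420},
    {"app_id": "a2", "team": "fraud",  "type": "etl",   "runtime_s": 480},
    {"app_id": "a3", "team": "retail", "type": "adhoc", "runtime_s": 95},
]

def segment_avg_runtime(apps, key):
    """Average runtime per segment (team, application type, etc.)."""
    totals = defaultdict(lambda: [0, 0])  # segment -> [sum, count]
    for app in apps:
        acc = totals[app[key]]
        acc[0] += app["runtime_s"]
        acc[1] += 1
    return {seg: total / count for seg, (total, count) in totals.items()}

print(segment_avg_runtime(apps, "team"))
# {'fraud': 450.0, 'retail': 95.0}
```

The same function works for any tag dimension: pass `"type"` instead of `"team"` to roll up by application type, or add a `sensitivity` field to each record and segment on that.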
5. Define and Enforce Service-Level Agreements
Monitoring a fleet still means knowing when an individual vehicle performs poorly. Similarly, operators need to set SLA bounds on performance and define alerts and escalation paths for when they’re violated. SLA bounds should incorporate both raw metadata, such as job status, and business-level events, such as sensitive data access. Successful practitioners of operational readiness also set up metrics that help predict future SLA violations, so they can proactively address and avoid them.
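An SLA check over both kinds of signal might look like the following sketch. The rule names, thresholds, and job-record fields are all assumptions for illustration; a real system would pull these from the scheduler and an audit log.

```python
# Illustrative SLA rules: thresholds and field names are assumptions.
SLA = {"max_runtime_s": 3600, "allowed_statuses": {"SUCCEEDED"}}

def sla_violations(job):
    """Return violation messages for one job record, mixing raw job
    metadata (status, runtime) with business-level events (data access)."""
    violations = []
    if job["runtime_s"] > SLA["max_runtime_s"]:
        violations.append(
            f"runtime {job['runtime_s']}s exceeds {SLA['max_runtime_s']}s")
    if job["status"] not in SLA["allowed_statuses"]:
        violations.append(f"status {job['status']} violates SLA")
    if job.get("accessed_sensitive_data") and not job.get("access_approved"):
        violations.append("unapproved access to sensitive data")
    return violations

job = {"runtime_s": 4000, "status": "SUCCEEDED",
       "accessed_sensitive_data": True}
for v in sla_violations(job):
    print("ALERT:", v)  # feeds an alerting / escalation system
```

Predictive metrics fit the same shape: track runtime trends per application and alert when a job is on pace to cross `max_runtime_s`, before the SLA is actually breached.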
6. Understand Inter-App Dependencies
Large, traditional enterprises tend to run their Hadoop clusters as a shared service across many lines of business. As a result, each application has at least a few “roommates” in the cluster, some of which can be detrimental to its own performance. To understand the errant behavior of one Hadoop application, operators must understand what others were doing on the cluster when it ran. Therefore, provide your operations team with as much cluster-related context as possible.
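Reconstructing that context starts with a simple question: which applications overlapped the problem job's run window? A minimal interval-overlap sketch, with hypothetical run records:

```python
def concurrent_apps(target, runs):
    """Return IDs of apps whose [start, end) window overlaps the target's run.
    Two intervals overlap iff each starts before the other ends."""
    return [
        r["app_id"] for r in runs
        if r["app_id"] != target["app_id"]
        and r["start"] < target["end"]
        and target["start"] < r["end"]
    ]

# Illustrative run records (timestamps in minutes for readability).
runs = [
    {"app_id": "etl-1",   "start": 0,  "end": 60},
    {"app_id": "adhoc-2", "start": 30, "end": 90},
    {"app_id": "ml-3",    "start": 70, "end": 120},
]
print(concurrent_apps(runs[0], runs))  # ['adhoc-2']
```

Joining those concurrent app IDs against their resource usage is what turns "my job was slow at 2 a.m." into "my job was slow because a noisy roommate held most of the cluster at 2 a.m."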
7. Ration Your Cluster
To optimize cluster use and ROI, operators must ration resources on the cluster and enforce the limits. An operator can budget mappers for the execution of a particular application, and if the application doesn’t stay within those limits, rationing rules should prevent it from being deployed. Establishing and enforcing the rules for rationing cluster resources is vital for achieving meaningful operational readiness and meeting SLA commitments.
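The enforcement side of rationing reduces to an admission check at deploy time. A minimal sketch, assuming per-team mapper budgets (the budget numbers and team names are invented for illustration):

```python
# Hypothetical per-team mapper budgets; real budgets would come from
# scheduler configuration (e.g. queue capacities).
BUDGETS = {"fraud": 200, "retail": 50}

def can_deploy(team, requested_mappers, in_use):
    """Admit a deployment only if the team's total mapper usage,
    including this request, stays within its budget."""
    budget = BUDGETS.get(team, 0)  # unknown teams get no capacity
    return in_use + requested_mappers <= budget

print(can_deploy("retail", requested_mappers=30, in_use=10))  # True
print(can_deploy("retail", requested_mappers=60, in_use=10))  # False
```

In practice this is the role capacity-aware schedulers play: the check runs automatically on every submission rather than relying on teams to police themselves.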
8. Trace Data Access at the Operational Level
Good Hadoop management isn’t only about rationing compute resources; it also means regulating access to sensitive data, especially in industries with heightened privacy concerns like health care, insurance and financial services. Solving for data lineage and governance in an unstructured environment like Hadoop is difficult. Traditional techniques of manually maintaining a metadata dictionary quickly lead to stale repositories, and they offer no way to prove that a production dataset is dependent on some fields and not on others. As a result, visibility and enforcement on the use of data fields are required at the operational level. If you can reliably track if and when a data field is accessed by an app, your compliance teams will be happy.
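One way to picture field-level tracking is to instrument record access itself, so the audit trail records which fields an application actually read. This is a toy sketch of the idea, not a production governance tool; the class and field names are illustrative.

```python
class AuditedRecord(dict):
    """Dict wrapper that logs every field an application reads,
    providing operational-level proof of which fields were touched."""
    access_log = []

    def __getitem__(self, field):
        AuditedRecord.access_log.append(field)
        return super().__getitem__(field)

record = AuditedRecord({"name": "Ada", "ssn": "000-00-0000", "zip": "94105"})
_ = record["name"]
_ = record["zip"]
print(AuditedRecord.access_log)  # ['name', 'zip'] -- 'ssn' was never read
```

That last line is exactly the evidence compliance teams want: a positive record that the production job depended on `name` and `zip` and never touched `ssn`, without anyone hand-maintaining a metadata dictionary.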
9. Record Data Misfires
Compliance professionals at large enterprises also want proof that a Hadoop application processed every record in a dataset, and they look for documentation when it fails to do so. Failures can result from format changes in upstream data sets or plain old data corruption. Keeping track of all records that the application failed to process is particularly vital in regulated industries.
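Capturing misfires means refusing to let a bad record fail silently or kill the whole job: every record either processes or lands in an audit list with its reason. A minimal sketch (the processing step here, parsing integers, is a stand-in for real record handling):

```python
def process_all(records, process):
    """Process every record; collect failures with their positions and
    reasons so compliance can see exactly what was skipped and why."""
    processed, misfires = [], []
    for i, rec in enumerate(records):
        try:
            processed.append(process(rec))
        except Exception as exc:  # e.g. upstream format change, corruption
            misfires.append({"index": i, "record": rec, "error": str(exc)})
    return processed, misfires

# Stand-in data: the third record simulates a corrupted upstream value.
ok, bad = process_all(["42", "17", "not-a-number"], int)
print(len(ok), len(bad))  # 2 1
print(bad[0]["record"])   # not-a-number
```

The misfire list doubles as documentation: counts per job prove completeness when they're zero, and point straight at the offending records when they're not.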
10. Tune Your Engine Before You Replace It
With new compute fabrics emerging all the time, teams are sometimes too quick to junk their old ones in pursuit of better performance. However, it’s often the case that you can achieve equal or greater performance gains just by optimizing code and data flows on your existing fabrics. That way, you can avoid expensive infrastructure upgrades unless they’re truly necessary.