Analytics solution provider SAS introduces Hadoop support for customers working with big data.
Analytics software provider SAS
has increased access to big data sources with Hadoop
support in its updated SAS Enterprise Data Integration Server.
By employing
the popular open-source data architecture, customers using analytics from SAS
can increase the value of big data assets.
Hadoop joins
more than three dozen supported data sources in SAS Enterprise Data Integration Server, including
Oracle, DB2, SQL Server, Teradata (including Teradata Aster), Sybase, Netezza,
EMC Greenplum and MySQL. SAS support for Hadoop access is a key requirement for
many organizations that are adding Hadoop to their environment. These
enterprises include Macys.com, SAS officials said.
"Hadoop
is facilitating big data analytics at Macys.com as our data assets continue to
grow exponentially," Kerem Tomak, vice president of marketing analytics at
Macys.com, said in a statement. "SAS Hadoop support will let us fully leverage
our analytics talent, our data and our long-term investment in SAS. SAS with
Hadoop is critical to our big data plan."
Sponsored by
the Apache Software Foundation, Hadoop is an open-source Java-based framework
for processing large data sets in a distributed computing environment. SAS
integrates with the Apache Hadoop distribution.
SAS' deep
integration with Hadoop applies the parallelism of MapReduce, the distributed
computing framework commonly associated with Hadoop, the company said. SAS,
Hadoop and the Hive data warehouse infrastructure work well together for
analyzing large data sets and simplify the most common big data analytic use
cases, the company said.
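For readers unfamiliar with the MapReduce pattern mentioned above, the following is a minimal, single-machine sketch in Python. It is not SAS or Hadoop code: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a local thread pool merely stands in for the worker nodes of a cluster. It shows only the general idea of mapping chunks of data independently and then reducing the grouped results.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle step: group all emitted values by key across every chunk."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    """Run the map step on each chunk concurrently, then shuffle and reduce.

    Assumption: threads stand in for cluster nodes purely for illustration.
    """
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_phase, chunks))
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    # Two small "chunks" play the role of data blocks spread across a cluster.
    print(word_count(["big data big value", "data at scale"]))
```

In a real Hadoop deployment the chunks would be blocks stored in HDFS and the map and reduce steps would run on separate machines; the sketch only mirrors the shape of the computation.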
The SAS Hadoop
integration means SAS "write-once, run-anywhere" extends to Hadoop
deployments. Also, SAS features, such as job flow builder, visual editor, syntax
checker and others, are extended to Hive, Pig, MapReduce and Hadoop Distributed
File System (HDFS) commands. In addition, SAS augments native Hadoop security
with SAS data security provisions, including authorization and data lineage.
And SAS supports popular Hadoop distributions, such as Cloudera, Hortonworks
and EMC Greenplum.
Moreover, SAS
data quality and profiling cover data moving in or out of Hadoop. SAS access
extends SAS capabilities, such as visual analytics explorer, text mining and
analytics, to Hadoop data. And Hadoop data can be federated along with data from
other sources, and the federated query can be embedded in a data
management job flow.
"Hadoop
is becoming more important as more organizations evaluate its capabilities and
plan for increased deployment," said Jim Davis, senior vice president and
chief marketing officer at SAS, in a statement. "Bringing powerful SAS Analytics
to Hadoop takes advantage of its distributed processing capabilities and helps
effectively manage Hadoop deployments.
"Hadoop
lacks good tools to develop and manage complex deployments. SAS' extensive data
and analytics management software helps enterprises pull value from Hadoop
deployments using minimal resources," added Davis.
"Hadoop's
value is in taking very large data collections (from simple, regular data to
complex, unstructured data) and [processing] it quickly," Carl Olofson, IDC
research vice president for application development and deployment, said in a
statement. "IDC expects commercial use of Hadoop to accelerate as more
established enterprise software providers such as SAS make Hadoop accessible
and easy to use."
Meanwhile, SAS
Information Management will deliver greater support for big data, data
governance, master data management and decision management this year. Advanced
analytic enablement will grow as analytic processing increasingly moves into
databases, SAS officials said.
"SAS
Information Management enables customers to exploit and govern information
assets, resulting in competitive differentiation and sustained business
success," said Mark Troester, a SAS IT/CIO strategist. "SAS
Information Management uniquely integrates management of data, analytics and
decision processes across the entire information continuum."
SAS
Information Management delivers data management, including data governance, data
integration, data quality and master data management. It also delivers analytics
management and decision management.
Darryl K. Taft covers the development tools and developer-related issues beat from his office in Baltimore. He has more than 10 years of experience in the business and is always looking for the next scoop. Taft is a member of the Association for Computing Machinery (ACM) and was named 'one of the most active middleware reporters in the world' by The Middleware Co. He also has his own card in the 'Who's Who in Enterprise Java' deck.