By employing the popular open-source data architecture, customers using analytics from SAS can increase the value of big data assets.
Hadoop joins more than three dozen supported data sources in SAS Enterprise Data Integration Server, including Oracle, DB2, SQL Server, Teradata (including Teradata Aster), Sybase, Netezza, EMC Greenplum and MySQL. SAS support for Hadoop access is a key requirement for many organizations that are adding Hadoop to their environment. These enterprises include Macys.com, SAS officials said.
“Hadoop is facilitating big data analytics at Macys.com as our data assets continue to grow exponentially,” Kerem Tomak, vice president of marketing analytics at Macys.com, said in a statement. “SAS Hadoop support will let us fully leverage our analytics talent, our data and our long-term investment in SAS. SAS with Hadoop is critical to our big data plan.”
Sponsored by the Apache Software Foundation, Hadoop is an open-source Java-based framework for processing large data sets in a distributed computing environment. SAS integrates with the Apache Hadoop distribution.
SAS’ deep integration with Hadoop applies the parallelism of MapReduce, the distributed computing framework commonly associated with Hadoop, the company said. SAS, Hadoop and the Hive data warehouse infrastructure work well together in analyzing large data sets, simplifying the most common big data analysis and analytic use cases, SAS officials added.
The SAS Hadoop integration means SAS “write-once, run-anywhere” extends to Hadoop deployments. Also, SAS features, such as job flow builder, visual editor, syntax checker and others, are extended to Hive, Pig, MapReduce and Hadoop Distributed File System (HDFS) commands. In addition, SAS augments native Hadoop security with SAS data security provisions, including authorization and data lineage. And SAS supports popular Hadoop distributions, such as Cloudera, Hortonworks and EMC Greenplum.
Moreover, SAS data quality and profiling cover data moving in or out of Hadoop. SAS access extends SAS capabilities, such as visual analytics explorer, text mining and analytics, to Hadoop data. And Hadoop data can be federated along with data from other sources, and the federated query can be embedded in a data management job flow.
“Hadoop is becoming more important as more organizations evaluate its capabilities and plan for increased deployment,” said Jim Davis, senior vice president and chief marketing officer at SAS, in a statement. “Bringing powerful SAS Analytics to Hadoop takes advantage of its distributed processing capabilities and helps effectively manage Hadoop deployments.
“Hadoop lacks good tools to develop and manage complex deployments. SAS’ extensive data and analytics management software helps enterprises pull value from Hadoop deployments using minimal resources,” added Davis.
“Hadoop’s value is in taking very large data collections, from simple, regular data to complex, unstructured data, and [processing] it quickly,” Carl Olofson, IDC research vice president for application development and deployment, said in a statement. “IDC expects commercial use of Hadoop to accelerate as more established enterprise software providers such as SAS make Hadoop accessible and easy to use.”
Meanwhile, SAS Information Management will deliver greater support for big data, data governance, master data management and decision management this year. Advanced analytic enablement will grow as analytic processing increasingly moves into databases, SAS officials said.
“SAS Information Management enables customers to exploit and govern information assets, resulting in competitive differentiation and sustained business success,” said Mark Troester, a SAS IT/CIO strategist. “SAS Information Management uniquely integrates management of data, analytics and decision processes across the entire information continuum.”
SAS Information Management delivers data management, including data governance, data integration, data quality and master data management. It also delivers analytics management and decision management.