Syncsort Delivers Native Mainframe Hadoop, Spark Data

Syncsort simplifies mainframe big data access for enterprises seeking governance and compliance in Apache Hadoop and Apache Spark data.


Syncsort, a provider of big data and mainframe software, has upgraded its DMX-h data integration software to enable enterprise organizations to work with mainframe data in Hadoop or Spark in its native format.

Syncsort delivered the new capabilities because some of its large enterprise customers—particularly those in financial services, banking, insurance and health care—needed to maintain their mainframe data in its native format for compliance purposes, the company said.

Tendu Yogurtcu, general manager of Syncsort's big data business, told eWEEK that while many of Syncsort's large enterprise customers want the scalability and cost benefits of Hadoop and Spark for their mainframe data, converting that data for the big data platforms presents compliance challenges because they are required to preserve the data in its original EBCDIC format.

"With this announcement we are basically saying that many of our use cases, which are based on our collaboration with very large customers in financial services, banking and insurance, where regulatory compliance is very critical, need access to their mainframe data in native format," Yogurtcu said. "They want to get access to this mainframe data. However, changing the data format can cause governance and compliance issues. This new feature involves making Hadoop understand this EBCDIC encoded mainframe record format."

The mainframe data has to remain in mainframe format for audit purposes or for archival purposes, she said.
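To make the compliance concern concrete, here is a minimal Python sketch of the general idea of reading EBCDIC records without altering the stored bytes. It is not Syncsort's implementation and does not use DMX-h. The 16-byte record layout, the field names and the helper functions are hypothetical, and cp037 (Python's built-in EBCDIC US/Canada codec) is assumed as the code page.

```python
# Illustrative sketch only: the record layout and names below are hypothetical,
# and cp037 is assumed as the EBCDIC code page.
RECORD_LENGTH = 16  # assumed layout: 10-byte name followed by 6-byte account id


def split_records(raw: bytes, length: int = RECORD_LENGTH) -> list[bytes]:
    """Slice a fixed-length-record file into records without changing any byte."""
    if len(raw) % length != 0:
        raise ValueError("data is not a whole number of records")
    return [raw[i:i + length] for i in range(0, len(raw), length)]


def read_view(record: bytes) -> dict[str, str]:
    """Decode a record for analysis; the original bytes are left as they were."""
    text = record.decode("cp037")
    return {"name": text[:10].rstrip(), "account": text[10:16]}


# Sample data built here for the demo; real data would come from the mainframe.
sample = "JANE DOE  004217".encode("cp037") + "JOHN ROE  000093".encode("cp037")

records = split_records(sample)            # stored copy stays in EBCDIC
views = [read_view(r) for r in records]    # decoded only when read

# The archived bytes are still identical to what the mainframe produced.
assert b"".join(records) == sample
```

The design point is that decoding happens at read time, so the archived copy remains byte-for-byte what the mainframe produced, which is the property auditors care about.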

Yogurtcu said that last summer Syncsort open-sourced some Apache Spark packages and mainframe connectors to make mainframe data available for interactive queries through Spark SQL. "And all of these moves, until now, required that that mainframe data be EBCDIC encoded in mainframe-specific format to be translated into something an open system can understand," she said.

The technology Syncsort open-sourced is an IBM z Systems mainframe connector for Apache Spark. The contribution enables enterprises to access their mainframe data and gain new insights from it using Apache Spark's analytics capabilities and Spark SQL. Yogurtcu said Syncsort is betting that Spark will play a major role in next-generation use cases, including the Internet of Things. That bet adds to the company's push to transform mainframe data into a format that Spark can easily understand. Syncsort's mainframe connector for Spark is similar to the Apache Sqoop mainframe connector that Syncsort released as open source in 2014.

Moreover, based on the results of a survey conducted in January of this year, Syncsort identified Spark as one of the key trends for 2016. According to the survey, nearly 70 percent of respondents said they are most interested in Apache Spark. Interest in Spark surpassed interest in all other compute frameworks, including the recognized incumbent, MapReduce, which was cited by 55 percent of respondents.