Syncsort's Hadoop ETL Solutions Provide Simplified Data Integration

By Darryl K. Taft  |  Posted 2013-05-24

Similarly, DMX-h ETL Edition achieved sustainable throughput in excess of 255MB per second per node, for up to 2.5 times faster performance than Pig when aggregating 2TB of Web log data. In both cases, tests were run on data volumes ranging from 500GB to 2TB. While alternatives such as Hadoop's native sort and Pig reach a saturation point, where throughput starts to decline, at around 500GB of data, DMX-h delivered sustainable and predictable performance from 500GB to 2TB, Syncsort said. This has major implications for organizations, as they can more efficiently size their Hadoop infrastructure, minimize uncertainty and achieve a more predictable cost structure as their big data becomes even bigger.

"Hadoop is lowering the cost structure of processing data at scale, but deploying Hadoop at the enterprise level is not free, and significant hardware and IT productivity costs can damage ROI," ESG's Quinn said in a statement. "Syncsort's Spring '13 release provides unique capabilities in Hadoop to help maximize savings, delivering best-in-class ETL technology at a price point that is highly disruptive for the data integration market, and more consistent with the cost structure of open-source solutions."

Meanwhile, TagMan has a marketing data platform that provides a software-as-a-service (SaaS) solution to help e-commerce ventures track their marketing campaigns and get the full picture of their marketing effectiveness. The company improves visibility and reduces maintenance by managing vendor marketing tags through a single container tag that gathers all the information the advertiser wants to track, and by tying the collected marketing data together with other data to provide insights into the effectiveness of different campaigns across the full path to conversion.

The TagMan data management production environment is currently a hybrid of an in-house developed data collection system and a traditional SQL database reporting system. However, TagMan is looking to Hadoop, which it has implemented in parallel, for horizontal scalability (the ability to add nodes) and for the flexibility to add new data points it wants to analyze.

Ultimately, Hadoop allows the company to handle more data more easily and efficiently when collecting massive amounts of data in real time. This lets TagMan create actionable intelligence from its growing volume of big data and lets its clients make minute-by-minute marketing decisions, such as real-time bidding and search optimization. TagMan sees Syncsort's DMX-h ETL Edition as a fit with its Hadoop plans because the toolset makes it easier to anticipate and handle the required MapReduce processing for data collection and distribution. Company officials said that when they know how they want to slice the data, Syncsort can make it easy to do.

"In tag management, we facilitate a huge number of interactions between marketers and their vendors, and as a result, we are able to see the complex journey a consumer takes prior to making a purchase," said Ave Wrigley, CTO of TagMan, in a statement. "This involves a huge amount of data processing. To be competitive, we must convert the high volume of 'path-to-purchase' data captured by our platform into actionable intelligence that drives decisions by both marketers and their vendors. What's compelling about Syncsort's latest DMX product deliveries is the unique approach to replacing older code-driven approaches with a streamlined, GUI-driven way to collect, cleanse and distribute information inside and outside Hadoop, saving time and resources and giving us maximum flexibility in preparing big data for business analytics and data visualization."

Users looking to leverage DMX-h ETL can download a free test drive that contains everything they require without the need to set up their own Hadoop cluster. It includes a Linux Virtual Machine with Cloudera CDH 4.2 and DMX-h ETL Edition preinstalled, along with use case accelerators and sample data.


