    Syncsort’s Hadoop ETL Solutions Provide Simplified Data Integration

    By Darryl K. Taft - May 24, 2013

      Syncsort, a provider of big data integration and protection solutions, recently announced the availability of its Spring ’13 release, including two brand-new Hadoop products and enhancements to its DMX technology that turn Hadoop into an easy-to-use extract, transform and load (ETL) solution.

      Big data is prompting organizations to look at Hadoop to process more data in less time and for less money, but Hadoop is not yet a complete ETL solution. Syncsort’s two new offerings for Hadoop, DMX-h ETL Edition and DMX-h Sort Edition, are designed to strengthen Hadoop by providing the full functionality required to deliver enterprise ETL capabilities. Syncsort says they provide greater ease of use and maximize node performance compared with non-native, code-generating ETL tools. In addition, performance and connectivity enhancements to DMX expand usage by end users and partners.

      “Analyzing big data is critical to our customers’ ability to sustain competitiveness, but the avalanche of information is breaking traditional data integration architectures—many of the tools are too code- and resource-intensive and ultimately drive costs too high,” said Josh Rogers, senior vice president of the data integration business at Syncsort, in a statement. “With our new DMX editions, we are strengthening Hadoop by providing seamless and powerful ETL and sort capabilities and at the same time, reinvigorating the value proposition of ETL by leveraging the power of Hadoop to scale core processing of big data.”

      “Based on the evidence I have gathered talking with customers and in-the-weeds big data consultants, claims that Hadoop, and some non-Hadoop big data solutions, eliminate the need for ETL are patently false,” wrote analyst Evan Quinn in a post on the Enterprise Strategy Group (ESG) blog. “Nothing solves data prep and understanding challenges like ETL. ETL forces the data analyst to dig into the details of all the raw data, and conceptualize what a perfect data set for analytics would look like—and this exercise also helps the data analyst determine the analytical possibilities. … Thus, it should also come as no surprise that ETL has thus far proven to be one of the most popular applications of Hadoop, and, if anything, ESG sees Hadoop-based ETL continuing to grow its fan base.”

      Moreover, Quinn added, “Syncsort DMX-h ETL Edition will help Hadoopists take a big data step forward in terms of ETL ease of development and performance.”

      “Cloudera sees ETL as one of the top use cases for Hadoop—it is essential to our mission of maximizing the value of big data,” Amr Awadallah, chief technology officer at Cloudera, said in a statement. “We see Syncsort’s new DMX-h offerings enabling our mutual customers with critical data integration and ETL capabilities which simplify ETL deployments while efficiently processing data natively on Hadoop. The CDH 4.2 release includes Syncsort’s contribution to Apache Hadoop making the sort phase pluggable, enabling DMX-h, and broadening use cases on Hadoop.”

      The new DMX-h solutions take advantage of Syncsort’s recent contribution to Apache Hadoop, which provides a unique level of native integration to deliver best-in-class data integration capabilities and Sort acceleration for Apache Hadoop distributions.

      Highlights of DMX-h ETL Edition include an ETL engine that runs natively within MapReduce, maximizing node performance. It also provides Hadoop ETL without coding: developers can work in an easy-to-use Windows GUI and deploy seamlessly into Hadoop. In addition, it provides “use case accelerators,” essentially a library of pre-built templates that helps developers fast-track Hadoop ETL implementations, and it extends access to and delivery of all data, including data from the mainframe.

      Recent Syncsort benchmarks show significant Hadoop performance and resource efficiency improvements when using DMX-h. The results show very predictable and sustainable throughput even as data volumes grow. Using the TeraSort benchmark, DMX-h Sort Edition achieved a sustainable throughput of more than 100MB per second per node, delivering upwards of two times higher throughput per node than Hadoop’s native sort at 45MB per second per node.

      Similarly, DMX-h ETL Edition achieved sustainable throughput in excess of 255MB per second per node, up to 2.5 times faster than Pig, when aggregating 2TB of Web log data. In both cases, tests were run on data volumes ranging from 500GB to 2TB. While alternatives such as Hadoop’s native sort and Pig reach a saturation point, where throughput starts to decline, at around 500GB of data, DMX-h delivered sustainable and predictable performance from 500GB to 2TB, Syncsort said. This has major implications for organizations, as they can more efficiently size their Hadoop infrastructure, minimize uncertainty and achieve a more predictable cost structure as their big data becomes even bigger.
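To put the quoted sort figures in perspective, the short Python sketch below works through the arithmetic. It uses only the per-node rates reported above (100MB and 45MB per second); the 10-node cluster size, the decimal TB-to-MB conversion and the assumption of perfectly linear scaling across nodes are illustrative choices, not details from Syncsort's benchmark.

```python
# Back-of-envelope arithmetic on the per-node throughput figures quoted above.
# The 10-node cluster size is a hypothetical value chosen only for illustration.

MB_PER_TB = 1_000_000  # decimal units assumed; the article does not say which


def speedup(throughput_mb_s: float, baseline_mb_s: float) -> float:
    """Ratio of two per-node throughputs."""
    return throughput_mb_s / baseline_mb_s


def hours_to_process(data_tb: float, nodes: int, mb_s_per_node: float) -> float:
    """Wall-clock hours to stream data_tb through a cluster, assuming
    perfectly linear scaling across nodes (an idealization)."""
    total_mb = data_tb * MB_PER_TB
    return total_mb / (nodes * mb_s_per_node) / 3600


if __name__ == "__main__":
    print(f"Sort speedup: {speedup(100, 45):.2f}x")
    for label, rate in [("native sort", 45), ("DMX-h Sort Edition", 100)]:
        hours = hours_to_process(2, 10, rate)
        print(f"{label}: {hours:.2f} h for 2TB on 10 nodes")
```

Because the native sort's throughput reportedly declines beyond roughly 500GB, a constant 45MB per second would, if anything, understate its running time at 2TB.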

      “Hadoop is lowering the cost structure of processing data at scale, but deploying Hadoop at the enterprise level is not free, and significant hardware and IT productivity costs can damage ROI,” ESG’s Quinn said in a statement. “Syncsort’s Spring ’13 release provides unique capabilities in Hadoop to help maximize savings, delivering best-in-class ETL technology at a price point that is highly disruptive for the data integration market, and more consistent with the cost structure of open-source solutions.”

      Meanwhile, TagMan offers a marketing data platform, delivered as software as a service (SaaS), that helps e-commerce ventures track their marketing campaigns and get the full picture of their marketing effectiveness. The company improves visibility and reduces maintenance by managing vendor marketing tags through a single container tag that gathers all the information the advertiser wants to track, then ties the collected marketing data together with other data to provide insights into the effectiveness of different campaigns along the full path to conversion.

      The TagMan data management production environment is currently a hybrid of an in-house developed data collection system and a traditional SQL database reporting system. However, TagMan has implemented Hadoop in parallel and is looking to its horizontal scalability, the ability to add nodes, for the flexibility to add new data points they want to analyze.

      Ultimately, Hadoop allows them to handle more data more easily and efficiently when collecting massive amounts of data in real time. This lets them create actionable intelligence from growing volumes of big data and enables their clients to make minute-by-minute marketing decisions, such as real-time bidding and search optimization. TagMan sees Syncsort’s DMX-h ETL Edition as a fit with their Hadoop plans because the toolset makes it easier to anticipate and handle the required MapReduce processing, namely data collection and distribution. Company officials said that when they know how they want to slice the data, Syncsort can make it easy to do.

      “In tag management, we facilitate a huge number of interactions between marketers and their vendors, and as a result, we are able to see the complex journey a consumer takes prior to making a purchase,” said Ave Wrigley, CTO of TagMan, in a statement. “This involves a huge amount of data processing. To be competitive, we must convert the high volume of ‘path-to-purchase’ data captured by our platform into actionable intelligence that drives decisions by both marketers and their vendors. What’s compelling about Syncsort’s latest DMX product deliveries is the unique approach to replacing older code-driven approaches with a streamlined, GUI-driven way to collect, cleanse and distribute information inside and outside Hadoop, saving time and resources and giving us maximum flexibility in preparing big data for business analytics and data visualization.”

      Users looking to leverage DMX-h ETL can download a free test drive that contains everything they require without the need to set up their own Hadoop cluster. It includes a Linux Virtual Machine with Cloudera CDH 4.2 and DMX-h ETL Edition preinstalled, along with use case accelerators and sample data.

      Darryl K. Taft
      Darryl K. Taft covers the development tools and developer-related issues beat from his office in Baltimore. He has more than 10 years of experience in the business and is always looking for the next scoop. Taft is a member of the Association for Computing Machinery (ACM) and was named 'one of the most active middleware reporters in the world' by The Middleware Co. He also has his own card in the 'Who's Who in Enterprise Java' deck.