All data is created in one place and then moved, sometimes repeatedly, from one store to another. ETL (extract, transform and load) and data integration have long been among the thorniest problems in IT to solve efficiently. ETL software copies or moves data from one database or repository to another and ensures that the data is formatted correctly for the task at hand. ETL enables putting data to work and maximizing its value.
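The three stages are simple to picture in code. Here is a minimal sketch in Python, with invented field names and an in-memory SQLite database standing in for a production source and target:

```python
import csv
import io
import sqlite3

# Extract: read raw records from a source. An in-memory CSV stands in for a
# production database or file drop; the fields are invented for illustration.
raw = "id,amount,currency\n1,10.50,usd\n2,3.00,eur\n"
rows = list(csv.DictReader(io.StringIO(raw)))

# Transform: normalize types and formats before the data reaches the target.
for r in rows:
    r["amount"] = float(r["amount"])
    r["currency"] = r["currency"].upper()

# Load: write the cleaned records into the target repository.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER, amount REAL, currency TEXT)")
db.executemany("INSERT INTO orders VALUES (:id, :amount, :currency)", rows)
print(db.execute("SELECT currency, amount FROM orders ORDER BY id").fetchall())
```

Real ETL tools add scheduling, error handling and connectors on top of this skeleton, but the extract-transform-load sequence is the same.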
Generally, the ETL process has worked well, and it has been improved over time. Today, ETL, along with its sibling extract, load, transform (ELT), is used within increasingly complex data frameworks, including edge computing, the internet of things, connected supply chains and cloud environments. As enterprises move to more advanced business intelligence and data analytics, including systems that rely on machine learning and artificial intelligence (AI), ETL functionality is crucial.
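The difference between ETL and ELT comes down to where transformation happens. In ELT, raw data lands in the target first and the warehouse engine transforms it afterward with SQL. A minimal sketch, using SQLite as a stand-in warehouse with invented table and column names:

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Extract + Load: raw records land in the warehouse untransformed.
db.execute("CREATE TABLE raw_orders (id INTEGER, amount TEXT, currency TEXT)")
db.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)",
               [(1, "10.50", "usd"), (2, "3.00", "eur")])

# Transform: done afterward, inside the target engine, using its SQL dialect.
db.execute("""CREATE TABLE orders AS
              SELECT id,
                     CAST(amount AS REAL) AS amount,
                     UPPER(currency) AS currency
              FROM raw_orders""")
print(db.execute("SELECT currency, amount FROM orders ORDER BY id").fetchall())
```

Deferring the transform this way lets the warehouse's own compute do the heavy lifting, which is why several vendors below push transformation into the database.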
Deploying the right ETL or data integration tool is very important to the success of an IT system. It can speed data processing, provide new ways to link and use data, and trim costs and time related to manual data management processes.
Here are some of the top ETL/data integration vendors in terms of market share, along with a look at their products and how they fit into the overall ETL marketplace. eWEEK used several industry sources to assemble this list, including eWEEK reporting, Technology Advice, G2 Crowd, IT Central Station, Gartner Peer Insights, TrustRadius and Crunchbase.
Amazon Web Services (AWS)
Value Proposition for Buyers: AWS, which owns a whopping one-third of the global web services business, remains the undisputed heavyweight of cloud computing service providers. Thus it’s not a surprise that it also offers a slate of products that connect legacy systems already using ETL to the cloud. These services include: AWS Import/Export Snowball, which offers petabyte-scale data transport; AWS Glue, a dedicated managed ETL service; AWS Database Migration Service, which is designed to move entire databases; and AWS Data Pipeline, which transports data across AWS computing environments along with on-premises systems.
- AWS services offer numerous capabilities, robust data management consoles and features that line-of-business employees can use to build analytics from various AWS platforms and engines, including Redshift, S3 (Simple Storage Service) and Amazon Virtual Private Cloud (VPC), as well as from legacy mainframes and other systems. Third-party tools from vendors such as Alteryx, Informatica and Matillion also integrate with these services. AWS also offers ELT, which pushes transformation work into the database.
- AWS offers a broad menu of tools designed to address data management challenges involving the cloud. These range from straightforward ETL to more specialized software that moves massive amounts of data efficiently and cost-effectively. AWS places a heavy focus on data integrity and data security.
Informatica
Redwood City, Calif.
Value Proposition for Buyers: Informatica has long established itself as a leader in the data integration business, consistently ranking among the top vendors for data management and ETL. Its platform supports virtually all forms of data migration and transformation, including with AWS, Azure and other leading platforms and tools. It delivers a high level of automation and data validation across development, testing and production environments. The vendor has earned a Gartner Customers’ Choice award for several consecutive years.
- Informatica supports multi-cloud, on-premises and hybrid data integration in real time and batch modes. In addition, Informatica supports all major data formats and structures through native connectors, including industry-specific formats such as SWIFT, HL7 and EDI X12.
- Informatica PowerCenter supports data management and integration across the full data lifecycle, including non-relational data, and it includes strong support for security and regulatory requirements.
- The platform also supports grid computing, distributed processing, high availability, dynamic partitioning, pushdown optimization and adaptive load balancing. This produces a highly scalable and stable environment.
Pentaho (Hitachi Vantara Corp.)
Santa Clara, Calif.
Value Proposition for Buyers: Pentaho Data Integration (PDI) is well-respected for strong and reliable functionality delivered in a no-code process, which is the wave of the future. The tool handles data ingestion, blending, cleansing and preparation within a visual drag-and-drop environment. Pentaho works with all data types and formats and includes a powerful metadata injection feature that manages enterprise data at scale. It also includes a large library of pre-built components and delivers powerful orchestration capabilities that aid in coordinating and combining data.
- Pentaho addresses big data integration with a zero-coding approach that aims to eliminate manual programming and scripting. It also allows users to switch between execution engines, such as Apache Spark and Pentaho's native engine, and it supports Hadoop distributions, Spark and objects stored in NoSQL databases. This allows the platform to perform real-time data ingestion and tap IoT protocols.
- PDI includes pre-built templates and supports spot checks while data is in flight, which aids validation. It also delivers powerful orchestration capabilities along with notifications and alerts, and it includes an enterprise scheduler that coordinates workflows. In addition, the application ingests data from nearly any relational database, open source database and file format, and it connects to major business applications such as Salesforce and Google Analytics.
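In-flight spot checks of this sort can be pictured as a validation stage inserted into a streaming pipeline: records flow through, bad ones are diverted, and good ones continue to the load stage without staging the whole batch first. A generic Python sketch of the pattern (not PDI's actual mechanism; the rule and field names are invented):

```python
def validate(rows, errors):
    # In-flight check: each record is inspected as it streams past.
    # Failing records are diverted to an error channel; passing records
    # continue downstream toward the load stage.
    for row in rows:
        if row.get("amount", 0) >= 0:
            yield row
        else:
            errors.append(row)

errors = []
clean = list(validate([{"amount": 5}, {"amount": -1}, {"amount": 2}], errors))
print(len(clean), len(errors))  # 2 1
```

Because the check is a generator, nothing is buffered: validation cost is paid record by record while the data moves.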
Prague, Czech Republic
Value Proposition for Buyers: Devart’s ETL toolset, Skyvia, is a software-as-a-service data platform that also takes a no-code, wizard-based approach to integration. It’s designed to be used without special knowledge of ETL or data integration. The graphical interface includes an intuitive set of wizards, templates and editors that pull data into a cloud, where data manipulation takes place. The platform provides strong mapping tools and features along with powerful automation for bi-directional synchronization.
- Skyvia builds reports and dashboards from almost any source or format, including SQL databases, CSV files, FTP and SFTP servers, SQL Azure, Amazon RDS, Amazon S3, Dropbox, Box, Stripe, Oracle, Magento, G Suite, Google Drive, Dynamics CRM and Salesforce, to name a few. It also provides strong data export features, including robust filtering, the ability to export related object data and export scheduling.
- The bi-directional syncing capability means that all data handled by Skyvia is available for use in real time. What’s more, the platform preserves source data relationships in the target so that it can import data without creating duplicates. This makes it ideal for use among different groups within an enterprise.
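Duplicate-free import of this kind generally reduces to an upsert keyed on the source record's identifier: rows that already exist in the target are updated in place, and only genuinely new rows are inserted. A generic sketch of that idea in Python with SQLite (not Skyvia's actual implementation; the table and fields are invented):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, email TEXT)")

def sync(records):
    # Upsert keyed on the source id: an existing row is updated, a new row
    # is inserted, so repeated syncs never create duplicates.
    db.executemany(
        "INSERT INTO contacts VALUES (:id, :email) "
        "ON CONFLICT(id) DO UPDATE SET email = excluded.email",
        records)

sync([{"id": 1, "email": "a@example.com"}])
sync([{"id": 1, "email": "a2@example.com"}, {"id": 2, "email": "b@example.com"}])
print(db.execute("SELECT COUNT(*) FROM contacts").fetchone()[0])  # 2
```

Running the sync twice leaves exactly one row per source id, with the latest values, which is the behavior the preserved-key relationship buys.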
Fivetran
Value Proposition for Buyers: Fivetran, one of the newer-generation data integrators, focuses on complete data replication within a no-code, zero-maintenance framework. It offers automated data connectors that work with virtually all major applications, database formats and file types. The vendor’s ELT approach includes strong security and regulatory compliance tools. The Fivetran platform connects various sources of data to a central data warehouse in order to provide a holistic view of an organization.
- Fivetran delivers a straightforward, easy-to-use interface. The application’s use of a centralized data warehouse can simplify data management by automating processes and allowing enterprises to focus on BI and analytics tasks. Fivetran receives high marks for its willingness to work with customers and provide service and support.
- Fivetran features a robust and extensive set of connectors for virtually every application or data format. These include platforms as diverse as Salesforce, Oracle, Zendesk, Shopify, HubSpot, Stripe, Xero, Marketo, Mailchimp, GitHub, Workday and FTP. The vendor supports quick and easy setup with maintenance-free data pipelines.
Microsoft
Value Proposition for Buyers: Microsoft isn’t the world’s most well-known data integrator, but it has well-respected tools in this category, both for the cloud and for traditional data residing in SQL Server and other databases. Azure Data Factory is a hybrid data integration service that operates in a no-code environment. It extracts data from heterogeneous data sources and transforms it into cloud-scale repositories. The platform offers strong data mapping capabilities and includes tools for connecting the data to virtually any BI or analytics tool. SQL Server Integration Services (SSIS) uses a drag-and-drop interface and strong data transformation capabilities to import data and integrate it with numerous software tools and platforms, including Salesforce.
- Azure Data Factory extracts data from numerous data sources, including SSIS. It offers connectors to more than 80 external data sources (including AWS, Cassandra, DB2, and numerous Azure repositories). Data Factory accommodates both cloud and on-premises data while delivering enterprise-grade security. The platform supports both codeless UI and the ability to write custom code.
- SSIS operates in a graphical environment and tackles enterprise-grade data extraction, cleansing and transformation tasks. It offers import/export wizards to simplify data movement and includes built-in scripting. The platform features the SSIS Catalog database, which makes it easy to store, run and manage packages. In addition, SSIS can automate the maintenance of a SQL Server database. The platform received a Gartner Customers’ Choice 2018 award for Data Integration Tools.
Oracle
Redwood City, Calif.
Value Proposition for Buyers: Oracle’s plethora of databases in numerous vertical segments positions the vendor as a natural choice for many organizations. Oracle Data Integrator (ODI) offers a graphical interface that enables users to build and manage data integration in the cloud. It is designed for larger enterprises with significant data migration needs. ODI supports a declarative design approach and includes automation tools. An ELT architecture eliminates the need for an ETL server, something that can simplify tasks and reduce costs.
- Oracle ODI is designed to serve as a comprehensive data integration platform that addresses the gamut of an organization’s data management needs. It works with major databases such as IBM DB2, Teradata, Sybase, Netezza, and Exadata as well as open source Hadoop. ODI taps existing RDBMS capabilities to integrate with other Oracle products for processing and transforming data.
- ODI is designed to reduce data movement in the cloud. It achieves this capability partly by tackling ELT and ETL directly where the data resides instead of making copies of data to remote locations. It also aims to eliminate hand coding through robust mapping capabilities.
SAP
Value Proposition for Buyers: SAP’s BusinessObjects Data Integrator handles large-scale data migrations, integrations and ETL. It takes aim at the challenges of moving large volumes of data between on-premises and legacy systems and the cloud. The software offers a graphical interface, powerful connectors and tools that support extreme ETL scalability. All of this delivers impressive flexibility and scalability through prebuilt data models, transformation logic and data flows.
- BusinessObjects Data Integrator is built into SAP’s Rapid Marts, which offer powerful ETL features optimized for reporting and end-user query and analysis. The platform can extract data from numerous enterprise systems, including SAP R/3, Siebel, Oracle, PeopleSoft and J.D. Edwards applications.
- Data Integrator Designer offers a single tool for performing all tasks related to building, testing, and managing an ETL job. This includes: managing projects; profiling data; creating ETL jobs; cleansing, validating, and auditing data; setting parallel job execution; building workflows; and testing, debugging, and monitoring ETL jobs.
SAS
Value Proposition for Buyers: SAS has long been a major player in the world of BI and analytics, but it also offers products designed to handle virtually any data-related task. Its Data Integration Studio serves as a premier ETL product for linking data within SAS applications and beyond. The visual design tool can pull data from almost any source and, using powerful tools and logic, integrate it with analytics software. It delivers powerful, easy-to-use capabilities designed for multi-user environments.
- SAS Data Integration Studio migrates, synchronizes, and replicates data among different operational systems and data sources. It alters, reformats, and consolidates data as required. Real-time data quality integration cleanses data as it is being moved, replicated, or synchronized. Users can build and apply reusable business rules.
- Data Integration Studio lets users query and use data across multiple systems without physically moving source data. SAS Data Integration provides virtual access to database structures, ERP applications, legacy files, text, XML, message queues and many other sources. This allows users to join data across virtual data sources for real-time access and analysis. The resulting semantic business metadata layer reduces data complexity.
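Querying across systems without moving source data can be illustrated with SQLite's ATTACH command, which joins tables from two separate databases in place. The schema here is invented, and SAS's virtualization layer is of course far broader, but the principle is the same: the join happens at query time, with no copy step.

```python
import sqlite3

db = sqlite3.connect(":memory:")                  # primary source
db.execute("ATTACH DATABASE ':memory:' AS crm")   # a second, separate source

db.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
db.execute("CREATE TABLE crm.customers (id INTEGER, name TEXT)")
db.execute("INSERT INTO orders VALUES (1, 25.0)")
db.execute("INSERT INTO crm.customers VALUES (1, 'Acme')")

# Join across the two databases in place; no data is copied between them.
row = db.execute("""SELECT c.name, o.amount
                    FROM orders o
                    JOIN crm.customers c ON c.id = o.customer_id""").fetchone()
print(row)  # ('Acme', 25.0)
```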
Talend
Los Altos, Calif.
Value Proposition for Buyers: Talend has a strong reputation among data management tool providers. The company offers three primary products aimed at ETL and related tasks: Talend Enterprise Data Integration, Talend Platform for Big Data Integration and Talend Open Studio for Data Integration. All three products landed on Gartner’s Customers’ Choice 2018 list. The vendor’s products have a reputation for speed and performance, flexibility and scalability, and ease of use.
- Transforming, moving and synchronizing data across heterogeneous sources and targets is at the center of Talend’s product offerings. The vendor offers highly flexible tools that work with cloud services such as AWS, Azure and Google as well as enterprise apps like Salesforce, Dropbox and Box, using ETL, ELT, batch and real-time processing.
- Talend offers a robust array of features within a graphical interface, including team collaboration, continuous integration and delivery, visual mapping, data governance, and security features such as fraud pattern detection and advanced matching and statistical analysis.
- Users like the company’s clear vision, roadmap and communities that operate within an open source framework. They also praise powerful capabilities at a lower cost than competitors.
Xplenty
San Francisco, Calif.
Value Proposition for Buyers: Xplenty provides a cloud data delivery platform that integrates numerous data stores, applications and other data sources. In most regions, the SaaS ETL platform can run on AWS, Google Cloud or the vendor’s own public or private cloud. The vendor is known for delivering a highly flexible, scalable and secure platform for managing nearly any type of data workload, and it offers a broad set of APIs.
- Users give the vendor high marks for ease of use, flexibility and features. They also praise Xplenty for its service and support.
- The Xplenty platform uses native connectors to support more than 100 data stores and SaaS applications, including Facebook, Salesforce, AWS, Google Cloud, Microsoft Azure, Magento and Slack.
- Xplenty uses a package designer to implement a broad array of data integration use cases. The graphical point-and-click interface allows users to manage data without coding. The platform executes packages directly from the user interface or from an API, which simplifies automation, scheduling, job monitoring, status reporting and other orchestration tasks.