Organizations face steep obstacles related to finding answers amid growing mountains of data. Somewhere between the promise of broader and deeper insights and far more powerful tools for delivering answers lies ETL (extract, transform and load) software. It aids in copying or moving data from one source or repository to another and ensuring that the data is formatted correctly for the task at hand. Simply put, ETL makes it possible to put data to work and maximize its value.
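To make the three stages concrete, here is a minimal sketch in Python that uses only the standard library. The sample data, the `orders` table and the function names are illustrative assumptions, not part of any product covered below.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real job would read a file or query another system.
RAW_CSV = """order_id,customer,amount
1, Alice ,19.99
2,bob,5.00
3,Carol,not_a_number
"""

def extract(text):
    """Extract: read rows from the CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim and normalize names, convert amounts, skip bad rows."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount cannot be parsed
        clean.append((int(row["order_id"]), row["customer"].strip().title(), amount))
    return clean

def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database stands in for a warehouse
load(transform(extract(RAW_CSV)), conn)
result = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(result)
```

The row with an unparseable amount is dropped during the transform step, so only the two valid orders reach the target table.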
ETL isn’t a new concept. As early as the 1970s, the technology began to make its mark. At that time, it was used mostly to cleanse data and transfer it from database to database or slot it into a data mart or a data warehouse. Today, ETL, along with its cousin extract, load, transform (ELT), is used within increasingly complex data frameworks, including the internet of things (IoT), connected supply chains, cloud environments and more. As organizations move to more advanced business intelligence and data analytics, including systems that rely on machine learning and artificial intelligence (AI), ETL is crucial.
As businesses look to create more fluid, flexible and agile data frameworks, selecting the right ETL or data integration tool is critical. It can speed data processing, provide new ways to link and use data, and trim costs and time related to manual data management processes. Here are 10 of the top ETL vendors along with a look at their products and how they fit into the overall ETL marketplace. eWeek used several industry sources to assemble this vendor list. These sources include G2 Crowd, IT Central Station, Gartner Peer Insights, TrustRadius and vendor websites.
Amazon Web Services (AWS)
Value Proposition for Buyers: AWS is the undisputed heavyweight of cloud computing service providers. So, it’s no surprise that it offers an array of products that connect to legacy systems and the cloud. These include: AWS Import/Export Snowball, which offers petabyte-scale data transport; AWS Glue, a dedicated managed ETL service; AWS Database Migration Service, which is designed to move entire databases; and AWS Data Pipeline, which transports data across AWS compute environments along with on-premises systems.
- AWS offers a broad array of tools designed to address data management challenges involving the cloud. These range from straightforward ETL to software that aids in moving massive amounts of data in an efficient and cost-effective way. AWS places a heavy focus on data integrity and data security.
- The vendor’s products offer powerful capabilities, robust data management consoles and features that non-data scientists can use to build analytics capabilities from various AWS platforms and engines, including Redshift, S3 and virtual private cloud (Amazon VPC) as well as from legacy mainframes and other systems. The other systems include tools from Alteryx, Informatica and Matillion. AWS also offers ELT, which pushes transformation into the database.
eWEEK Score: 4.2/5.0
See user reviews of AWS Import/Export Snowball
Devart
Prague, Czech Republic
Value Proposition for Buyers: Devart’s ETL product, Skyvia, is a SaaS data platform that uses a no-code wizard-based integration approach. It’s designed for users with no special knowledge of ETL and data integration. The graphical interface includes a robust set of wizards, templates and editors, which pull data into a cloud, where data manipulation takes place. The platform provides strong mapping tools and features along with powerful automation for bi-directional synchronization.
- The bi-directional synching capability means that all data handled by Skyvia is available for use in real time. What’s more, the platform preserves source data relationships in the target so that it can import data without creating duplicates. This makes it ideal for use among different groups within an enterprise.
- Skyvia builds reports and dashboards from almost any format or source, including SQL, CSV, FTP, SFTP, SQL Azure, Amazon RDS, Amazon S3, Dropbox, Box, Stripe, Oracle, Magento, G Suite, Google Drive, Dynamics CRM and Salesforce, to name a few. It also provides strong data export features, including powerful filtering, the ability to export related object data and export scheduling.
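Importing without creating duplicates generally comes down to matching incoming records against existing ones on a stable key. The sketch below shows that general idea in plain Python. It is not Skyvia’s implementation, and the `import_records` function and the record fields are hypothetical.

```python
def import_records(target, incoming, key="id"):
    """Insert new records and update existing ones, matching on a stable key."""
    inserted = updated = 0
    for record in incoming:
        record_key = record[key]
        if record_key in target:
            updated += 1   # key already present: overwrite in place
        else:
            inserted += 1  # new key: add the record
        target[record_key] = dict(record)
    return inserted, updated

target = {}  # stands in for the destination table, indexed by key
batch = [{"id": "C-1", "name": "Acme"}, {"id": "C-2", "name": "Globex"}]

first = import_records(target, batch)
# Re-running the same batch plus one new record updates in place
# instead of duplicating rows.
second = import_records(target, batch + [{"id": "C-3", "name": "Initech"}])
print(first, second, len(target))
```

Because the import is keyed, running the same batch twice leaves the target with one row per customer.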
eWEEK Score: 4.6/5.0
See user reviews of Devart
Fivetran
Value Proposition for Buyers: Fivetran focuses on complete data replication within a no-coding, zero-maintenance framework. It offers automated data connectors that work with virtually all major applications, database formats and file types. The vendor’s ELT approach includes strong security and regulatory compliance tools. The Fivetran platform connects various sources of data to a central data warehouse in order to provide a holistic view of an organization.
- Fivetran features a robust and extensive set of connectors for virtually every application or data format. These include platforms as diverse as Salesforce, Oracle, Zendesk, Shopify, HubSpot, Stripe, Xero, Marketo, Mailchimp, GitHub, Workday and FTP. The vendor supports quick and easy setup with maintenance-free data pipelines.
- The vendor delivers a straightforward and easy-to-use interface. The application’s use of a centralized data warehouse can simplify data management by automating processes and allowing organizations to focus on BI and analytics tasks. Fivetran receives high marks for its willingness to work with customers and provide service and support.
eWEEK Score: 4.3/5.0
Informatica
Redwood City, Calif.
Value Proposition for Buyers: Informatica consistently ranks among the top vendors for data management and ETL. Its platform supports virtually all forms of data migration and transformation, including with AWS, Azure and other leading platforms and tools. It delivers a high level of automation and data validation across development, testing and production environments. The vendor earned a Gartner Customers’ Choice 2018 distinction.
- The vendor supports multi-cloud, on-premises and hybrid data integration in real-time and batch modes. In addition, Informatica supports all major data formats and structures through native connectors. This includes industry-specific formats such as SWIFT, HL7 and EDI X12.
- Informatica PowerCenter supports data management and integration across its lifecycle. It includes strong support for security and regulatory requirements, and that support extends to non-relational data.
- The platform supports grid computing, distributed processing, high availability, dynamic partitioning, pushdown optimization and adaptive load balancing. This produces a highly scalable and stable environment.
eWEEK Score: 4.6/5.0
See user reviews of Informatica PowerCenter
Microsoft
Value Proposition for Buyers: Microsoft offers tools for both the cloud and traditional data structures residing in SQL. Azure Data Factory is a hybrid data integration service that operates in a no-code environment. It extracts data from heterogeneous data sources, transforms it and moves it into cloud-scale repositories. The platform offers strong data mapping capabilities and includes tools for connecting the data to virtually any BI or analytics tool. SQL Server Integration Services (SSIS) uses a drag-and-drop interface and strong data transformation capabilities to import data and integrate it with numerous software tools and platforms, including Salesforce.
- Azure Data Factory extracts data from numerous data sources, including SSIS. It offers connectors to more than 80 external data sources (including AWS, Cassandra, DB2, and numerous Azure repositories). Data Factory accommodates both cloud and on-premises data while delivering enterprise-grade security. The platform supports both codeless UI and the ability to write custom code.
- SSIS operates in a graphical environment and tackles enterprise-grade data extraction, data cleansing and data transformation tasks. It offers import/export wizards to simplify data movement and it includes built-in scripting. The platform features an Integration Services catalog database that makes it easy to store, run and manage packages. In addition, SSIS can automate the maintenance of a SQL Server database. The platform received a Gartner Customers’ Choice 2018 award for Data Integration Tools.
eWEEK Score: 4.5/5.0
See user reviews of Azure Data Factory
Oracle
Redwood City, Calif.
Value Proposition for Buyers: The widespread use of Oracle databases makes the vendor a natural choice for many organizations. Oracle Data Integrator (ODI) delivers a graphical interface that allows users to build and manage data integration in the cloud. It’s designed for larger enterprises with significant data migration needs. ODI supports a declarative design approach and includes automation tools. An ELT architecture eliminates the need for an ETL server, something that can simplify tasks and reduce costs.
- ODI is designed to serve as a comprehensive data integration platform that addresses the gamut of an organization’s data management needs. It works with major databases such as IBM DB2, Teradata, Sybase, Netezza, and Exadata as well as open source Hadoop. ODI taps existing RDBMS capabilities to integrate with other Oracle products for processing and transforming data.
- ODI is designed to reduce data movement in the cloud. It achieves this partly by performing ELT and ETL directly where the data resides instead of copying data to remote locations. It also aims to eliminate hand coding through robust mapping capabilities.
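The ELT pattern described above can be sketched with Python’s built-in `sqlite3` module standing in for the target database. This is a generic illustration of loading first and transforming in place, not ODI code. The table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stands in for the target database

# Load: land the raw records in a staging table without changing them.
conn.execute("CREATE TABLE staging_sales (region TEXT, amount TEXT)")
conn.executemany(
    "INSERT INTO staging_sales VALUES (?, ?)",
    [(" east ", "100.50"), ("WEST", "20"), ("East", "9.50")],
)

# Transform: the database engine does the cleanup and aggregation itself,
# so no rows travel to a separate transformation server.
conn.execute(
    """
    CREATE TABLE sales_by_region AS
    SELECT UPPER(TRIM(region)) AS region,
           SUM(CAST(amount AS REAL)) AS total
    FROM staging_sales
    GROUP BY UPPER(TRIM(region))
    """
)

totals = conn.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region"
).fetchall()
print(totals)
```

The only work done outside the database is issuing the SQL. The trimming, type conversion and aggregation all run where the data already sits.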
eWEEK Score: 4.2/5.0
See user reviews of Oracle Data Integrator
Pentaho (Hitachi Vantara Corporation)
Santa Clara, Calif.
Value Proposition for Buyers: Powerful capabilities are at the center of Pentaho Data Integration (PDI). The software tool handles data ingestion, blending, cleansing and preparation within a visual drag-and-drop environment. Pentaho works with all data types and formats and includes a powerful metadata injection feature that manages enterprise data at scale. It also includes a large library of pre-built components and delivers powerful orchestration capabilities that aid in coordinating and combining data.
- Pentaho addresses big data integration with a zero-coding approach. The platform aims to eliminate manual programming and scripting. It also allows users to switch between execution engines, such as Apache Spark and Pentaho’s own engine, and it supports Hadoop distributions, Spark and objects stored in NoSQL. This allows the platform to perform real-time data ingestion and tap IoT protocols.
- PDI includes pre-built templates and it supports spot checks while data is in-flight, which aids in validation. It also delivers powerful orchestration capabilities along with notifications and alerts, and it includes an enterprise scheduler that coordinates workflows. In addition, the application ingests data from nearly any relational database, open source database and file format. It connects to major business applications such as Salesforce and Google Analytics.
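The idea of spot-checking data while it is in flight can be illustrated with a small Python generator that samples rows as they stream past. This is a generic sketch rather than PDI functionality. The `spot_check` helper, the `every` parameter and the sample rows are all hypothetical.

```python
def spot_check(rows, check, every=2):
    """Yield rows unchanged, validating every Nth row as it streams past."""
    for index, row in enumerate(rows):
        if index % every == 0 and not check(row):
            raise ValueError(f"row {index} failed validation: {row!r}")
        yield row

def has_valid_qty(row):
    """Example rule: quantities must not be negative."""
    return row["qty"] >= 0

rows = [{"id": 1, "qty": 3}, {"id": 2, "qty": -1}, {"id": 3, "qty": 5}]

# Sampling every second row checks rows 0 and 2 only, so the bad row slips through.
passed = list(spot_check(rows, has_valid_qty, every=2))

# Checking every row catches it, at the cost of more validation work.
try:
    list(spot_check(rows, has_valid_qty, every=1))
    caught = False
except ValueError:
    caught = True
print(len(passed), caught)
```

The example also shows the trade-off in sampling: a spot check is cheaper than validating every row, but it can miss problems between samples.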
eWEEK Score: 4.1/5.0
See user reviews of Pentaho Data Integration
SAP
Value Proposition for Buyers: SAP’s BusinessObjects Data Integrator handles large-scale data migrations, integrations and ETL. It takes aim at the challenges of moving large volumes of data among on-premises systems, legacy systems and the cloud. The software offers a graphical interface, powerful connectors and tools to support extreme ETL scalability. All of this delivers impressive flexibility through prebuilt data models, transformation logic and data flows.
- BusinessObjects Data Integrator is built into SAP’s Rapid Marts, which offer powerful ETL features optimized for reporting and end-user query and analysis. The platform can extract data from numerous enterprise systems, including SAP R/3, Siebel, Oracle, PeopleSoft, and J.D. Edwards applications.
- Data Integrator Designer offers a single tool for performing all tasks related to building, testing, and managing an ETL job. This includes: managing projects; profiling data; creating ETL jobs; cleansing, validating, and auditing data; setting parallel job execution; building workflows; and testing, debugging, and monitoring ETL jobs.
eWEEK Score: 4.1/5.0
See user reviews of SAP Data Services
SAS
Value Proposition for Buyers: SAS is a heavyweight in the world of BI and analytics. The vendor offers products designed to tackle virtually any data-related task. Its Data Integration Studio serves as a premier ETL product for linking data within SAS applications and beyond. The visual design tool can pull data from almost any source and, using powerful tools and logic, integrate it with analytics software. It delivers powerful and easy-to-use capabilities designed for multi-user environments.
- SAS Data Integration Studio migrates, synchronizes, and replicates data among different operational systems and data sources. It alters, reformats, and consolidates data as required. Real-time data quality integration cleanses data as it is being moved, replicated, or synchronized. Users can build and apply reusable business rules.
- The product lets users query and use data across multiple systems without the physical movement of source data. SAS Data Integration provides virtual access to database structures, ERP applications, legacy files, text, XML, message queues, and many other sources. This allows users to join data across virtual data sources for real-time access and analysis. The resulting semantic business metadata layer reduces data complexity.
eWEEK Score: 4.2/5.0
See user reviews of SAS Data Integration
Talend
Los Altos, Calif.
Value Proposition for Buyers: Talend has a strong reputation among data management tool providers. The company offers three primary products aimed at ETL and related tasks: Talend Enterprise Data Integration, Talend Platform for Big Data Integration and Talend Open Studio for Data Integration. All three products landed on Gartner’s Customers’ Choice 2018 list. The vendor’s products have a reputation for speed and performance, flexibility and scalability, and ease of use.
- Transforming, moving and synchronizing data across heterogeneous sources and targets is at the center of Talend’s product offerings. The vendor offers highly flexible tools that work with cloud services such as AWS, Azure and Google as well as enterprise apps like Salesforce, Dropbox and Box—using ETL, ELT, batch and real-time processing.
- Talend offers a robust array of features within a graphical interface. This includes team collaboration features, continuous integration and delivery, visual mapping, data governance and security features, including fraud pattern detection and advanced matching and statistics analysis.
- Users like the company’s clear vision, roadmap and communities that operate within an open source framework. They also praise powerful capabilities at a lower cost than competitors.
eWEEK Score: 4.3/5.0
See user reviews of Talend Open Studio
Xplenty
San Francisco, Calif.
Value Proposition for Buyers: Xplenty provides a cloud data delivery platform that integrates numerous data stores, applications and other data sources. In most regions, the SaaS ETL platform can run on AWS, Google Cloud or the vendor’s own public or private cloud. The vendor is known for delivering a highly flexible, scalable and secure platform for managing nearly any type of data workload. It offers a broad set of APIs.
- Xplenty uses a package designer to implement a broad array of data integration use cases. The graphical point-and-click interface allows users to manage data without coding. The platform executes packages directly from the user interface or from an API. This approach simplifies automation, scheduling, job monitoring, status reports and other orchestration information.
- The Xplenty platform uses native connectors to support more than 100 data stores and SaaS applications, including Facebook, Salesforce, AWS, Google Cloud, Microsoft Azure, Magento and Slack.
- Users give the vendor high marks for ease of use, flexibility and features. They also praise Xplenty for its service and support.
eWEEK Score: 4.4/5.0