Are Hyperscalers Slowly Killing the Data & Analytics Market? - Data Integration
Analyzing Gartner's Magic Quadrant for Data Integration Tools
The Market 2020 - 2024
Every year Gartner makes assumptions about how current trends will influence data integration. In recent years these were as follows:
2020 - Multi-cloud, augmented data management & graph processing
2021 - Machine Learning & AI-enabled automation
2022 - Data Fabric & AI-augmented data management
2023 - Data Fabric & AI-augmented data management
2024 - AI assistant & AI-enhanced workflows
The Dominators
Informatica, IBM, Oracle and SAP are the long-term leaders of the Gartner Magic Quadrant for Data Integration Tools. This is a sign that, while the newer capabilities listed above are very important, the demand for traditional, on-prem-oriented ETL capabilities remains strong.
I have seen all of these vendors move into the cloud with their offerings and bridge the on-prem / cloud integration need. So maybe they will also stay for a long time.
Informatica is and remains no. 1. But as we can see, the hyperscalers are coming…
Fig. 1: Movements in the Gartner MQ for Data Integration Tools 2023 to 2024
The Rise of the Hyperscalers
As in other categories, Microsoft is an early leader among the hyperscalers, which can be explained by its long on-premises history in data management. Having been a Challenger in the years before 2020, Microsoft moved into the Leaders quadrant in 2021 and today holds a very strong Leader position. The tools behind this position are Azure Data Factory (ADF), SQL Server Integration Services (SSIS) and Power Query. Microsoft Fabric is mentioned and will possibly drive this position in the coming years. 94% of customers are willing to recommend Microsoft for data integration.
Google already received an honorable mention in 2021 and 2022, with Cloud Data Fusion as ETL/ELT service, Datastream as data replication service and Data Migration for bulk/batch-based data migration. Dataprep by Trifacta (now Alteryx) was supported for data preparation. Pub/Sub supports message-oriented data movement, and Cloud Dataflow covers IoT and streaming data. The 2022 mention added Cloud Composer for data orchestration and Dataplex for central management and governance of data across all GCP offerings. In 2023 Google was included as a Challenger, close to the border of the Leaders quadrant. In 2024 it jumped over and is now a Leader for Data Integration Tools. Google is strong in data engineering and the data fabric use case, and its tools are easy to use compared to other solutions. On the other hand, the focus is still on the Google ecosystem.
AWS already received an honorable mention in 2021 for AWS Glue, a serverless data integration service that supports different experiences: AWS Glue Studio for ETL developers and AWS Glue DataBrew for citizen data scientists. Furthermore, there are AWS Database Migration Service and AppFlow for data ingestion and Amazon Athena for data virtualization. 70% of customers are willing to recommend AWS for data integration, while the solutions show a stronger footprint in smaller companies and in North America compared to Microsoft. Like Google, AWS jumped into the Leaders quadrant in 2024 with AWS Glue as its main offering, but its portfolio also includes Amazon Managed Workflows for Apache Airflow (MWAA), Amazon Kinesis, Amazon Managed Streaming for Apache Kafka (MSK), Amazon Managed Service for Apache Flink, Amazon EMR and Amazon Athena, which all contribute to the data integration tooling. Additionally, Zero-ETL is coming to AWS. AWS shows strong support for multiple personas and a high degree of integration with the other AWS services.
SAS, a long-time name in data & analytics, went from the Leaders quadrant (2020) via Challenger (2021/2022) to falling out of the Magic Quadrant altogether. What happened here? I have seen SAS in the Leaders quadrant at least since 2018, probably longer. According to Gartner, positioning itself as an independent data integration vendor was a challenge given the SAS Viya product strategy, and eventually SAS vanished from the report in 2023.
SAP, a long-time leader (at least since 2008 - see here), is still a Leader. But based on Werner Daehn’s recent article, and similar to SAS, we will possibly see a decline in the coming years, as for these vendors the integration into their own portfolio is possibly more important than having a specific, market-leading offering in this area.
This possibly just shows us two different strategies. Vendors like SAS and SAP bet on consolidating their portfolio and on integrated functionality, while the hyperscalers grow out of their own ecosystems and bet on dominating all fields.
What Does This Mean for Data Integration?
Currently we see 10 Leaders in Gartner's Magic Quadrant and 20 vendors observed overall. The market is diverse and its maturity is high, but it has to follow the latest demands and developments of the data management market. We have also seen newer, strong vendors like Fivetran or CData rising in recent years. Today the hyperscalers work well together with other vendors, as it is very important for them to get data into their platforms. The hyperscaler offerings are growing, and customers tend to focus on best-of-suite and to reduce the number of vendors to save money. So we can assume the momentum for the hyperscalers will stay high in the coming years.
What are your thoughts on the current development in the Data Integration market? What effects do you expect from the current momentum? A new wave of consolidations (like with Qlik/Talend)? Disruption through AI? New mega vendors rising and offering strong end-to-end data stacks? Even new cloud providers challenging the (new) domination of AWS, Azure and Google Cloud?
The analysis journey through the current D&A market based on Gartner's MQ goes on:
Part 2: Data Integration Tools (here)
Part 3: Data Science and Machine Learning Platforms
Part 4: Cloud AI Developer Services
Part 5: Cloud Database Management Systems
All hyperscalers started by offering horizontal services like IaaS and PaaS that can be used across all industries. These have been growing at double-digit rates per year for the last decade or so.
Even while making most of their revenue with comparatively "simple" services, the major cloud providers have started offering vertical (industry- or sector-specific) offerings as well, for two reasons:
- to increase their growth potential
- to convince customers to select "their" hyperscaler on a strategic basis, as the providers seem more or less on par for basic services
Seen from this perspective, it seems natural that the hyperscalers are eating into market segments formerly occupied by specialized companies, either with their own offerings or by partnering with someone else (which is also OK from the hyperscaler's point of view, as long as data, network and compute are hosted in the respective cloud).
So expect more of this to happen!