Magic Quadrant for Data Integration Tools
Published: 03 August 2017
ID: G00314940
Analyst(s): Mark A. Beyer, Eric Thoo, Mei Yang Selvage, Ehtisham Zaidi
Summary
The data integration tool market has established a focus on transformational technologies and
approaches demanded by data and analytics leaders. The presence of legacy, resilient systems
and innovation all in the market together requires robust, consistent delivery of highly developed
practices.
Market Definition/Description
The market for data integration tools includes vendors that offer software products to enable the
construction and implementation of data access and data delivery infrastructure for a variety of
data integration scenarios. These include:
Data acquisition for business intelligence (BI), analytics and data warehousing — Extracting
data from operational systems, transforming and merging that data, and delivering it to
integrated data structures for analytics purposes. The variety of data and context for analytics
is expanding as emergent environments — such as nonrelational and Hadoop distributions for
supporting discovery, predictive modeling, in-memory DBMSs, logical data warehouse
architectures and end-user capability to integrate data (as part of data preparation) —
increasingly become part of the information infrastructure. With the increased demand to
integrate machine data and support Internet of Things (IoT) and digital business ecosystem
needs for analytics, data integration challenges intensify (a minimal ETL sketch follows this list of scenarios).
Sourcing and delivery of application and master data in support of application data
management and master data management (MDM) — Enabling the connectivity and
integration of the data representing critical business entities such as customers, products and
employees. Data integration tools can be used to build the data access and synchronization
processes to support application data management and also MDM initiatives.
Data consistency between operational applications — Data integration tools provide the ability
to ensure database-level consistency across applications, both on an internal and an
interenterprise basis (for example, involving data structures for SaaS applications or cloud-
resident data sources), and in a bidirectional or unidirectional manner. The IoT is specifically
exerting influence and pressure here. Data consistency has become critical with new
functionality in DBMS offerings — hinting that the battle for data integration is heating up to
include the traditional data management vendors.
Interenterprise data sharing — Organizations are increasingly required to provide data to, and
receive data from, external trading partners (customers, suppliers, business partners and
others). Data integration tools are relevant for addressing these challenges, which often
consist of the same types of data access, transformation and movement components found in
other common use cases.
Populating and managing data in a data lake. In the emerging concept of a "data lake," data is continuously collected and stored either in a semantically consistent approach similar to a traditional DBMS, or with the expectation that data processing efforts will refine the semantics of a nontraditional DBMS (such as nonrelational data stores) to support data usage. The need
for integrating nonrelational structures and distributing computing workloads to parallelized
processes (such as in Hadoop and alternative NoSQL repositories) elevates data integration
challenges. At the same time, it also provides opportunities to assist in the application of
schemas at data read time, if needed, and to deliver data to business users, processes or
applications, or to use data iteratively. In addition, the differing structure of IoT or machine data is introducing new integration needs (a schema-on-read sketch also follows this list).
Data migration. Previously considered a data integration style in its own right, data migration is
more of a task that can be done with a variety of tools or techniques. The primary feature of
data migration is moving data to a new platform or to an updated version of an existing data
management platform. It can also include moving data from one application to a new
application or to an upgraded version of an application.
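To ground the data acquisition scenario above, here is a minimal, illustrative ETL flow in Python: extract rows from an "operational" store, transform and merge them, then load the result into an analytics structure. The tables, column names and the in-memory SQLite stand-ins for an operational system and a warehouse are all hypothetical, chosen only for this sketch.

import sqlite3

src = sqlite3.connect(":memory:")   # stand-in for an operational system
dwh = sqlite3.connect(":memory:")   # stand-in for a data warehouse

src.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount_cents INTEGER)")
src.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, "acme", 1250), (2, "acme", 800), (3, "globex", 4100)])
dwh.execute("CREATE TABLE fact_sales (customer TEXT, total_amount REAL)")

# Extract: pull the full contents of the source table (bulk/batch style).
rows = src.execute("SELECT customer, amount_cents FROM orders").fetchall()

# Transform and merge: convert cents to decimal amounts and aggregate by customer.
totals = {}
for customer, amount_cents in rows:
    totals[customer] = totals.get(customer, 0.0) + amount_cents / 100.0

# Load: deliver the integrated result to the analytics structure.
dwh.executemany("INSERT INTO fact_sales VALUES (?, ?)", totals.items())
dwh.commit()
print(dwh.execute("SELECT * FROM fact_sales ORDER BY customer").fetchall())
# [('acme', 20.5), ('globex', 41.0)]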
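The data lake scenario above also mentions applying schemas at data read time. The sketch below shows that idea at its simplest: raw JSON records land in the "lake" untouched, and a reader-supplied schema coerces types and fills gaps only at consumption time. The field names, schema and defaults are invented for illustration.

import json

# Raw events land in the lake as-is (JSON lines); nothing is enforced on write.
raw_lake = [
    '{"device": "pump-1", "temp": "71.5", "ts": 1501720000}',
    '{"device": "pump-2", "temp": "66.0"}',              # "ts" missing here
    '{"device": "pump-1", "temp": "73.1", "ts": 1501720060}',
]

# The schema belongs to the reader, not the store; it is applied only on read.
READ_SCHEMA = {"device": str, "temp": float, "ts": int}
DEFAULTS = {"ts": 0}

def read_with_schema(line):
    """Parse one raw record and coerce it to the reader's schema."""
    record = json.loads(line)
    return {field: typ(record.get(field, DEFAULTS.get(field)))
            for field, typ in READ_SCHEMA.items()}

for line in raw_lake:
    print(read_with_schema(line))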
The usage of data integration tools may display characteristics that are not unique to just one of
these individual scenarios. Technologies in this market are required to execute many of the core
functions of data integration, which can apply to any of the above scenarios. Examples of the
resulting characteristics include:
Increasingly, data integration tools are expected to collect, audit and monitor information
regarding the deployed data integration services and processes in the organization. This
ranges from use cases for simple reporting and manual analysis to the inclusion of
recommendations and even automated performance optimization. While primarily focused on
management tasks, the ability to profile new data assets and recognize their similarity, in nature and use case, to data already integrated is growing in importance (a toy profiling sketch follows this list). Small devices that roam and attach to data portals will also become prevalent. The requirement for metadata capabilities will become central to all integration approaches.
Interoperating with application integration technology in a single solution architecture to, for
instance, expose extraction, transformation and loading (ETL) processes that extract data from
sources as a service to be provisioned via an enterprise service bus (ESB). Increasingly, there is
a demand for analyzing and integrating data during "business moments" when events demand
an in-process operational change based upon data-driven decisions.
Enabling data services as an architectural technique in a service-oriented architecture (SOA)
context. Rather than the use of data integration per se, this represents an emerging trend for
data integration capabilities to play a role and to be implemented within software-defined
architecture for application services.
Integrating a combination of data residing on-premises and in SaaS applications or other
cloud-based data stores and services, to fulfill requirements such as cloud service integration.
Organizations are also seeking the capability for pivoting between cloud and on-premises — in
enabling a hybrid integration platform (HIP).
Connecting to, and enabling the delivery of data to — and the access of data from — platforms
typically associated with big data initiatives such as Hadoop, nonrelational and cloud-based
data stores. These platforms provide opportunities for distributing data integration workloads
to external parallelized processes.
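As a toy illustration of the profiling characteristic described above, the sketch below derives a per-column profile (inferred type and null rate) for a newly arrived data asset and compares it with the profile of an already-integrated source. The records, columns and similarity rule are hypothetical simplifications of what commercial tools automate.

def profile(rows):
    """Build {column: (inferred_type, null_fraction)} for a list of dict records."""
    columns = {key for row in rows for key in row}
    result = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        inferred = type(present[0]).__name__ if present else "unknown"
        result[col] = (inferred, 1 - len(present) / len(values))
    return result

known = profile([{"id": 1, "email": "a@x.com"}, {"id": 2, "email": None}])
incoming = profile([{"id": 7, "email": "b@y.com"}, {"id": 8, "email": "c@z.com"}])

# Two assets look "similar" if they share columns with matching inferred types.
shared = {c for c in known if c in incoming and known[c][0] == incoming[c][0]}
print("overlapping, type-compatible columns:", sorted(shared))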
Magic Quadrant
Figure 1. Magic Quadrant for Data Integration Tools
Source: Gartner (August 2017)
Vendor Strengths and Cautions
Actian
Based in Palo Alto, California, U.S., Actian offers data integration capabilities via Actian
DataConnect and Actian DataCloud. Actian's customer base for data integration tools is
estimated to be approximately 7,000 organizations.
STRENGTHS
Market presence: By focusing on targeted aspects of the overall data integration market — through messaging-style solutions, bulk data movement, and alignment to Actian's B2B solutions — Actian gains leverage from its relatively large market reach as a company to create opportunities for its data integration tools.
Performance and manageability: Good performance and throughput for integrating data and
centralized management of integration processes are attractive propositions for organizations
emphasizing these requirements. Additional functionality planned on Apache Spark Streaming
sets out to extend Actian's support for big data to include metadata analysis for the stream.
Synergistic product strategy: Provisioning of functionality through a portfolio of
complementary capabilities for integrated use is cited as a key value by Actian customers and
implementation partners. Actian plans to align key integration products in its portfolio into a
unified platform: including DataCloud, Business Xchange and DataConnect.
CAUTIONS
Upgrade complexity and metadata support: Reference customers for Actian expressed
difficulties with version upgrades, the technical complexity of migrating between major
releases, and the quality of documentation. They also cited metadata management and
modeling functionalities as a relative weakness. Actian's latest release of DataConnect, version
11, offers direct and automatic import of mappings and other artifacts from version 9 to
version 11, to ease upgrades from older versions.
Limited guidance and support for implementation: The availability of skilled implementers and
guidance for common and best practices are growing concerns among Actian's reference
customers, who desire readily accessible self-help resources for implementation approach and
issue resolution.
Low appeal to diversifying integration roles: Actian has its roots in technical offerings that are
aligned well to communities in the IT arena. This runs contrary to current market momentum
that tends strongly toward the needs of multiple personas and self-service integration options
for less-technical and nontechnical users. Actian is addressing this through a browser-based,
graphical user interface to support the needs of business roles to perform basic and intuitive
integration tasks.
Adeptia
Based in Chicago, Illinois, U.S., Adeptia offers the Adeptia Integration Suite (AIS) and Adeptia
Connect. Adeptia's customer base for this product set is estimated to be more than 550
organizations.
STRENGTHS
Delivers core competency within an ESB: Adeptia supports the core requirements of
bulk/batch data delivery and granular data capture and propagation with a combination of its
data integration capability, application integration, ESB, B2B integration and trading partner
management.
Attractive pricing and flexibility: Reference customers view Adeptia's products as being
attractively priced relative to its competitors; they also value its flexible subscription licensing
options. Adeptia's ability to promote the interoperation of data integration functionalities with
capabilities for ESB and business process management (BPM) is greatly appreciated
according to reference customers and Gartner inquiry discussions.
Performance, usability and integration platform as a service support: Adeptia Integration
Suite offers integration platform as a service (iPaaS) capabilities, which enable B2B integration
and interenterprise data sharing use cases. Its products support integration of on-premises
endpoints, cloud endpoints and a combination of the two for integration patterns that support
"pervasive integration." Reference customers also cite ease of use, good performance and
throughput as strengths, which are particularly relevant for enabling citizen integrators and
capitalizing on business moments.
CAUTIONS
Weakness supporting big data initiatives: Adeptia's current implementations and competitive
bids indicate extremely limited traction in support of big data initiatives, although connector
capability is available to plug in external components such as Apache Spark, Hive and Pig. Since
data integration into big data stores (such as Hadoop and nonrelational) is increasingly being
emphasized in the market, pressures for enabling upcoming and popular use cases (enabling a
data lake, for example) will grow.
Narrow market functionality coverage: Reference customers appreciate Adeptia's support for
traditional use cases involving bulk/batch, operational and BPM scenarios. However, its
product roadmap does not include incorporating other comprehensive styles of data
integration (data virtualization, for example) into its AIS platform for exploiting the entire
breadth of data integration use cases, which poses a challenge in competitive situations.
Limited capability to integrate with other data management solutions: Reference customers
expressed a desire for better interoperability and integrated usage with related technologies for
data management and application integration (including data quality, data governance,
metadata management and MDM). Adeptia continues to address these concerns by actively
expanding its network of technology partners.
Attunity
Based in Burlington, Massachusetts, U.S., Attunity offers Attunity Replicate, Attunity Compose
and Attunity Visibility. Attunity's customer base for this product set is estimated to be
approximately 2,500 organizations globally.
STRENGTHS
Strength and stability of targeted functionality: Attunity offers strong data integration
functionality in the key areas of data replication and synchronization technology applied to
heterogeneous data types, with a historical strength in addressing mainframe data. Customers
favor Attunity's business longevity in supporting data consistency while also addressing the
modern requirements of cloud and big data scenarios. Attunity has also targeted cloud data
integration and is listed as a preferred data integration partner for both Amazon Web Services
and Microsoft Azure.
Time to value: Attunity's tooling supports data integration specialists as well as, increasingly,
less technically skilled personnel in order to make data available across applications and data
structures for BI and analytics. The role-based interface allows citizen integrators to integrate
data quickly for their purposes. Design and administrative tools are used by enterprise
architects and designers when more robust solutions are required. The result is a mix of time-
to-value of delivery that matches many different business needs.
Aligned with modern data integration approaches: Integration activities involving Apache
Kafka, Spark and Hadoop in cloud environments extend Attunity's established experience in
supporting analytics and data warehousing to meet the increasing challenges found in event-
driven data requirements and a mix of cloud and on-premises integration scenarios.
CAUTIONS
User experience focused on targeted data integration styles: Adoption of Attunity's product
set is predominantly for its change data capture (CDC)/replication capability. The continuing
shift of buyer demands toward comprehensive data delivery styles, and toward integrated usage of data integration activities with related technologies (for extensive metadata support, application integration and information governance), poses challenges for Attunity in competitive situations.
Specific administrative operations being addressed: Some reference customers have reported
specific issues with restart and recovery. Attunity has released operations management capabilities to address these requirements, with enhancements in operational control and administration, and is developing its metadata and operations management to meet these specific enterprise demands.
Demand for documentation and low availability of skills: As deployments of Attunity increase
in complexity, reference customers have expressed their concern about the availability of
skilled resources (which are difficult to find). Improved documentation is also desired for
administrative operation and implementation guidance.
Cisco
Based in San Jose, California, U.S., Cisco offers the Cisco Information Server (CIS). Cisco's
customer base for this product set is estimated to be around 400 organizations.
STRENGTHS
Strong roadmap for IoT integration use case: Leveraging a unique opportunity that has arisen
from its leadership in network technology, Cisco has concentrated its data integration
investments to support the IoT use case, specifically on operational technology (OT)/IT
convergence. As a result, Cisco is introducing new and innovative capabilities for IoT. Currently,
this is more aspirational than actual, but combines Cisco's historic strength in networks with
data virtualization capabilities that have the potential to be "cutting edge."
Leverages brand and market presence on global scale: Cisco's brand is well-known worldwide
for network capabilities. By adding data integration along its communication backbone, it will
be able to expand quickly into developing markets where networks and data integration can
proceed together (for example, in Eastern Europe and Asia).
Established capability in data virtualization: CIS has been a leading brand in data virtualization
for more than 10 years. Reference customers praise the product's stability, broad connectors,
data catalog and optimization engine, as well as its ability to serve as an enterprise semantic layer. It is important to note that Cisco is one of only two data-virtualization-focused vendors present on this Magic Quadrant, having beaten out some smaller, broad-based tools for the position.
CAUTIONS
Isolating on data virtualization may limit other data integration styles: With Cisco's tight focus
on IoT use cases, its innovations for other data virtualization use cases (such as application
data access and consumption) and other data integration styles (such as batch, data
replication and messaging) have taken a back seat. CIS could eventually become absorbed by
Cisco's IoT platform and therefore cease to remain a truly independent data virtualization
product.
High price and complicated license models: Because data virtualization needs to be
complemented by other data integration styles, end-user organizations often have limited
budgets for data virtualization tools. A growing number of Gartner inquiry clients have reported
that Cisco's pricing and complicated licensing models have prevented them from adopting CIS
or have created procurement challenges.
Lack of skilled resources and inadequate knowledge transfer: Cisco reference customers as
well as Gartner inquiry clients report challenges in finding qualified developers in the market
and cite inadequate knowledge transfer after the product "went live." Cisco is addressing these
concerns in the product roadmap through establishing a knowledge base, communities,
enhanced self-service and expanding its implementation partner base.
Denodo
Based in Palo Alto, California, U.S., Denodo offers the Denodo Platform as its data virtualization
offering. Denodo's customer base for this product set is estimated to be around 380
organizations.
STRENGTHS
Strong momentum, growth and mind share within the data virtualization submarket: The
Denodo Platform is mentioned in almost all competitive situations involving data virtualization
under Gartner's contract review service — a significant increase during the past year that is
highlighted by expanding partnerships with technology and service providers. Denodo has
added offerings on the AWS marketplace for subscription and pay-as-you-go-based licensing
options. Denodo is gaining momentum in 2017, and is one of only two data-virtualization-
focused vendors present on the Magic Quadrant.
Broad connectivity and streaming support: All available connectors are included
within the Denodo Platform's cost. Denodo has also increased its connectivity support for
streaming data (with support for Kafka message queues, Apache Storm, Spark Streaming, and
so on); and cloud services on the Amazon Web Services (AWS) and Azure marketplaces; as
well as partnerships with database platform as a service (dbPaaS) vendors such as AWS
Redshift and Snowflake. Denodo can also interoperate with Docker technology.
Mitigation for traditional virtualization issues: Denodo Platform is a mature data virtualization
offering that incorporates dynamic query optimization as a key value point. This capability
includes support for cost-based optimization specifically for high data volume and complexity;
cloud environments; predicate optimization techniques, including full and partial aggregation
pushdown below joins; and partitioned unions for logical data warehouse and data lake
requirements. It has also added an in-memory data grid with Massively Parallel Processing
(MPP) architecture to its platform in supporting incremental caching of large datasets, reusing
complex transformations and persistence of federated data stores.
CAUTIONS
Challenged to support multiple data delivery styles: Denodo has limited support for ETL,
CDC/replication and messaging. Denodo is incorporating features for data quality and data
governance into its platform, but its mind share in these related markets continues to be lower
than that of other incumbent vendors with these capabilities.
Pricing issues and contract flexibility: Some of Denodo's existing customers (in reference surveys, reference interviews and one-on-one discussions at Gartner events) report the total
cost of ownership (TCO) and high price points (particularly for its enterprise unlimited licensing
option) as barriers to adoption. Reference customers also indicate that the average
maintenance and support paid on a yearly basis is high.
Various user experience issues due to growth: Denodo customers indicate good support and
responsiveness to customer queries; however, a small but significant number report their
desire for better documentation, training (particularly on new features), customer support
(largely driven by log reviews) and user experience. A recent release of new web-based documentation and enhancements to the integrated development environment (IDE) are among Denodo's initiatives focused on customer experience.
IBM
Based in Armonk, New York, U.S., IBM offers the following data integration products: IBM
InfoSphere Information Server Enterprise Edition, IBM InfoSphere Information Server Enterprise
Hypervisor Edition, IBM InfoSphere Federation Server, IBM InfoSphere Data Replication, IBM Data
Integration for Enterprise, IBM Data Integration for Hadoop, IBM BigInsights BigIntegrate,
IBM Streams and IBM Bluemix Data Connect (previously DataWorks). IBM's customer base for
this product set is estimated to be more than 10,700 organizations.
STRENGTHS
Innovation aligned with traditional and emerging market trends: IBM's stack includes all
integration styles for enterprise data integration projects requiring a mix of data granularity and
latencies. In 2017, the solution offers expanded capabilities for Avro, Parquet, Kafka, Hive,
Ambari, Kerberos and other open-source solutions. For cloud, IBM offers CDC for Hadoop via
its WebHDFS component (the REST interface to the Hadoop Distributed File System), and Bluemix for hybrid
integration scenarios.
Global presence and mind share: IBM continues to successfully draw on its global presence
and extensive experience and mind share in data integration and management. It is frequently
mentioned by users of Gartner's inquiry service and often features in competitive evaluations
on our contract and RFP reviews. It continues to gain traction due to its significant ecosystem
of partners, value-added resellers, consultants and external service providers.
Comprehensive portfolio meets integration demands across diverse personas: IBM continues
to invest in data quality, data governance, MDM and application integration (through rapid
growth in iPaaS). Its investments in a design module with separate interfaces for users of varying integration skill levels (for example, integration specialists and citizen integrators), along with Bluemix Data Connect (for self-service data preparation) and Watson Analytics, replace the aging Information Analyzer workbench.
CAUTIONS
Confusion over integration across product set and product messaging: Reference customers
cited confusion about IBM's varied portfolio of offerings — a perennial issue. Difficulties in
understanding integrated deployments as more products are added (including integration of
IBM's data integration tools alongside other IBM products), are a challenge amid the growing
usage scenarios in data integration. They also expressed confusion over IBM's frequent
renaming of its solutions, which could lead to redundancy and shelfware.
User issues regarding installation, migrations and upgrades: Although IBM continues to
address migration complexity through in-place migrations and significant improvements to its
user experience, some reference customers still reported continuing upgrade difficulties with
some versions, indicating a need for IBM to further improve the user experience offered by its
data integration tools.
High costs for licensing models and perceived TCO: Gartner inquiry clients regularly cite high
TCO as one of the primary reasons for overlooking IBM in competitive situations. This, coupled
with the complexity of its licensing options and confusion regarding new cost models, was
identified as a major inhibitor of adoption. While IBM's provision of varied licensing approaches
— such as processor value unit (PVU), node-, workgroup-, bundle-, subscription (monthly and
yearly)- and perpetual-based models — is intended to provide more choice, it has reportedly
also confused customers.
Informatica
Based in Redwood City, California, U.S., Informatica offers the following data integration
products: Informatica Platform (including PowerCenter, PowerExchange, Data Replication,
Advanced Data Transformation, Ultra Messaging, B2B Data Transformation, B2B Data Exchange,
Data Integration Hub), Informatica Data Services, Informatica Intelligent Cloud Services, Cloud
Integration Hub, Big Data Management, Big Data Integration Hub, Informatica Intelligent
Streaming, Informatica Intelligent Data Lake and Informatica Data Preparation. Informatica's
customer base for this product set is estimated to be more than 7,000 organizations.
STRENGTHS
Strong vision for product strategy across cloud and big data management: Informatica
continues to deliver on its vision of a unified integrated platform for all data delivery styles, for
a broad range of use cases. Strong interoperability and synergies between Informatica's data
integration tools and other data management technologies encourage usage as an enterprise
standard that links with data quality, MDM, metadata management, data governance, data hub,
data lake, big data analytics, data security and iPaaS technologies.
Roadmap aligned with emerging trends and multipersona demand: Informatica's Cloud-scale
AI-powered Real-time Engine (CLAIRE) technology for metadata-driven AI along with strong
integration with its Enterprise Information Catalog (its metadata management tool) delivers on
the emerging trends vision with an effective metadata management and data governance
approach for data in cloud, on-premises and big data environments. Informatica also offers
self-service data preparation for diverse personas that range from centralized IT to citizen
integrators.
Broad market presence, global reach and mind share: Informatica is the leading brand in the data integration market, appearing in just over 70% of all contract reviews and in the vast majority of competitive situations in our inquiry calls. It has a significant
presence in established as well as new or growing geographies for data integration, with a well-
established global network of more than 500 partners.
CAUTIONS
Poor perceptions of pricing: Informatica's data integration tools are historically perceived as
being high-cost, with hardware-based perpetual license models. Informatica is executing a shift
toward subscription pricing models, including pay-as-you-go and hourly pricing models, and
offering PowerCenter and Big Data Management on the AWS and Azure Marketplaces.
Reference customers still cite their concerns over Informatica's strategy to continue charging
separately for some targeted connectors.
Lack of clarity in product messaging and portfolio architecture: Gartner continues to receive
inquiry calls citing confusion about Informatica's overlapping product functionality and
features. Reference and inquiry customers seem confused and are buying similar products
with overlapping features and capabilities, which often leads to shelfware or redundancy.
Informatica hopes its rebranding and positioning of the company, redesign of its website, and
new sales and partner enablement programs will address some of these concerns.
Challenges for new market offerings and global strategy: Informatica now focuses
extensively on hybrid data integration, cloud data integration, big data management and
metadata management. It is also expanding its focus on industry-specific offerings, bundled
solutions and non-IT user personas. It's still early days, and Informatica would need a multiyear
vision and change management execution — complete with significant sales training — in order
to execute well on this strategy.
Information Builders
Based in New York City, New York, U.S., Information Builders offers the following data integration
products: the iWay Integration Suite (composed of iWay Service Manager and iWay DataMigrator)
and iWay Universal Adapter Suite. Information Builders' customer base for this product set is
estimated to be more than 840 organizations.
STRENGTHS
Positive customer experience: Information Builders' reference customers continue to report
overall satisfaction (in the top 25% for this Magic Quadrant) with the product and its balance of
features and functionality; the only caveat being the ease of use of the interface (described as
"moderate"). The broad connectivity of the overall platform and the strength of the technology
are enhanced by the customer experience — for issue resolution and best practices — from its
professional services; customers report a feeling of "partnership." Additionally, Information
Builders has worked with and promoted interactions in its user community to facilitate peer-to-
peer support.
Diverse data integration capability: In a continuation of last year's theme, Information Builders
is improving on the model for "making big data normal." With the capability to combine
bulk/batch with message-oriented data (for real-time) as well as with diverse assets
(commonly referred to as "unstructured" data), this vendor has demonstrated some very
complex data integration implementations during the past 12 months.
Alignment with essential data management and integration needs: The breadth of the product
portfolio and Information Builders' experience in deployments of various data integration styles
aligns it well with the contemporary trends of related markets (for example, its Omni-Gen
platform aligns data integration with the adjacent technologies of data quality and MDM).
CAUTIONS
Lack of appeal for diverse roles: Information Builders still appeals mainly to technical
communities and IT buyers, though the focus on self-service data preparation has begun to
gain some traction. That said, this vendor has maintained its position even though the entire
market has shifted down and to the left on the Magic Quadrant.
Skills hard to find and training materials and documentation need improvement: Recent
reference customer input indicates that last year's documentation weakness persists, and
specifically states that finding online materials is difficult. Customers also report that it is very
difficult to locate experienced resources, and recommend aligning your team early when
choosing Information Builders.
Inability to gain mind share: Gartner's client inquiries indicate that Information Builders is not
considered as frequently as its market-leading competitors. This relative lack of mainstream
recognition represents a disadvantage that needs addressing.
Microsoft
Based in Redmond, Washington, U.S., Microsoft offers data integration capabilities via SQL
Server Integration Services (SSIS), which is included in the SQL Server DBMS license. Microsoft
also includes data integration as part of the Azure Data Factory. Microsoft SQL Server
deployments are inclusive of SSIS for data integration (Microsoft does not report a specific
customer count for SSIS).
STRENGTHS
Productivity and time to value: SSIS supports connectivity to diverse data types and broad
deployment in Microsoft-centric environments. Wide use of SSIS by SQL Server customers has
resulted in widely available community support, training and third-party documentation on
implementation practices and approaches to problem resolution.
Synergies among data, applications, business roles and artificial intelligence: SSIS is often
used to put data into SQL Server, to enable analytics, data management and end-user data
manipulation using Microsoft's Office tools, particularly Excel. Using SSIS in conjunction with
Microsoft's BizTalk and Azure Data Factory platforms enables delivery of data from enterprise
business workflows and data preparation. Microsoft builds its data integration customer base
by embedding it into use cases for other Microsoft products (for example, data delivery to the
Cortana Intelligence Suite to establish synergy between data integration, cloud and cognitive
computing capability).
Brand awareness and market presence: Microsoft's size and global presence provide a huge
customer base and a distribution model that supports both direct and channel partner sales.
Adoptions continue, because Microsoft products are among the most familiar to implementers
when considering interfaces, development tools and functionality.
CAUTIONS
Integration lacking in the portfolio: Microsoft is a large company with somewhat isolated product roadmaps and no significant indication that this will change in the near future. Customers face difficulty in the integrated deployment of its expanding range of offerings and functionality, and in discerning optimal ways to manipulate and deliver data of interest alongside Azure Data Factory, data quality and governance activities.
Evolution perceived as Microsoft-focused: There are concerns among customers about
Microsoft's roadmap becoming tightly linked to Azure- and Cortana-related platforms,
somewhat contradictory to its emphasis on the "any to any" heterogeneous needs of data
integration. This is nothing new to either Microsoft or its customers and most can proceed as
they have in the past.
Limited platform support: The inability to deploy SSIS workloads in non-Windows
environments is a limitation for customers wanting to draw on the processing power of diverse
hardware and operating environments. Microsoft has release plans for SQL Server 2017 on
Linux.
Oracle
Based in Redwood Shores, California, U.S., Oracle offers the following data integration products:
Oracle Data Integrator (ODI), Oracle Data Integrator Cloud Service, Oracle GoldenGate, Oracle
GoldenGate Cloud Service, Oracle Data Service Integrator and Oracle Service Bus. Oracle's
customer base for this product set is estimated at more than 10,800 organizations.
STRENGTHS
Customer base directs product innovations: Oracle has invested in and rolled out several
focused upgrades to its data integration product portfolio during 2016. GoldenGate's "zero
footprint" technology means that it doesn't have to be installed on the source or target system.
Oracle has ensured tight integration between ODI and GoldenGate via Kafka for the delivery of
data in real time, opening deployment options to include Kappa or Lambda architectures for
stream analytics. ODI's ability to push down processing to Spark augurs well for big-data-
related use cases. Reference customers report that Oracle's knowledge modules are an
effective way to decouple business logic from ETL code.
Role and self-service support: Synergy of data integration alongside complementary
capabilities in Oracle Big Data Preparation (which is now also available as a cloud service)
adds support for natural-language processing and graph features, along with improved
integration with its metadata manager and machine-learning capabilities to empower non-IT
roles (particularly citizen integrators) to build, test and then operationalize integration flows.
Global partner network eases implementation worries: Oracle uses its large partner network to help customers find implementation service providers for its data integration technologies. Along with this, Oracle has a diverse portfolio of solutions for
data integration and supporting data management technologies — including iPaaS, metadata
management, data quality, and data governance — which allow existing customers to expand
with Oracle.
CAUTIONS
Issues with pricing flexibility and perceived value: Calls with Gartner customers on our
contract review and inquiry services, along with ratings on our Peer Insights, indicate concerns
with the cost model for customers desiring more flexibility in license models and pricing
options — representing one of the biggest reasons for prospects choosing the competition
over Oracle on our contract reviews. Through a range of term-based, subscription-based, named and metered pricing options (even hourly on GoldenGate and ODI), Oracle is trying to address this concern with more flexible packaging.
Difficulties with migration and upgrades: Reference customers continue to cite issues with
migration to the newer versions of Oracle's data integration tools. Some existing customers
reported significant challenges with bugs in existing versions and upgrades impacted by these
bugs — causing Oracle's scores to be lower than average for this customer satisfaction
category.
Limited appeal to new users for new and extended offerings: Oracle appears in less than 30% of data integration competitive situations arising in inquiries from Gartner clients. Gartner
recognizes that Oracle does have capable solutions across the breadth of data integration
styles; however, its existing and new customers seem unaware of the maturity and relevance of
these solutions across the range of modern integration needs.
Pentaho
Pentaho is a Hitachi Group Company with its global headquarters in Orlando, Florida, U.S., and
worldwide sales based in San Francisco, California, U.S. Pentaho does not report customer
counts for specific products, but has more than 1,500 commercial customers.
STRENGTHS
Broadening use-case-agnostic integration solution: Pentaho Data Integration (PDI) provides
data integration across a broad spectrum of relational DBMSs, Java Database Connectivity
(JDBC)/Open Database Connectivity (ODBC) access, and cloud-based data management
solutions. For more than three years, Pentaho has positioned its data integration
tool as an agnostic solution that is increasingly capable of delivering against independent
targets and enterprise-class demands. PDI includes a large number of prebuilt data access and
preparation components, a rich GUI for data engineers, orchestration of integration
components and an integrated scheduler that can interoperate with enterprise system
schedulers.
Experience in cloud, on-premises and hybrid: Pentaho's customer reference base includes
examples of all three deployment models of data integration, including very large customers
across back-office, IoT and machine/sensor data solutions, as well as traditional data
integration demands. Loads to Amazon Redshift and integration with Amazon Elastic
MapReduce (EMR), and Cloudera, as well as embedded R, Python and Spark Machine Learning
library (MLlib) models in the integration stream, capitalize on deployment needs.
Market-awareness of open source and roles: PDI already works well within Apache Spark and
other distributed processing environments, and is addressing issues such as load balancing
with task isolation to enhance distributed processing operations. Pentaho leverages open-
source solutions (such as Kafka) to mix real-time integration with batch/bulk capability.
Pentaho's existing BI capability has been added to PDI, allowing users to visualize data integration results inline and identify data quality problems before moving to production deployments.
CAUTIONS
Market mind share is low: Organizations that are not familiar with PDI still consider Pentaho to
be more of an analytics vendor — which limits the interest expressed by the market. In
response, Pentaho has improved its marketing as a stand-alone data integration tool provider.
PDI pull-through is exhibited in the ability of Pentaho to sell a complete analytic platform to
customers after PDI is in place. The Hitachi brand is also showing pull-through, especially for
major global brands.
Development environment can be improved: Pentaho's customer references report issues with error handling in job execution and a need for more extensive developer feedback. As an open-source-based solution this is to be expected in part, and the user documentation is reported as being above expectations (for open source), which also helps.
Focus on big data detracts from other use cases: With good capabilities for Hadoop and other
big-data-related solutions, as well as satisfactory throughput in batch processes (Hadoop is
effectively also batch), it is important to keep the use cases aligned with the tools' capabilities.
The redistribution and isolation of workloads to improve performance exhibit some limitations,
which Pentaho plans to resolve in its next release.
SAP
Based in Walldorf, Germany, SAP offers the following data integration products: SAP Data
Services, SAP Replication Server, SAP Landscape Transformation Replication Server, SAP Remote
Data Sync, SAP Data Hub, SAP Hana platform, SAP Cloud Platform Integration and SAP Event
Stream Processor. SAP's customer base for this product set is estimated at more than 27,000
organizations.
STRENGTHS
Solution strategy leverages product breadth: SAP continues to deliver strong data integration
functionality in a broad range of use cases to its customers. A mix of granularity, latency, and
physical and virtualized data delivery supported alongside complementary offerings of iPaaS,
self-service data preparation, information governance and MDM, all combine to position SAP
data integration solutions for complex problems in the SAP ecosystems. Introduction of a "try
and buy" licensing approach enables organizations to incrementally gain familiarity with role-
based functionality prior to purchase.
Relevance for highly connected and distributed digital architecture: SAP's investment in
serverless and in-memory computing, cluster management, and its data hub offering to support
data integration architecture extends distributed pushdown executions to capitalize on the
processing capacity of cloud DBMS, cloud storage and IoT infrastructures. SAP's roadmap in
this market includes rule-based autoscaling and self-adjusting optimization for data integration
workstreams, in order to capitalize on machine learning.
Synergy across data and application integration, governance and analytics solutions:
Assimilating diverse data integration tooling in SAP's unified digital data platform vision
enables components to share metadata, one design environment and administrative tooling,
and to operate as a hub that supports data management, analytics and application
requirements. A huge global customer base using its diverse data management infrastructures
and applications has led to extensive adoption of SAP's data integration tools.
CAUTIONS
Overcoming market perception of "SAP focus": SAP's approach to product development and
delivery emphasizes differentiation through its strategic SAP Hana platform and is well-
positioned for customers using or moving to Hana; however, it is perceived less favorably by
prospective buyers whose data doesn't predominantly reside in the SAP ecosystem.
Too many entry points into the organization: Reference clients cite
difficulties with the integrated deployment of SAP's offerings across its portfolio as more
products are added. This often takes place when the enterprise begins to address the growing
scale and complexity of usage scenarios in data integration activities that have been deployed
independently across the organization. SAP hopes to address the concern of too many entry
points with its newly released SAP Data Hub.
Latent concerns regarding customer support, service experience and skills: While customer
experience for SAP has improved overall, reference customer feedback indicates concerns
about both the processes for obtaining product support and the quality and consistency of
support services and skill availability, as areas needing further progress.
SAS
Based in Cary, North Carolina, U.S., SAS offers the following data integration products: SAS Data
Management (including SAS Data Integration Server and SAS Data Quality), SAS Federation
Server, SAS/Access interfaces, SAS Data Loader for Hadoop and SAS Event Stream Processing.
SAS's customer base for this product set is estimated to be 14,000 organizations.
STRENGTHS
Continued breadth (and integrated nature) of offerings: SAS leverages a strong metadata
strategy to link its MDM and data quality offerings with its data integration capabilities (which
are broad enough to include migration, bulk/batch, virtualization, message-based,
synchronization and streams processing). This breadth is particularly useful when processing,
managing and enriching data for analytics.
Embedded data integration for operational decisions: SAS Decision Manager is specifically
called out for its ability to create data correlations that flow into operational decision models.
This is a subtle advantage that is not entirely unique to SAS, but where SAS has strong
credentials to enhance its credibility. If a decision model is developed, it may actually require
different data inputs when deployed from many different sources — and, based upon the
business process model, may occur at different points along the process stream. SAS can
restate the metadata used to render the source inputs, and allow for a true "develop once, use
many" model that is traceable throughout the organization.
Interactive development: Most modern data integration tools include highly interactive
development interfaces that allow users to manipulate and work with data; in SAS's case,
another leverage point for metadata is the rapid integration of new sources. The introduction of
machine learning through SAS Viya has created a link between enterprise data integration and
data discovery. The close relationship between data auditing and profiling, and how they are
integrated with data integration development, is the key to quickly adding new analytics data
sources.
CAUTIONS
Market mind share constrained at times to SAS installed base: SAS data integration products
are still considered by many users as being specific add-on products for supporting SAS
analytics solutions. However, beginning before 2014 and continuing into 2017, SAS data
integration solutions have expanded their use cases beyond analytics only. SAS has
maintained a strategy of enhancing data integration to support analytics throughout the
duration of its participation in this market. It may simply not be possible (or even desirable) to
add a significant number of customers beyond SAS's core market, which is already a
substantial customer base.
Some inconsistent usage experiences: SAS Data Integration Studio is a component of many
other SAS products; as such, some customers express increasing difficulty in ensuring full cross-
platform and complex-use-case compatibility. SAS continues to chase unexpected issues
(previously, fixes have been needed for ODS Graphics procedures, temporary file and directory
issues, inconsistent migration between product versions, inconsistent database references and
more) that are often related to isolated metadata integration needs and detract from the
otherwise significant metadata capabilities of its products.
Installation, upgrade, and migration concerns: Version upgrade difficulties are cited as
concerns by SAS's reference customers, highlighting the need to improve ease of installation,
reduce product complexity, and increase self-guided support for simplifying migration. The
roadmap of SAS Viya sets out to improve the installation and upgrade experience of
customers.
Syncsort
Based in Pearl River, New York, U.S., Syncsort offers DMX, DMX-h and Ironstream. Syncsort's
customer base for this product set is estimated to be around 2,000 organizations.
STRENGTHS
Attractive cost and low TCO: Syncsort's reference customers praise its competitive pricing and low TCO compared with other leading data integration vendors. Overall, low TCO is often cited by its customers as one of the main reasons for choosing Syncsort.
High-performance ETL products: Syncsort has built its reputation on high-performance ETL and big data integration tools. Its "design once, deploy anywhere" architecture enables both on-premises and in-cloud deployments and extends the flexibility and scalability of data integration processing. In recent years, it has built strong partnerships
with well-known brands in big data such as Cloudera, Hortonworks and Splunk. Many
customers leverage Syncsort to integrate mainframe data and Hadoop.
Improved position within the broader data management market: The acquisition of Trillium (a
data quality tool leader) in November 2016 has positioned Syncsort to offer a more
comprehensive data integration and data quality solution. Syncsort has improved its ability to
support business-centric use cases such as customer 360-degree view, fraud detection, and
data governance.
CAUTIONS
Lack of mind share: Syncsort's data integration marketing and sales strategies are rooted in its
expertise and strengths in accessing and integrating mainframe data. This has worked well for
building a niche in the market with competitive differentiation, and a loyal customer base, but
the same tactic also works against Syncsort as it expands into new markets. Gartner inquiry
customers report little awareness of the Syncsort brand beyond the mainframe market, even
though it has expanded its offerings to embrace Hadoop and the cloud; for example, accessing
the data warehouse and integrating data into Hadoop data lakes.
Anticipated slow integration following the Trillium acquisition: Although Syncsort has crafted
a new product offering strategy based on the combined solutions of Syncsort and Trillium, and
has taken initial steps toward integrating them for data lake governance and customer 360-
degree use cases, we do not yet have a clear idea of how well these two sets of products will
integrate with each other on an architectural level (that is, shared metadata and common UIs).
Without this deeper level of integration, there would be fewer synergistic benefits for
customers.
Limited range of data integration capabilities: Although Syncsort's functionality continues to
be extended (for example, ingestion of the mainframe log data stream), its predominant data
integration capabilities remain ETL-centric. Although this reflects a specialization that Syncsort
chooses to focus on in order to better serve its customers, it also presents a competitive
disadvantage when data integration requirements include data virtualization. Syncsort is
expanding its integration styles through recently added real-time CDC capabilities for
populating Hadoop data lakes with changes in mainframe data, and certified support for
messaging integration (for Kafka, MapR Technologies MapR Streams, Apache NiFi and
connectivity to IBM MQ and Pivotal's RabbitMQ).
Talend
Based in Redwood City, California, U.S., Talend offers Talend Open Studio, Talend Data Fabric,
Talend Data Management Platform, Talend Platform for Big Data, Talend Data Services Platform,
Talend Integration Cloud and Talend Data Preparation. Talend's paying customer base for this
product portfolio is estimated at more than 1,500 organizations.
STRENGTHS
Cost model and flexibility: Through a scalable licensing model based on a per-developer
subscription fee, Talend allows customers to start with small, core data integration projects
and then grow their portfolio for more advanced data integration projects (such as integration
with Hadoop data stores).
Integrated portfolio for data integration and interoperability with complementary
technologies: Talend possesses a comprehensive portfolio of data integration and related
technology (including data quality, MDM, ESB, application integration and metadata
management), interoperates with Docker, and has recently added iPaaS and data preparation
capabilities. Gartner inquiry and reference customers alike report a robust product set, which
allows them to build and execute end-to-end data management projects and use cases and to
capitalize on data integration use cases that require synergy with their related technologies.
Strength in core data integration capabilities and delivery for evolving trends: Customers and
prospects are still drawn to Talend's robust core data integration capabilities (including the
bulk/batch movement of data), which continue to draw in a significant proportion of its buyer
base. Talend also has products catering to current and evolving market needs, including its
iPaaS offering (now supporting AWS, Google Cloud Platform and Microsoft Azure integration)
and data preparation; significant investment in data integration operations running natively on
Hadoop, and evolving operational use cases (in the Apache Storm and Apache Spark environments); planned features for data lake governance; and partnerships with Cloudera and Hortonworks for integration with Cloudera Navigator and Apache Atlas, respectively.
CAUTIONS
New release stability and implementation support: Reference customers' adoption
experiences have sometimes included problems with the stability and performance of Talend's
new releases, and also with finding adequate partners/skilled resources that are adept with
Talend design and implementation. Talend has launched a new partner certification program
and is working with partners to design new reference architectures.
Developer focus: Talend has its roots in open source, and with the technical community in
general. Current market momentum is moving strongly toward enabling multiple personas and
self-service integration options for nontechnical users. Talend has started addressing more
personas with self-service via Talend Data Preparation, hybrid cloud integration capabilities
through iPaaS, and support for information stewardship. Talend is investing in a new partner
certification program and training for partners and customers.
Lack of market awareness beyond bulk/batch: While Talend's capabilities resonate well with
traditional data delivery styles, reference customer concerns and Gartner inquiries both
indicate a need to increase awareness of its support for other data integration styles
(particularly replication/synchronization of data for real-time integration and data
virtualization). More comprehensive and integrated metadata management support across its
product portfolio is also desired.
Vendors Added and Dropped
We review and adjust our inclusion criteria for Magic Quadrants as markets change. As a result
of these adjustments, the mix of vendors in any Magic Quadrant may change over time. A
vendor's appearance in a Magic Quadrant one year and not the next does not necessarily indicate
that we have changed our opinion of that vendor. It may be a reflection of a change in the market
and, therefore, changed evaluation criteria, or of a change of focus by that vendor.
Added
Pentaho
Dropped
None
Inclusion and Exclusion Criteria
The inclusion criteria represent the specific attributes that analysts believe are necessary for
inclusion in this research.
To be included in this Magic Quadrant, vendors must possess within their technology portfolio
the subset of capabilities identified by Gartner as the most critical from within the overall range
of capabilities expected of data integration tools. Specifically, vendors must deliver the following
functional requirements:
Data delivery modes support — At least three modes are supported among bulk/batch data
movement, federated/virtualized views, message-oriented delivery, data replication,
streaming/event data, and synchronization. Vendors whose customer reference base fails to
represent, in any mix of their products' use, three of the following seven technical deployment
styles will be excluded:
Bulk/batch includes single or multipass/step processing that operates on the entire contents of
the data file after an initial input or read of the file is completed from a given source or
multiple sources. All processes take place on multiple records within the data integration
application before the records are released for any other data-consuming application.
Message-oriented delivery utilizes a single record in an encapsulated object that may or may
not include an internally defined structure (such as XML) or an externally defined structure
(such as electronic data interchange), delivered for action to the data integration process.
Virtualization is the utilization of logical views of data, which may or may not be cached in
various forms within the data integration application server or systems/memory managed by
that application server. Virtualization may or may not include redefinition of the sourced data.
Replication is a simple copy of data from one location to another, always in a physical
repository. Replication can be a basis for all other types of data integration, but specifically
does not change the form, structure or content of the data it moves.
Synchronization can utilize any other form of data integration, but specifically focuses on
establishing and maintaining consistency between two separate and independently managed
create, read, update, delete (CRUD) instances of a shared, logically consistent data model for
an operational data consistency use case (the instances may or may not be on the same data
management platform). Synchronization also maintains and resolves instances of data
collision, with the capability to establish embedded decision rules for resolving such
collisions.
Streaming/event data consists of datasets that follow a consistent content and structure
over long periods of time and large numbers of records, and that effectively report status
changes for the connected device or application or continuously update records with new
values. Streaming/event processing includes the ability to incorporate event models, inferred
row-to-row integrity, and variations of either the models or the inferred integrity, with
alternative outcomes that may or may not be aggregated and/or parsed into separate event
streams from the same continuous stream. The logic for this approach is embedded in the
data stream processing code.
Data services bus (SOA) capability is the ability to deploy any of the various data integration
styles, but with specific capability to interoperate with application services (logic flows,
interfaces, end-user interfaces, and so on) and pass instructions to, and receive instructions
from, those other services on the bus. Data services bus includes auditing to assist in service
bus management, either internally or by passing audit metadata to another participating
service on the bus.
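To make the contrast among these styles concrete, consider the following minimal sketch. It is our own illustration, not drawn from any vendor's product, and the record layout and function names are hypothetical; it shows a bulk/batch step that holds every record until the whole input is transformed, versus a streaming/event step that releases each transformed record as it arrives.

from typing import Iterable, Iterator

def bulk_batch(records: Iterable[dict]) -> list:
    # Bulk/batch: transform the entire input; no record is released to
    # downstream consumers until all records have been processed.
    return [dict(r, amount=float(r["amount"])) for r in records]

def streaming(records: Iterable[dict]) -> Iterator[dict]:
    # Streaming/event: transform and emit each record immediately,
    # effectively reporting changes as they occur.
    for r in records:
        yield dict(r, amount=float(r["amount"]))

rows = [{"id": 1, "amount": "9.99"}, {"id": 2, "amount": "12.50"}]
assert bulk_batch(rows) == list(streaming(rows))  # same output, different release timing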
Data transformation support — At a minimum, packaged capabilities for basic transformations
(such as data type conversions, string manipulations and calculations).
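As a minimal illustration of such packaged basic transformations, the sketch below (our own, with hypothetical field names) applies a data type conversion, a string manipulation and a calculation to a single record.

record = {"customer": "  acme corp ", "net": "100.00", "tax_rate": "0.20"}

transformed = {
    "customer": record["customer"].strip().title(),  # string manipulation
    "net": float(record["net"]),                     # data type conversion
    "gross": round(float(record["net"]) * (1 + float(record["tax_rate"])), 2),  # calculation
}

assert transformed == {"customer": "Acme Corp", "net": 100.0, "gross": 120.0}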
Demonstrably broad range of connectivity/adapter support (sources and targets) — Native
access to relational DBMS products, plus access to nonrelational legacy data structures, flat
files, XML and message queues, as well as emerging data asset types (such as JavaScript
Object Notation [JSON]).
Modes of connectivity/adapter support (against a range of sources and targets) must include
support for change detection, leveraging of third-party and native connectors, connection and
read error detection, and integrated error handling for production operations.
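The breadth this criterion asks for can be pictured with a short sketch of our own, in which sqlite3 merely stands in for a relational DBMS and all table and field names are illustrative; it merges a relational table, a flat file and a JSON document into one record stream.

import csv, io, json, sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.execute("INSERT INTO orders VALUES (1, 9.99)")

flat_file = io.StringIO("id,amount\n2,12.50\n")  # stands in for a flat file on disk
json_doc = '[{"id": 3, "amount": 7.25}]'         # an emerging data asset type

records = (
    [{"id": i, "amount": a} for i, a in conn.execute("SELECT id, amount FROM orders")]
    + [{"id": int(r["id"]), "amount": float(r["amount"])} for r in csv.DictReader(flat_file)]
    + json.loads(json_doc)
)
assert [r["id"] for r in records] == [1, 2, 3]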
Metadata and data modeling support — Automated metadata discovery (such as profiling new
data sources for consistency with existing sources), lineage and impact analysis reporting,
ability to synchronize metadata across multiple instances of the tool, and an open metadata
repository, including mechanisms for bidirectional sharing of metadata with other tools.
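Lineage and impact analysis can be understood as reachability over a recorded dependency graph. The sketch below is our own simplification with hypothetical dataset names; tools in this market derive such graphs automatically from job and repository metadata.

from collections import deque

lineage = {  # hypothetical lineage metadata: source -> derived datasets
    "crm.customers": ["staging.customers"],
    "staging.customers": ["warehouse.dim_customer"],
    "warehouse.dim_customer": ["mart.churn_report", "mart.revenue_report"],
}

def impacted(changed):
    # Impact analysis: breadth-first traversal from the changed dataset.
    seen, queue = set(), deque([changed])
    while queue:
        for child in lineage.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

assert impacted("staging.customers") == {
    "warehouse.dim_customer", "mart.churn_report", "mart.revenue_report"}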
User- or role-specific variations in the development interface, capable of various workflow
enhancement mechanisms; these may include support for templates, version modification (via
internal library management or another mechanism), and quality assurance capability via either
audit/monitor metadata (manual) or embedded workflows (administrator tools).
Design and development support — Graphical design/development environment and team
development capabilities (such as version control and collaboration). This includes multiple
versions running on disparate platforms, multiple instances of service deployments in
production environments, and alternative or collaborating development environments.
Runtime platform support — Windows, Unix or Linux operating systems, or demonstrated
capability to operate on more than one commercially available cloud environment regardless of
the platform in operation.
Service enablement — The ability to deploy functionality as services, including on multiple
operating platforms. The ability to manage and administer operations across multiple
platforms and environments is strongly desired.
Data governance support — Ability to import, export and directly access metadata with data
profiling and/or data quality tools, master data management tools and data discovery tools.
Accepting business and data management rule updates from data stewardship workflows and
sharing data profiling information with such tools is highly desired. No additional advantage is
perceived in data integration for also delivering actual data governance tools — the focus is
interoperability.
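Because the criterion stresses interoperability rather than delivery of governance tooling itself, a useful mental model is profiling output exported in an open, tool-neutral format that a stewardship or data quality tool can consume. The sketch below is a minimal assumption of ours; the JSON layout and dataset names are hypothetical.

import json
import statistics

values = [42, 17, None, 88, 17]  # a profiled column, including one null

profile = {
    "dataset": "warehouse.dim_customer",  # hypothetical identifiers
    "column": "age",
    "row_count": len(values),
    "null_count": sum(v is None for v in values),
    "distinct_count": len({v for v in values if v is not None}),
    "mean": statistics.mean([v for v in values if v is not None]),
}

exported = json.dumps(profile)   # exported by the data integration tool ...
imported = json.loads(exported)  # ... imported by a governance or quality tool
assert imported["null_count"] == 1 and imported["mean"] == 41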
In addition, vendors had to satisfy the following quantitative requirements regarding their market
penetration and customer base. Vendors must:
Generate at least $25 million of their annual software revenue from data integration tools
(perpetual license, subscription or maintenance/support), or maintain at least 300
maintenance-paying customers for their data integration tools. Gartner will use as many
independent resources as possible to validate the information provided.
Support data integration tool customers in at least two of the following geographic regions:
North America, South America, Europe and Asia/Pacific.
Demonstrated market presence will also be reviewed, and can be assessed through internal
Gartner search, external search engines, Gartner inquiry interest, technical press presence and
activity in user groups or posts. A relative lack of market presence can be grounds for
excluding a product/service offering.
Vendors could be excluded if they focus on narrow use cases that are too specific for broader
market application. Some vendor/supplier tools were excluded because:
They focused on only one horizontal data subject area; for example, the integration of
customer-identifying data
They focused only on a single vertical industry
They served only their own, internally managed data models and/or architectures (this includes
tools that only ingest data to a single proprietary data repository) or were used by a single
visualization or analytics processing platform
Evaluation Criteria
Ability to Execute
Gartner analysts evaluate technology providers on the quality and efficacy of the processes,
systems, methods or procedures that enable IT providers' performance to be competitive,
efficient and effective, and to positively affect revenue, retention and reputation. Ultimately,
technology providers are judged on their ability to capitalize on their vision, and their success in
doing so.
We evaluate vendors' Ability to Execute in the data integration tool market by using the following
criteria:
Product/Service: Core goods and services that compete in and/or serve the defined market.
This includes current product and service capabilities, quality, feature sets, skills and so on.
This can be of