Confluent unveiled a host of new features for its cloud-based data streaming platform, highlighted by an expansion of its Stream Governance Suite of tools and new sharing capabilities.
The vendor provides Confluent Cloud as a managed service for its cloud customers and Confluent Platform for on-premises users. Both are built on Apache Kafka, an open source technology for data streaming, and enable clients to capture and ingest data from disparate sources as events occur to support real-time decisions.
The new Confluent Cloud capabilities were introduced on May 16 during the Kafka Summit London, a conference organized by Confluent for developers, data engineers and other users of Apache Kafka.
Prior to this most recent update, Confluent's platform improvements included governance for streaming data pipelines in October 2022 and tools that make it easier to integrate Confluent with multi-cloud deployments in July 2022.
Confluent first launched Stream Governance in 2021, providing customers with fully managed governance capabilities for Apache Kafka and other streaming data tools.
In October 2022, the vendor added Stream Governance Advanced, which is designed to help organizations control complex pipelines and manage how streaming data can be shared and used.
Now Confluent has added data quality rules to the governance suite with the aim of ensuring that data streams are ready for consumption and resilient to change over time.
The tool automatically validates the values of individual fields within a data stream to ensure data integrity, enables data engineers to quickly resolve data quality issues with customizable actions, and uses migration rules to convert messages from one data format to another so that a stream remains consistent even as new data is ingested.
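The two mechanisms described above — field-level validation and migration rules that convert between formats — can be illustrated in miniature. The following is a plain-Python sketch of the concept only; the rule logic, record shapes and function names here are hypothetical and are not Confluent's actual data quality rules API.

```python
# Conceptual sketch of stream data quality rules: validate individual
# fields of each record, and migrate old-format records to a new schema.
# Hypothetical example; not Confluent's API.

def validate(record: dict) -> list[str]:
    """Return a list of rule violations for a single stream record."""
    violations = []
    if not isinstance(record.get("user_id"), int) or record["user_id"] <= 0:
        violations.append("user_id must be a positive integer")
    if "@" not in str(record.get("email", "")):
        violations.append("email must contain '@'")
    return violations

def migrate_v1_to_v2(record: dict) -> dict:
    """Migration rule: convert an old-format record to a new schema,
    e.g. splitting a single 'name' field into first/last name fields."""
    first, _, last = record.pop("name", "").partition(" ")
    return {**record, "first_name": first, "last_name": last}

good = {"user_id": 7, "email": "a@example.com"}
bad = {"user_id": -1, "email": "nope"}

assert validate(good) == []          # passes all field rules
assert len(validate(bad)) == 2      # both rules violated

old = {"user_id": 7, "name": "Ada Lovelace"}
print(migrate_v1_to_v2(old))
# {'user_id': 7, 'first_name': 'Ada', 'last_name': 'Lovelace'}
```

In a real streaming pipeline, records failing validation would be routed to a customizable action (such as a dead-letter topic) rather than simply collected, and migrations would be driven by schema versions rather than hardcoded functions.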
Streaming data quality is an ongoing challenge for many organizations, according to Kevin Petrie, an analyst at Eckerson Group. Many organizations have started using third-party data observability tools to address data quality.
As a result, adding data quality rules will likely become an attractive feature for users of Confluent’s platform.
“Enterprises feel constant pain when it comes to data quality,” Petrie said. “By implementing rules for data validation, remediation and schema evolution, Confluent can reduce the risk of quality issues without requiring third-party tools. This reduces administrative effort and provides data consumers with more valuable, reliable input.”
Stewart Bond, an analyst at IDC, similarly noted the potential importance of data quality rules. He cited a recent IDC study, which found that streaming data was one of the least trusted enterprise data sources.
“That’s why Confluent is investing in stream governance capabilities,” Bond said. “In-stream data quality capabilities add a form of data observability to data-in-motion use cases, providing opportunities to identify and correct data quality issues before downstream systems are affected.”
In addition to data quality rules, a tool called Stream Sharing has significant potential for Confluent users, Bond continued.
Stream Sharing is designed to allow customers to share streaming data within their organization as well as with external Kafka users. It does this with built-in authenticated sharing, access management, and other security and governance measures.
Bond said that part of the intent of Kafka is to enable sharing. But the most important thing about Stream Sharing is that it can open one organization's closed environment to other organizations.
“What’s interesting about this announcement is the addition of the external element,” he said. “Business-to-business data exchange is still very complex, using point-to-point APIs, managed file transfers and electronic data interchange. Adding the ability to share data in near real time … could significantly disrupt data exchange between business partners in industry ecosystems.”
Other New Features
Beyond data quality rules and stream sharing, Confluent’s latest platform update includes three other new features:
Custom connectors. Connectors that enable users to link any data system to Confluent Cloud using their organization's own Kafka Connect plugins without code changes, monitor the health of their connectors with logs and metrics, and eliminate the burden of managing connector infrastructure.
Kora. An Apache Kafka engine built for the cloud, Kora is designed to enable Confluent Cloud users to scale significantly faster than before, while also removing data storage limits and powering low-latency workloads.
Early access to Confluent's managed service for Apache Flink. A stream processing tool designed to handle large-scale data streams.
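For context on what "bringing your own connector" means in practice, the open source Kafka Connect framework registers connectors by submitting a small JSON configuration to its REST API. The sketch below builds such a payload in Python; the connector class and topic name are placeholders, and Confluent Cloud's custom connector upload flow differs in its details, but the Connect-style configuration is the underlying idea.

```python
import json

# Illustrative Kafka Connect-style configuration for a user-supplied
# connector plugin. In open source Connect this JSON would be POSTed to
# the REST API's /connectors endpoint. Class and topic are placeholders.
connector = {
    "name": "orders-sink",
    "config": {
        "connector.class": "com.example.MyCustomSinkConnector",  # your own plugin
        "tasks.max": "2",
        "topics": "orders",
    },
}

payload = json.dumps(connector)

# Round-trip check: the serialized payload preserves the config.
assert json.loads(payload)["config"]["topics"] == "orders"
print(payload)
```

The point of the managed feature is that the plugin's runtime — the workers, scaling and monitoring behind a config like this — no longer has to be operated by the user.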
According to Bond, one of the main strengths of Confluent is its close ties with the Kafka community. That close connection includes Jun Rao, co-founder of Confluent and co-creator of Kafka.
As a result, Confluent is often at the forefront of new innovations related to Kafka, as evidenced by the development of custom connectors and Kora. The vendor’s rivals include Cloudera and Tibco as well as tech giants AWS, Google and Microsoft, which all offer data streaming platforms.
“We generally see Confluent ahead of its competitors because of its strong ties and influence in the Kafka community,” Bond said. “Confluent continues to be at the forefront of innovation in Kafka — partly because of its heritage and also because of the experience Confluent has gained managing Kafka environments for customers.”
While Confluent’s relationship with Kafka may be one of its strengths, the focus on a single tool may also hold it back.
Bond said that Kafka is not the only tool that can be used to move streaming data. Therefore, Confluent would be wise to expand its relationship with other event streaming platforms such as Pulsar, a fast-growing competitor to Kafka, he said.
“Confluent wants to be the software vendor that is synonymous with ‘data in motion,’” Bond said. “But Kafka is not the only technology that can be used to move data. While market penetration is lower, Confluent may consider supporting alternative data movement technologies such as Pulsar.”
Meanwhile, Petrie said Confluent’s added support for Apache Flink when it becomes generally available will benefit users and help the vendor stay competitive with its peers.
Most organizations use Kafka, he said. However, many are now adding Flink.
“Flink enables specialized stream processing, which has become increasingly important to support real-time machine learning use cases,” Petrie said. “So it makes perfect sense for Confluent to add Apache Flink capabilities as well.”
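The stream processing that Flink specializes in centers on computing aggregates over unbounded streams in fixed time windows. The toy below shows that idea in plain Python — a tumbling-window event count — purely to illustrate the concept; it is not Flink's DataStream or Table API.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_size_s):
    """Count events per key in non-overlapping (tumbling) time windows.

    events: iterable of (timestamp_seconds, key) pairs.
    Returns {(window_start, key): count}.
    """
    counts = defaultdict(int)
    for ts, key in events:
        # Each event falls into exactly one fixed-size window.
        window_start = (ts // window_size_s) * window_size_s
        counts[(window_start, key)] += 1
    return dict(counts)

stream = [(1, "click"), (3, "click"), (7, "view"), (11, "click")]
print(tumbling_window_counts(stream, 5))
# {(0, 'click'): 2, (5, 'view'): 1, (10, 'click'): 1}
```

A real stream processor adds what this sketch omits — out-of-order event handling via watermarks, fault-tolerant state, and continuous rather than batch evaluation — which is precisely why purpose-built engines like Flink matter for real-time use cases.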
While Confluent’s latest platform update includes features aimed at better streaming data governance and expanded streaming data sharing, it doesn’t include a technology that many data management and analytics vendors are now trying to incorporate into their platforms: generative AI.
In the six months since OpenAI introduced ChatGPT, not only have Microsoft and Google used the technology to improve their search engines, but data vendors, despite concerns about its security and accuracy, have also begun developing capabilities to improve their query, search and machine learning tools.
For example, both Tableau and ThoughtSpot are among analytics vendors integrating generative AI across their platforms, while Informatica is adding generative AI to its existing AI engine.
According to Bond, generative AI is, therefore, something that Confluent could potentially add across its platform.
“There’s obviously a lot of hype right now regarding generative AI,” he said. “It will be interesting to see how Confluent will address this in the context of Kafka and Confluent Cloud.”
Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.