CIO Bulletin

Confluent for a Single, Real-time Source of Data – Data that Drives Businesses

Many digital businesses today are bombarded with data from multiple sources such as networking apps, IoT, and more. Given the value this Big Data can generate, most businesses want a clean, optimized system that helps them capitalize on it. Most traditional systems are already on the verge of breaking down, incapable of meeting the needs of a modern data-driven organization. Confluent and Apache Kafka together provide a streaming platform that enables enterprises to maximize the value of their data. Confluent was founded by the creators of Apache Kafka, the open-source stream-processing platform. Founded in 2014, the company is based in Palo Alto, California.

Confluent, which recently made the Forbes Cloud 100 list, also received Datanami’s 2018 Readers’ and Editors’ Choice Awards. Neha Narkhede, who co-founded Confluent along with Jay Kreps, was ranked #3 in Quartz at Work’s list of a rising generation of female entrepreneurs in the U.S. The four-year-old company has over 300 employees, up from 3 in 2014, has raised nearly $81 million, and is already competing with giants like Oracle and IBM in the multibillion-dollar middleware market.

Jay Kreps, the co-founder and CEO of Confluent, said, “We prefer to focus on a smaller number of metrics of things we care about. For sales, it’s bookings and the pipeline. For marketing, it’s the quality of the leads they feed to the pipeline. Engineering I can’t measure… building stuff that’s consistent with our vision.”

Confluent Platform and Confluent Cloud

The Apache Kafka-based Confluent Platform brings every part of an organization together around a single source of truth. It enables organizations to see and respond to every event in real time. Kafka is a distributed, fault-tolerant, highly scalable publish-subscribe system for handling real-time data feeds, capable of processing trillions of events a day. It is used by thousands of companies globally.
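To make the publish-subscribe idea concrete, here is a minimal in-memory sketch. It is a toy model only, not Kafka's API: the `ToyBroker` class, its `publish` and `poll` methods, and the topic and subscriber names are all hypothetical and chosen for illustration. It shows the core behavior described above: producers append events to a named topic, and each subscriber reads the same log independently at its own pace.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a publish-subscribe log (not Kafka's real API)."""

    def __init__(self):
        # topic name -> append-only list of events
        self._logs = defaultdict(list)
        # (topic, subscriber) -> index of the next event that subscriber will read
        self._offsets = defaultdict(int)

    def publish(self, topic, event):
        """Append an event to the topic's log and return its position (offset)."""
        self._logs[topic].append(event)
        return len(self._logs[topic]) - 1

    def poll(self, topic, subscriber):
        """Return the events this subscriber has not read yet, then advance its offset."""
        start = self._offsets[(topic, subscriber)]
        events = self._logs[topic][start:]
        self._offsets[(topic, subscriber)] = start + len(events)
        return events


broker = ToyBroker()
broker.publish("orders", {"id": 1})
broker.publish("orders", {"id": 2})

# Two independent subscribers each receive every event on the topic.
billing_events = broker.poll("orders", "billing")
shipping_events = broker.poll("orders", "shipping")
```

A real Kafka cluster adds what this sketch leaves out: the log is partitioned and replicated across machines, which is where the fault tolerance and scalability come from.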

Confluent Platform, in turn, collects, stores, and distributes data between various systems, with Apache Kafka at its core. It serves as an extra layer on top of Kafka: it simplifies connecting data sources to Kafka, implements methods to keep the streams secure, adds tools to optimize and manage Kafka clusters, simplifies building applications on Kafka, and improves Kafka’s integration capabilities. On the whole, the Confluent Platform enables businesses to organize and manage data from many sources using one reliable, high-performance system. It lets businesses focus on deriving value from the data rather than worrying about the mechanisms needed to organize and manage it.

Confluent offers Confluent Platform in two flavors: Confluent Platform Open Source and Confluent Platform Enterprise. Open Source is the right choice for beginners, for those who are just setting up a Kafka-based streaming platform. It includes services and tools frequently used with Kafka: clients for the C, C++, Python, and Go programming languages; connectors for JDBC, Elasticsearch, and HDFS; Schema Registry for managing metadata for Kafka topics; and REST Proxy for integrating with web applications.
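As an illustration of how a web application might hand events to Kafka through REST Proxy, the sketch below builds (but does not send) an HTTP request. The endpoint path and content type follow the REST Proxy v2 API as commonly documented and should be checked against the version in use. The helper `build_produce_request`, the address `http://localhost:8082`, and the `page-views` topic are assumptions made for this example.

```python
import json
import urllib.request


def build_produce_request(base_url, topic, values):
    """Build (without sending) a POST request that produces JSON records to a topic."""
    # Each value is wrapped in a record; the records travel in one JSON body.
    body = json.dumps({"records": [{"value": v} for v in values]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/topics/{topic}",
        data=body,
        # Content type assumed from the REST Proxy v2 API for JSON-encoded records.
        headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
        method="POST",
    )


# Hypothetical proxy address and topic name.
request = build_produce_request(
    "http://localhost:8082",
    "page-views",
    [{"user": "alice", "page": "/home"}],
)

# Against a running REST Proxy, the request could then be sent with:
# urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the payload easy to inspect before anything touches a live cluster.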

The Enterprise version is a more advanced offering designed to address the requirements of modern enterprise streaming applications. The components bundled in this version help organizations build a consistent and flexible enterprise-wide streaming platform for a wide array of use cases. It mainly includes Confluent Control Center for end-to-end monitoring, MDC Replication for managing multiple data center deployments, and Automatic Data Balancing for efficient resource utilization and scalability of Kafka clusters.

Confluent Cloud is a scalable data streaming service in the cloud. It is built on open-source Apache Kafka APIs and offers both a web interface and a local command line interface (CLI). Because the components are open source, users can recreate their data pipelines outside of Confluent Cloud and migrate them whenever required. The platform-agnostic Confluent Cloud also gives businesses the flexibility to lift and shift Kafka applications from one location to another, within or outside Confluent Cloud.

The Winning Strategy

Confluent foresees opportunities across industries. Talking about introducing Kafka to the automotive industry, Kreps said, “We can connect data from every vehicle and enable self-driving and mapping. The product is used by cruise ships and big heavy machinery.” Confluent is one of the fastest-growing startups: it grew its subscription revenue four-fold in 2017 and is “looking forward to doing the same in 2018,” said Kreps. Backing this up is Confluent’s scalable business model. As Kreps explains, “We offer a subscription product and as companies get more value, they pay more.”

The Kafka Master – Jay Kreps

Jay Kreps, co-founder and CEO of Confluent, is the original author of several open-source projects including Apache Kafka, Apache Samza, Voldemort, and others. Previously the lead architect for data infrastructure at LinkedIn, he is an alumnus of the University of California, Santa Cruz.

Kreps helped develop Kafka to streamline data operations and to reduce the complexity that Big Data and technologies like IoT bring to distributed systems. His invention has won business from Netflix, Uber, Airbnb, Goldman Sachs, and Target.

“We are focusing on building a streaming platform to help other companies get easy access to enterprise data as real-time streams.”

“In a data-driven enterprise, how we move our data becomes nearly as important as the data itself. With greater speed and agility, data’s value increases exponentially.”

“Growth requires investment which is not bad if the core economics are sound.”