Streaming Audio: a Confluent podcast about Apache Kafka

Streaming Audio is a podcast from Confluent, the team that built Apache Kafka®️. Host Tim Berglund (Senior Director of Developer Experience, Confluent) and guests unpack a variety of topics surrounding Apache Kafka, event stream processing and real-tim

Paving a Data Highway with Kafka Connect ft. Liz Bennett
Feb 12 • 46 min
The Stitch Fix team benefits from a centralized data integration platform at scale using Apache Kafka and Kafka Connect. Liz Bennett (Software Engineer, Confluent) got to play a key role building their real-time data streaming infrastructure. Liz…
Distributed Systems Engineering with Apache Kafka ft. Jun Rao
Feb 5 • 54 min
Jun Rao (Co-founder, Confluent) explains what relational databases and distributed databases are, how they work, and major differences between the two. He also delves into important lessons he’s learned along the way through the transition from the…
How to Write a Successful Conference Abstract | Streaming Audio Special
Feb 4 • 7 min
Learn how to write an abstract for conference submissions and call for papers with tips from Tim Berglund, chair of the Kafka Summit Program Committee.
Streaming Call of Duty at Activision with Apache Kafka ft. Yaroslav Tkachenko
Jan 27 • 46 min
Yaroslav Tkachenko shares about how matchmaking services, microtransactions, and telemetry statistics all play a role in Activision’s challenging (but fun) event streaming use cases. Learn about how Activision ingests huge amounts of data, what the…
Confluent Platform 5.4 | What’s New in This Release + Updates
Jan 22 • 14 min
A quick summary of new features, updates, and improvements in Confluent Platform 5.4, including Role-Based Access Control (RBAC), Structured Audit Logs, Multi-Region Clusters, Confluent Control Center enhancements, Schema Validation, and the preview for…
Making Apache Kafka Connectors for the Cloud ft. Magesh Nandakumar
Jan 13 • 25 min
Learn about connectors, how they simplify data integrations, and how they’re built for Confluent Cloud for use on major cloud providers like GCP, Azure, and AWS to help implement Apache Kafka within existing systems in an easy way.
Location Data and Geofencing with Apache Kafka ft. Guido Schmutz
Jan 8 • 48 min
One way to put Apache Kafka into action is geofencing and tracking the location data of objects, barges, and cars in real time. Guido Schmutz shares about one such use case involving a German steel company and the development project he worked on for them.
Multi-Cloud Monitoring and Observability with the Metrics API ft. Dustin Cote
Dec 30, 2019 • 42 min
Dustin Cote (Product Manager for Observability, Confluent Cloud) talks about Apache Kafka® made serverless and how beyond just the brokers, Confluent Cloud focuses on fitting into customer systems rather than building monitoring silos.
Apache Kafka and Apache Druid – The Perfect Pair ft. Rachel Pedreschi
Dec 23, 2019 • 50 min
Rachel Pedreschi’s involvement in the open source community focuses primarily on Apache Druid, a real-time, high-performance datastore that provides fast, sub-second analytics and complements another powerful open source project as well: Apache Kafka®.…
Apache Kafka 2.4 – Overview of Latest Features, Updates, and KIPs
Dec 16, 2019 • 15 min
Apache Kafka 2.4 includes new Kafka Core developments and improvements to Kafka Streams and Kafka Connect, including MirrorMaker 2.0, RocksDB metrics, and more.
Cloud-Native Patterns with Cornelia Davis
Dec 16, 2019 • 53 min
Host Tim Berglund catches up with Cornelia Davis, author of Cloud-Native Patterns and VP of Technology at Pivotal, on what cloud-native patterns are, the example code she created, her latest book, and how she wrote the book for the customers she interacts…
Ask Confluent #16: ksqlDB Edition
Dec 12, 2019 • 30 min
Vinoth Chandar and Gwen Shapira discuss what ksqlDB is, the kinds of applications that you can build with it, vulnerabilities, and various ksqlDB use cases. They also talk about what’s currently the best version of Apache Kafka for performance…
Machine Learning with Kafka Streams, Kafka Connect, and ksqlDB ft. Kai Waehner
Dec 4, 2019 • 38 min
Kai Waehner defines machine learning in depth, describes the architecture of his dream machine learning pipeline, shares its relevance to Apache Kafka and the related ecosystem, and discusses the importance of security and fraud detection.
Real-Time Payments with Clojure and Apache Kafka ft. Bobby Calderwood
Nov 27, 2019 • 58 min
Payments leverages Confluent Cloud to help banks of all sizes transform to real-time banking services from traditionally batch-oriented, bankers’ hours operational mode. This is achieved through Apache Kafka® and the Kafka Streams and Kafka Connect APIs…
Announcing ksqlDB ft. Jay Kreps
Nov 20, 2019 • 26 min
Jay Kreps introduces ksqlDB, the event streaming database purpose-built for stream processing applications. As the successor to KSQL, ksqlDB is a specialized database for stream processing on top of Kafka, merging the concepts behind streams of data with…
Installing Apache Kafka with Ansible ft. Viktor Gamov and Justin Manchester
Nov 18, 2019 • 46 min
Ansible keeps your Apache Kafka® deployment, management, and installation consistent, and helps you implement best practices that make it easy to get started. Justin Manchester (Platform DevOps Engineer, Confluent) and Viktor Gamov (Developer Advocate,…
Securing the Cloud with VPC Peering ft. Daniel LaMotte
Nov 13, 2019 • 31 min
With a virtual private cloud (VPC)—your own private network in the cloud that you can launch your own instances into—you can secure your cloud infrastructure and minimize the threat of potential attackers with VPC Peering, connecting VPCs together to…
ETL and Event Streaming Explained ft. Stewart Bryson
Nov 6, 2019 • 49 min
Stewart Bryson dispels misconceptions around what “streaming ETL” means, and explains why event streaming and event-driven architectures compel us to rethink old approaches.
The Pro’s Guide to Fully Managed Apache Kafka Services ft. Ricardo Ferreira
Nov 4, 2019 • 56 min
What’s the difference between a hosted solution and a managed solution? What about a partially managed solution versus a fully managed one? Ricardo Ferreira breaks down what a managed Kafka service truly means and why every developer should care.
Kafka Screams: The Scariest JIRAs and How To Survive Them ft. Anna McDonald
Oct 30, 2019 • 46 min
In today’s spooktacular episode of Streaming Audio, Anna McDonald discusses six of the scariest Apache Kafka® JIRAs.
Data Integration with Apache Kafka and Attunity
Oct 28, 2019 • 43 min
From change data capture (CDC) to business development, connecting Apache Kafka® environments, and customer success stories, Graham Hainbach discusses the possibilities of data integration with Kafka and Attunity using Replicate, Compose, and Manager. He…
Distributed Systems Engineering with Apache Kafka ft. Colin McCabe
Oct 23, 2019 • 45 min
Colin McCabe shares about what it’s like being a distributed systems engineer, how it differs from being a full stack engineer, and the importance of open source community involvement.
Apache Kafka on Kubernetes, Microsoft Azure, and ZooKeeper with Lena Hall
Oct 16, 2019 • 46 min
Lena Hall joins Tim Berglund in the studio to talk about Apache Kafka, the various ways to run Kafka on Microsoft Azure, Kafka on Kubernetes (K8s), and some exciting events that are happening in the Kafka world.
Improving Fairness Through Connection Throttling in the Cloud with KIP-402 ft. Gwen Shapira
Oct 9, 2019 • 48 min
Gwen Shapira outlines KIP-402, which aims to improve fairness in how Apache Kafka® processes connections and how network threads pick up requests and new data. She also shares about her team’s efforts to make user-facing Kafka improvements.
Data Modeling for Apache Kafka – Streams, Topics & More with Dani Traphagen
Oct 7, 2019 • 40 min
When it comes to data modeling, Dani Traphagen covers important business requirements, including the need for a domain model, practicing domain-driven design principles, and bounded context. She also discusses the attributes of data modeling: time,…
MySQL, Cassandra, BigQuery, and Streaming Analytics with Joy Gao
Oct 2, 2019 • 43 min
Joy Gao chats with Tim Berglund about all things related to streaming ETL—how it works, its benefits, and the implementation and operational challenges involved. She describes the streaming ETL architecture at WePay from MySQL/Cassandra to BigQuery using…
Scaling Apache Kafka with Todd Palino
Sep 25, 2019 • 46 min
Todd Palino talks about the start of Apache Kafka® at LinkedIn, what learning to use Kafka was like, how Kafka has changed, and what he and others in the community hope for in the future of Kafka.
Understand What’s Flying Above You with Kafka Streams ft. Neil Buesing
Sep 23, 2019 • 13 min
Neil Buesing (Director of Real-Time Data, Object Partners) discusses what a day in his life looks like and how Kafka Streams helps analyze flight data.
KIP-500: Apache Kafka Without ZooKeeper ft. Colin McCabe and Jason Gustafson
Sep 18, 2019 • 43 min
Colin McCabe and Jason Gustafson discuss the history of Kafka, the creation of KIP-500, and the implications of removing the ZooKeeper dependency and replacing it with a self-managed metadata quorum.
Should You Run Apache Kafka on Kubernetes? ft. Balthazar Rouberol
Sep 16, 2019 • 29 min
What are the maturing stages of Kubernetes adoption? How did Datadog experience these stages? Balthazar Rouberol explains what to think about before hopping on the Kubernetes hype train.
Jay Kreps on the Last 10 Years of Apache Kafka and Event Streaming
Sep 12, 2019 • 48 min
Jay Kreps talks about stream processing, his early coding days at LinkedIn, starting Confluent, the highs, the lows, and everything in between.
Connecting to Apache Kafka with Neo4j
Sep 9, 2019 • 54 min
Michael Hunger and David Allen discuss Neo4j basics and major features introduced in Neo4j 3.4.15. They’ll cover the history of the integration and features in relation to Apache Kafka®, change data capture (CDC), using Neo4j to put graph operations into…
Ask Confluent #15: Attack of the Zombie Controller
Sep 4, 2019 • 22 min
Gwen Shapira answers your questions on creating tables in nested JSON topics, how to balance ordering, latency and reliability, building event-based systems, and how to navigate the tricky endOffsets API. She also discusses the hardships of fencing zombie…
Helping Healthcare with Apache Kafka and KSQL ft. Ramesh Sringeri
Aug 28, 2019 • 52 min
Tim Berglund sits down with Ramesh Sringeri to discuss two Kafka use cases that Children’s Healthcare of Atlanta is working on: achieving near-real-time streams of data to support meaningful intracranial pressure prediction and better manage intracranial…
Contributing to Open Source with the Kafka Connect MongoDB Sink ft. Hans-Peter Grahsl
Aug 21, 2019 • 50 min
Tim Berglund invites Hans-Peter Grahsl to share about his involvement in the Apache Kafka® project, spanning from several conference contributions all the way to his open source community sink connector for MongoDB, now part of the official MongoDB Kafka…
Teaching Apache Kafka Online with Stéphane Maarek
Aug 19, 2019 • 42 min
Streaming Audio welcomes Stéphane Maarek to discuss how he got started with online teaching on Udemy, the challenges he faces as an instructor, his approach to answering hard questions, and the projects he is currently working on.
Connecting Apache Cassandra to Apache Kafka with Jeff Carpenter from DataStax
Aug 12, 2019 • 47 min
Whenever you see an Apache Cassandra™ in the wild, you probably also see an Apache Kafka®️. In this episode, Tim Berglund and Jeff Carpenter discuss the best way to get those systems talking using the DataStax Apache Kafka Connector and build a real-time…
Transparent GDPR Encryption with David Jacot
Aug 8, 2019 • 16 min
GDPR has challenged many digital enterprises to rethink how they deal with customer data. Viktor Gamov chats with David Jacot about a unique approach to inter-broker traffic encryption that he implemented for his customer’s sidecar pattern…
Confluent Platform 5.3 | What’s New in This Release
Jul 31, 2019 • 13 min
A quick summary of the most important features in Confluent Platform 5.3. We discuss improved Kubernetes and Ansible support, improvements to Confluent Control Center that give you better insight into the data in your cluster, and an important new set of…
How to Convert Python Batch Jobs into Kafka Streams Applications with Rishi Dhanaraj
Jul 29, 2019 • 31 min
Rishi Dhanaraj worked at Zenreach as an intern and took on a big pile of Python batch jobs, turning them into some really interesting Kafka Streams code. Listen in as he walks us through how he did it.
Ask Confluent #14: In Control of Kafka with Dan Norwood
Jul 22, 2019 • 23 min
Is Apache Kafka® actually a database? Can you install Confluent Control Center on Google Cloud Platform (GCP)? All this, plus some tips from Dan Norwood, the first user of Kafka Streams.
Kafka in Action with Dylan Scott
Jul 15, 2019 • 38 min
Author Dylan Scott tells all about his upcoming Manning title Kafka in Action, which shares how Apache Kafka® can be used by beginners who are just starting out their own projects. It also dispels common Hadoop-related myths, as Kafka has grown to become…
Change Data Capture with Debezium ft. Gunnar Morling
Jul 10, 2019 • 49 min
Gunnar Morling shares a little bit about what Debezium is, how it works, and which databases it supports. In addition to covering the various use cases and benefits from change data capture in the context of microservices, Gunnar walks us through the…
Distributed Systems Engineering with Apache Kafka ft. Jason Gustafson
Jul 2, 2019 • 45 min
Jason Gustafson dives into the challenges of working on distributed systems, particularly when it comes to a unique system like Apache Kafka. He also discusses ways in which Confluent is working with the community to solve active problems, and what it…
Apache Kafka 2.3 | What’s New in This Release + Updates and KIPs
Jun 25, 2019 • 13 min
Tim Berglund (Senior Director of Developer Experience, Confluent) explains the Kafka Improvement Proposals (KIPs) and what’s new in Apache Kafka 2.3.
Rolling Kafka Upgrades and Confluent Cloud ft. Gwen Shapira
Jun 25, 2019 • 42 min
If you operate a Kafka cluster, hopefully you upgrade your brokers occasionally. Well, what if you have to do it with hundreds or thousands of brokers, such as you’d have to do if you were running Confluent Cloud? Today, Gwen Shapira shares some of the…
Deploying Confluent Platform, from Zero to Hero ft. Mitch Henderson
Jun 18, 2019 • 32 min
Mitch Henderson explains how to plan and deploy your first application running on Confluent Platform. He covers critical factors to consider, how to make decisions about deployment solutions, how to go about setting up monitoring and testing, the marks of…
Why Kafka Connect? ft. Robin Moffatt
Jun 12, 2019 • 46 min
Tim and Robin cover the motivating factors for Kafka Connect, why people end up reinventing the wheel when they’re not aware of it, and Kafka Connect’s capabilities. They also talk about the importance of schemas in Apache Kafka® pipelines and programs,…
Schema Registry Made Simple by Confluent Cloud ft. Magesh Nandakumar
Jun 3, 2019 • 41 min
Tim Berglund and Magesh Nandakumar discuss why schemas matter for building systems on Apache Kafka®, and how Confluent Schema Registry helps with the problem. They talk about how Schema Registry works, how you can collaborate around schema change through…
Why is Stream Processing Hard? ft. Michael Drogalis
May 29, 2019 • 45 min
Tim Berglund and Michael Drogalis talk about all things stream processing: why it’s complex, how it has evolved, and what’s on the horizon to make it simpler.
Testing Kafka Streams Applications with Viktor Gamov
May 20, 2019 • 42 min
Chris Riccomini on the History of Apache Kafka and Stream Processing
May 16, 2019 • 50 min
Chris Riccomini tells us how Apache Kafka® and the stream processing framework Samza came about, and also what he’s doing these days at WePay—building systems that use Kafka as a primary datastore.
Ask Confluent #13: Machine Learning with Kai Waehner
May 8, 2019 • 33 min
Gwen and Kai chat about machine learning architectures, and whether software engineers and data scientists can learn to get along.
Load-Balanced Apache Kafka: Derivco’s Globally Distributed Gaming Business
May 2, 2019 • 38 min
Derivco must maintain central data infrastructure and application clusters in geographically diverse locations due to the heavily regulated nature of their business. In this episode, we talk about the challenges associated with that unusual but ingenious…
Diving into Exactly Once Semantics with Guozhang Wang
Apr 22, 2019 • 47 min
Kafka Streams Engineer Guozhang Wang walks through the implementation of transactional messaging in Kafka in some detail, including the idempotent producer API, the transaction coordinator responsible for managing the transaction log and consumer…
Ask Confluent #12: In Search of the Lost Offsets
Apr 17, 2019 • 22 min
Stanislav Kozlovski joins us to discuss common pitfalls when using Kafka consumers and a new KIP that promises to make consumer restarts much smoother.
Ben Stopford on Microservices and Event Streaming
Apr 8, 2019 • 58 min
In this podcast, Ben Stopford explores the event-driven paradigm and how it relates to the microservice architectures we build today. Ben dives deep into coupling, evolution and challenges of our increasingly data-oriented culture.
Magnus Edenhill on librdkafka 1.0
Apr 3, 2019 • 46 min
librdkafka has finally reached 1.0! Several important new features include the idempotent producer, sparse broker connections, support for the vaunted KIP-62 and a complete makeover for the C#/.NET client.
Ask Confluent #11: More Services, More Metrics, More Fun
Mar 26, 2019 • 14 min
Do metrics for detecting clients from old versions actually exist? Or is Gwen making features up? This and more useful advice is coming up on today’s episode of Ask Confluent.
It’s Time for Streaming to Have a Maturity Model ft. Nick Dearden
Mar 18, 2019 • 36 min
Nick Dearden explains the five stages of streaming maturity, from the first streaming project you ever build all the way to a state where an entire organization is transformed to think in terms of real-time, event-driven systems.
Containerized Kafka On Kubernetes with Viktor Gamov
Mar 11, 2019 • 41 min
Tim Berglund and Viktor Gamov address some of the challenges and pitfalls of managing Kafka on Kubernetes at scale. They also share lessons learned from the development of the Confluent Operator for Kubernetes.
Catch Your Bus with KSQL: A Stream Processing Recipe by Leslie Kurt
Mar 4, 2019 • 19 min
Use KSQL to calculate the difference between the expected arrival time and real-time updates of a bus as it executes its route. Leslie Kurt walks you through fundamental concepts, persistent queries, Confluent MQTT Proxy and other use cases.
KTable Update Suppression (and a Bunch About KTables) ft. John Roesler
Feb 27, 2019 • 45 min
Apache Kafka 2.1 featured an interesting change to the table API—commonly known to the world as KIP-328—that gives you better control over how updates to tables are emitted into destination topics. Join John Roesler for a clear explanation of it.
Splitting and Routing Events with KSQL ft. Pascal Vantrepote
Feb 25, 2019 • 20 min
Tim Berglund chats with Systems Engineer Pascal Vantrepote about the KSQL recipe he created based on a real-life customer use case in the financial services industry. They also discuss the advantages of KSQL, including no Java coding required.
Ask Confluent #10: Cooperative Rebalances for Kafka Connect ft. Konstantine Karantasis
Feb 20, 2019 • 21 min
Gwen Shapira speaks with Konstantine Karantasis, software engineer at Confluent, about the latest improvements to Kafka Connect and how to run the Confluent CLI on Windows.
The Future of Serverless and Streaming with Neil Avery
Feb 14, 2019 • 41 min
Neil Avery explores the intersection between FaaS and event streaming applications, the pros and cons of FaaS, important considerations when building streaming applications and five rules that will help you understand how FaaS fits with the event…
Using Terraform and Confluent Cloud with Ricardo Ferreira
Jan 23, 2019 • 28 min
Tim Berglund hosts Developer Advocate Ricardo Ferreira to discuss the concept of infrastructure as code, as well as the differences between Terraform, Ansible, Puppet and Chef.
Ask Confluent #9: With and Without ZooKeeper
Jan 8, 2019 • 15 min
Gwen asks: What happens when garbage collection causes Kafka to pause? And how do we run a Schema Registry cluster? We’ll find out in this episode of Ask Confluent.
Ask Confluent #8: Guozhang Wang on Kafka Streams Standby Tasks
Dec 18, 2018 • 22 min
Gwen is joined in studio by special guest Guozhang Wang, Kafka Streams pioneer and engineering lead at Confluent. He’ll talk to us about standby tasks and how one deserializes message headers.
Ask Confluent #7: Kafka Consumers and Streams Failover Explained ft. Matthias Sax
Dec 3, 2018 • 23 min
Gwen is joined in studio by special guest Matthias J. Sax, a software engineer at Confluent. He’ll talk to us about Kafka consumers and Kafka Streams failover.
Ask Confluent #6: Kafka, Partitions, and Exactly Once ft. Jason Gustafson
Nov 5, 2018 • 22 min
Gwen is joined in studio by special guest Jason Gustafson, a Kafka PMC member and engineer at Confluent. He’ll talk to us about the big questions on Kafka architecture—number of partitions and exactly once.
Kafka Summit SF 2018 Panel | Microsoft, Slack, Confluent, University of Cambridge
Oct 18, 2018 • 34 min
Neha Narkhede leads a panel discussion at Kafka Summit SF 2018 with Kevin Scott (CTO, Microsoft), Julia Grace (Head of Infrastructure Engineering, Slack), Martin Kleppmann (Researcher, U. of Cambridge), Jay Kreps (co-founder and CEO, Confluent), and Neha…
Kafka Streams in Action with Bill Bejeck
Sep 27, 2018 • 49 min
Tim Berglund interviews Bill Bejeck about the Kafka Streams API and his new book, “Kafka Streams in Action.”
Joins in KSQL 5.0 with Hojjat Jafarpour
Sep 20, 2018 • 29 min
KSQL 5.0 now supports stream-stream, stream-table and table-table joins. Tim interviews Hojjat Jafarpour about all three join types.
Ask Confluent #5: Kafka, KSQL and Viktor Gamov
Sep 10, 2018 • 31 min
Gwen is joined by co-host Tim Berglund and special guest, Viktor Gamov (Developer Advocate, Confluent), who specializes in Kafka, KSQL and Kubernetes.
KSQL Use Cases with Nick Dearden
Sep 6, 2018 • 32 min
Tim Berglund discusses how people actually use KSQL with Nick Dearden, Stream Processing Expert at Confluent.
Nested Data in KSQL with Hojjat Jafarpour
Aug 29, 2018 • 13 min
Tim Berglund discusses nested data in KSQL with Hojjat Jafarpour, a software engineer on the KSQL team at Confluent.
UDFs and UDAFs in KSQL 5.0 with Hojjat Jafarpour
Aug 24, 2018 • 18 min
Tim Berglund discusses UDFs and UDAFs in KSQL 5.0 with Hojjat Jafarpour, a software engineer on the KSQL team at Confluent.
Ask Confluent #4: The GitHub Edition
Aug 16, 2018 • 13 min
Gwen answers questions from YouTube and walks through how to use GitHub Issues to request features.
Deep Dive into KSQL with Hojjat Jafarpour
Aug 13, 2018 • 33 min
Tim Berglund takes a deep dive into KSQL with Hojjat Jafarpour, a software engineer on the KSQL team at Confluent.
Ask Confluent #3: Kafka Upgrades, Cloud APIs and Data Durability
Jul 20, 2018 • 22 min
Tim Berglund and Gwen Shapira answer your questions and have a discussion with Koelli Mungee (Customer Operations Lead, Confluent).
Ask Confluent #2: Consumers, Culture and Support
Jul 2, 2018 • 24 min
Gwen Shapira answers your questions and interviews Sam Hecht (Head of Support, Confluent).
Ask Confluent #1: Kubernetes, Confluent Operator, Kafka and KSQL
Jun 20, 2018 • 22 min
Tim Berglund and Gwen Shapira discuss Kubernetes, Confluent Operator, Kafka, KSQL and more.