[{"content":"In Hindu tradition, Chitragupta is the deity who maintains a complete record of every being\u0026rsquo;s actions. The divine accountant. Fitting name for a system that tracks exactly who used what and how much it cost.\nThis is a ground-up rewrite of the CCloud Chargeback Helper, rebuilt with a full plugin architecture so I can add new ecosystems without touching the core engine every time. It correlates Confluent Cloud Billing, Metrics, and Core Objects APIs to produce hourly, identity-level cost attribution down to service accounts, users, API keys, and individual resources.\nThe v2 is essentially an entirely new system with a lot more features and a much better performance profile.\nDocumentation Source Code ","date":"26 March 2026","externalUrl":null,"permalink":"/projects/chitragupta/","section":"Projects","summary":"In Hindu tradition, Chitragupta is the deity who maintains a complete record of every being’s actions. The divine accountant. Fitting name for a system that tracks exactly who used what and how much it cost.\n","title":"Chitragupta","type":"projects"},{"content":"When multiple teams share a Confluent Cloud environment but a single CoE team manages everything, the billing question comes up fast: who\u0026rsquo;s actually using what, and how much does it cost?\nThis tool correlates Confluent Cloud Billing, Metrics, and Core Objects APIs to produce hourly, identity-level chargeback and showback datasets. 
Cost attribution goes all the way down to service accounts, users, API keys, and individual resources.\nThis project has since been superseded by Chitragupta, a complete rewrite with a plugin architecture and support for additional ecosystems beyond Confluent Cloud.\nSource Code ","date":"1 January 2024","externalUrl":null,"permalink":"/projects/ccloud-chargeback-helper/","section":"Projects","summary":"When multiple teams share a Confluent Cloud environment but a single CoE team manages everything, the billing question comes up fast: who’s actually using what, and how much does it cost?\n","title":"CCloud Chargeback Helper","type":"projects"},{"content":"Most monitoring setups for Kafka require lengthy parsing configurations to get JMX metrics into your telemetry system. This project standardizes that pipeline across Prometheus/Grafana, New Relic, Elastic/Kibana, Datadog, and OpenTelemetry.\nI\u0026rsquo;ve been one of the contributors on this project at Confluent. We built and maintained these stacks over several years, and they became the canonical observability quickstart for Confluent Platform and Cloud.\nSource Code ","date":"1 May 2022","externalUrl":null,"permalink":"/projects/jmx-monitoring-stacks/","section":"Projects","summary":"Most monitoring setups for Kafka require lengthy parsing configurations to get JMX metrics into your telemetry system. This project standardizes that pipeline across Prometheus/Grafana, New Relic, Elastic/Kibana, Datadog, and OpenTelemetry.\n","title":"JMX Monitoring Stacks","type":"projects"},{"content":"Starting a Kafka cluster is easy. Sizing it correctly is where things get interesting. Most teams start with 3 brokers because that\u0026rsquo;s the minimum, and then hope it\u0026rsquo;s enough. 
This tool replaces the hope with math.\nIt takes your expected throughput, retention, replication factor, and consumer count, then calculates disk, network, and partition requirements so you know your choke points before you hit them.\nSource Code ","date":"20 September 2019","externalUrl":null,"permalink":"/projects/kafka-broker-sizing/","section":"Projects","summary":"Starting a Kafka cluster is easy. Sizing it correctly is where things get interesting. Most teams start with 3 brokers because that’s the minimum, and then hope it’s enough. This tool replaces the hope with math.\n","title":"Kafka Broker Sizing Calculator","type":"projects"},{"content":"","date":"8 April 2026","externalUrl":null,"permalink":"/blog/","section":"Blog","summary":"","title":"Blog","type":"blog"},{"content":"","date":"8 April 2026","externalUrl":null,"permalink":"/categories/","section":"Categories","summary":"","title":"Categories","type":"categories"},{"content":"","date":"8 April 2026","externalUrl":null,"permalink":"/tags/chitragupta/","section":"Tags","summary":"","title":"Chitragupta","type":"tags"},{"content":"","date":"8 April 2026","externalUrl":null,"permalink":"/tags/cost-attribution/","section":"Tags","summary":"","title":"Cost-Attribution","type":"tags"},{"content":"","date":"8 April 2026","externalUrl":null,"permalink":"/categories/finops/","section":"Categories","summary":"","title":"Finops","type":"categories"},{"content":"","date":"8 April 2026","externalUrl":null,"permalink":"/tags/finops/","section":"Tags","summary":"","title":"Finops","type":"tags"},{"content":"","date":"8 April 2026","externalUrl":null,"permalink":"/","section":"Home","summary":"","title":"Home","type":"page"},{"content":"","date":"8 April 2026","externalUrl":null,"permalink":"/tags/kafka/","section":"Tags","summary":"","title":"Kafka","type":"tags"},{"content":"","date":"8 April 
2026","externalUrl":null,"permalink":"/tags/prometheus/","section":"Tags","summary":"","title":"Prometheus","type":"tags"},{"content":"","date":"8 April 2026","externalUrl":null,"permalink":"/tags/","section":"Tags","summary":"","title":"Tags","type":"tags"},{"content":"\u0026ldquo;Which topic is costing us?\u0026rdquo; is the question every platform team gets, and nobody can answer it cleanly. SaaS providers bill you at the cluster level. Your finance team wants a line item per team. Your product managers want to know if their pipeline is the expensive one. You have a Kafka cluster, a bill, and no bridge between them.\nThe usual answer is \u0026ldquo;we\u0026rsquo;ll figure it out later.\u0026rdquo; Later never comes.\nChitragupta has had cluster-level chargeback since day one. But cluster-level chargeback only helps when identity-level costs are all you need. It doesn\u0026rsquo;t tell you which topic is the most expensive. Topic attribution in Chitragupta v2.1.0 tries to close that gap by distributing the cluster-specific costs to individual topics.\nThis post is about what it actually takes to distribute a Kafka bill to individual topics without losing a cent.\nThe metrics that matter # You can\u0026rsquo;t just split a cluster bill evenly across topics. A topic carrying 80% of the bytes should carry 80% of the cost, not 1/N. The moment one team sees the bill, they notice. 
Usage-based attribution means metrics, and Confluent Cloud exposes three relevant ones through its Prometheus scrape endpoint:\n_DEFAULT_METRIC_NAMES = { \u0026#34;topic_bytes_in\u0026#34;: \u0026#34;confluent_kafka_server_received_bytes\u0026#34;, \u0026#34;topic_bytes_out\u0026#34;: \u0026#34;confluent_kafka_server_sent_bytes\u0026#34;, \u0026#34;topic_retained_bytes\u0026#34;: \u0026#34;confluent_kafka_server_retained_bytes\u0026#34;, } _QUERY_TEMPLATES = { \u0026#34;topic_bytes_in\u0026#34;: \u0026#34;sum by (kafka_id, topic) ({metric_name}{})\u0026#34;, \u0026#34;topic_bytes_out\u0026#34;: \u0026#34;sum by (kafka_id, topic) ({metric_name}{})\u0026#34;, \u0026#34;topic_retained_bytes\u0026#34;: \u0026#34;sum by (kafka_id, topic) ({metric_name}{})\u0026#34;, } Each Confluent product category maps to one or more of these:\nNetworking (write): bytes_in Networking (read): bytes_out Storage: retained_bytes CKU (compute): bytes_in + bytes_out combined Partition count / base infra: no per-topic signal, always even-split All three queries share the same sum by (kafka_id, topic) shape. That\u0026rsquo;s a label aggregation: it collapses every label except kafka_id and topic into one value per topic per scrape time. It does not reduce across time. Each query runs in range mode, so what comes back is a series of points per topic across the billing window.\nThe time-axis reduction happens in application code after the fetch, and it\u0026rsquo;s different per metric. bytes_in and bytes_out are delta gauges: each scrape is the value for the reporting interval since the previous one, so summing every scrape across the window gives you the total bytes for the window. retained_bytes is a point-in-time gauge: each scrape is the current retained bytes, so the system takes the max across the window. Mixing the two reductions produces garbage attribution.\nThat said, the max reduction for retained_bytes is itself a time-slice approximation, not a clean answer. 
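In code, those two reductions differ only in the final aggregation: sum for the delta gauges, max for the point-in-time gauge. A minimal sketch with invented per-topic series (the real values come from the range queries above):

```python
# Hypothetical per-topic series across one billing window; each list entry
# is one scrape. Real data comes from the Prometheus range queries.
bytes_in_series = {"orders": [100, 250, 0], "clicks": [10, 5, 20]}     # delta gauge
retained_series = {"orders": [900, 900, 300], "clicks": [50, 80, 80]}  # point-in-time gauge

# Delta gauges report traffic since the previous scrape: sum across the window.
bytes_in_total = {topic: sum(points) for topic, points in bytes_in_series.items()}

# Point-in-time gauges report the current level: take the max across the window.
retained_peak = {topic: max(points) for topic, points in retained_series.items()}
```

Swapping the two (summing a level gauge, or max-ing a delta gauge) is exactly the garbage-attribution failure mode described above.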
A topic that holds 100 GB for one hour and then drops to zero for the next twenty-three looks identical to a topic that holds 100 GB steadily for a full day, even though the second one accrued twenty-four times more storage cost. It over-attributes spiky topics and under-attributes sustained ones. Whether it actually matters depends on how the upstream provider meters storage billing (peak retained bytes vs. time-integrated GB-hours). I have thoughts on switching this to a time-integrated reduction once I\u0026rsquo;ve validated them against real bills. For now, simplicity wins. Consider this a known limitation.\nThe metric names and the product-type-to-method mapping are both defaults, not hardcoded. metric_name_overrides lets you point at a different exporter naming convention. cost_mapping_overrides lets you swap the method per product type, or disable one entirely. If you need an extension point that isn\u0026rsquo;t there, open an issue. Happy to wire it in.\nOnce it\u0026rsquo;s all wired up, the daily breakdown by product category looks like this:\nEach color is a product category. Each bar is a day. Every dollar in the bars on the right is now traceable down to the topic that generated it.\nLook at the left half of the chart. I hadn\u0026rsquo;t wired up the Prometheus scrape target in this environment until early February. No metrics source means nothing for the topic attribution overlay to query, so the metric-driven product types produced no rows at all. The one category that survives the gap is KAFKA_BASE: it\u0026rsquo;s bound to even-split by definition, no metrics needed, because a cluster-fixed cost has no per-topic signal to begin with. Once the scrape went online, the rest of the bill became traceable.\nThat\u0026rsquo;s not the only failure mode. 
We\u0026rsquo;ll come back to what happens when Prometheus is reachable but a topic has no data, and what happens when Prometheus stops answering at all.\nThe math # For each topic, divide its bytes by total bytes, multiply by the cluster cost. A topic at 40% of bytes_in gets 40% of the networking-write cost. You multiply, you assign, you move on. Except when you run into the remainder problem. I ran into this early on in the chargeback implementation and reused the same logic to split the cost here as well.\n$10.00 / 3 topics is $3.333... repeating. Python\u0026rsquo;s Decimal at four places of precision rounds each topic to $3.3333. Three of those sum to $9.9999. You just lost a hundredth of a cent. Do that across 50 clusters every day and finance notices.\nThe fix was simply to track the gap and redistribute it:\n_CENT = Decimal(\u0026#34;0.0001\u0026#34;) def _distribute_remainder(amounts, diff): \u0026#34;\u0026#34;\u0026#34;Distribute rounding remainder one $0.0001 step at a time, round-robin.\u0026#34;\u0026#34;\u0026#34; if diff == 0: return amounts step = _CENT if diff \u0026gt; 0 else -_CENT idx = 0 for _ in range(len(amounts) * 2): amounts[idx] += step diff -= step idx = (idx + 1) % len(amounts) if diff == 0: break return amounts After rounding each amount, compute the difference from the expected total, then hand out the residual in $0.0001 steps until it\u0026rsquo;s gone. The first few topics get a fraction more, but at $0.0001 per step the bias is too small to argue about. The sum always matches the input. This is the \u0026ldquo;no money silently lost\u0026rdquo; invariant. Every test in the topic attribution suite asserts sum(amounts) == cluster_cost.\nAnd here\u0026rsquo;s the payoff:\nEach rectangle is a topic. Each rectangle\u0026rsquo;s area is its share of the bill. The handful of large topics on the left are exactly the ones the platform team wants to have a conversation about. 
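End to end, the split is ratio, quantize, redistribute. Here is a self-contained sketch: `_distribute_remainder` is a rewrite of the helper above, and the `split_cost` wrapper is my own framing for illustration, not Chitragupta's actual API:

```python
from decimal import Decimal

_CENT = Decimal("0.0001")  # smallest unit the system tracks: $0.0001

def _distribute_remainder(amounts, diff):
    """Hand out the rounding residue one $0.0001 step at a time, round-robin."""
    if diff == 0:
        return amounts
    step = _CENT if diff > 0 else -_CENT
    idx = 0
    for _ in range(len(amounts) * 2):
        amounts[idx] += step
        diff -= step
        idx = (idx + 1) % len(amounts)
        if diff == 0:
            break
    return amounts

def split_cost(total, weights):
    """Split `total` proportionally to `weights`, preserving the exact sum."""
    total = Decimal(total)
    weight_sum = Decimal(sum(weights))
    amounts = [(total * Decimal(w) / weight_sum).quantize(_CENT) for w in weights]
    # Redistribute whatever rounding left on the table (or took off it).
    return _distribute_remainder(amounts, total - sum(amounts))
```

`split_cost("10.00", [1, 1, 1])` yields three amounts summing exactly to $10.0000, with the extra $0.0001 landing on the first topic.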
The hundreds of small ones on the right are noise that previously got bundled into \u0026ldquo;platform overhead.\u0026rdquo;\nThe edge cases # This is the part that takes the most code.\nA topic has zero metric data. Prometheus answered the query but returned nothing for that topic. Maybe it had no traffic during the window, maybe its metrics never made it through. The fallback chain catches this: if the usage-ratio model returns nothing, a missing-metrics fallback kicks in and either even-splits across the zero-usage topics or skips them entirely, based on config:\ntopic_attribution: missing_metrics_behavior: even_split # or skip The choice the system makes for each row is recorded as a first-class column in the attribution data:\nNotice the Method column. bytes_ratio is the usage-ratio path firing on bytes_in/bytes_out data. retained_bytes_ratio is the same path firing on the storage gauge. even_split shows up in two distinct situations. For KAFKA_BASE and KAFKA_PARTITION, it\u0026rsquo;s the only model: those are cluster-fixed costs with no per-topic metric, so they get split equally by definition. For the metric-driven types (KAFKA_NETWORK_*, KAFKA_STORAGE, KAFKA_NUM_CKU), even_split only shows up as the fallback when the underlying metric was missing for that period. Either way, every row carries its own attribution method, so when finance asks \u0026ldquo;why did this topic get this number?\u0026rdquo; the answer is one column away.\nA topic appeared mid-window. It\u0026rsquo;s in the Prometheus metrics but not in the resources table at window start, or the other way around. Chitragupta takes the union of both sources. Topics from the resources table during [b_start, b_end) get included. Topics that showed up in any metric result get included too. 
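That union step is literally a set union over the two sources. A sketch with invented topic names:

```python
# Topics known to the resource catalog during the billing window,
# versus topics that appeared in any Prometheus metric result (names invented).
window_topics = {"orders", "clicks"}
metric_topics = {"clicks", "experiments.tmp"}

# Attribute over the union so neither source can silently drop a topic.
all_topics = window_topics | metric_topics
```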
Union catches both directions: topics that existed but had no traffic (and fall through to even-split), and topics that showed up in metrics but weren\u0026rsquo;t yet tracked in the resource catalog.\nA topic was deleted before the window. The resource query is point-in-time, scoped to the billing window. Deleted topics naturally drop out.\nThe sentinel pattern: when Prometheus is down # You\u0026rsquo;ve computed the chargeback, you\u0026rsquo;re ready to distribute costs to topics, and Prometheus throws a 503. Now what?\nThis is different from the missing-metrics case above. When Prometheus answered the query and returned nothing, the absence of data was a signal: those topics genuinely had no traffic, and even-splitting is honest. When Prometheus doesn\u0026rsquo;t answer at all, you have no signal. There\u0026rsquo;s nothing to fall back to.\nSo you have three bad options:\nDrop the billing line. You lose the cost entirely. Never acceptable. Fall back to even-split anyway. You paper over the outage with a guess. Your attribution numbers look clean but they\u0026rsquo;re invented, and nothing in the data tells you that. Retry forever. Your pipeline stalls on a permanent outage. Chitragupta does none of these. The pattern I went with is: retry up to a configurable limit (default 3, range 1 to 10), and if Prometheus is still unreachable after the last attempt, write a sentinel row:\nTopicAttributionRow( topic_name=\u0026#34;__UNATTRIBUTED__\u0026#34;, attribution_method=\u0026#34;ATTRIBUTION_FAILED\u0026#34;, amount=Decimal(str(line.total_cost)), # full cost preserved metadata={ \u0026#34;error\u0026#34;: \u0026#34;Prometheus metrics permanently unavailable\u0026#34;, \u0026#34;cluster_id\u0026#34;: cluster_id, \u0026#34;topic_attribution_attempts\u0026#34;: attempts, }, ) The full cluster cost sits in a single row with a synthetic topic name and an explicit attribution method. 
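Downstream, that sentinel is just another row to filter on. A sketch with invented rows, using plain dicts in place of the real TopicAttributionRow objects:

```python
from decimal import Decimal

# Invented attribution rows for one cluster and one product type.
rows = [
    {"topic_name": "orders", "attribution_method": "bytes_ratio",
     "amount": Decimal("6.0000")},
    {"topic_name": "clicks", "attribution_method": "bytes_ratio",
     "amount": Decimal("4.0000")},
    {"topic_name": "__UNATTRIBUTED__", "attribution_method": "ATTRIBUTION_FAILED",
     "amount": Decimal("2.5000")},  # cost from a window Prometheus never answered
]

# Per-topic view: exclude the sentinel.
per_topic = [r for r in rows if r["attribution_method"] != "ATTRIBUTION_FAILED"]

# Reconciliation view: include everything so the total matches the cluster bill.
cluster_total = sum(r["amount"] for r in rows)
```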
When you query topic attribution later, you can filter out ATTRIBUTION_FAILED rows for per-topic views, or include them to reconcile against the cluster bill.\nNo money is silently lost. Nothing is silently faked. The outage becomes a first-class row in the data, with enough metadata to find the cluster, count the attempts, and page someone.\nThe retry counter lives on the billing line itself. Every run that can\u0026rsquo;t reach Prometheus increments it. Once every line for the date is at the limit, the whole date resolves to sentinel rows and the pipeline moves on. Until then, the date stays pending and gets retried on the next run. Partial failures don\u0026rsquo;t mark the date as calculated, so a bad cluster can\u0026rsquo;t poison a good one.\nZombie topics: the bonus question # Once you have per-topic cost, a new question becomes answerable: which topics are you paying to store for no reason? These are the topics that should probably have been deleted months ago. Someone ran an experiment, abandoned it, left the topic behind, and you\u0026rsquo;ve been quietly paying to retain its data ever since. Nobody notices because the storage cost for any one topic is tiny. Add up 400 of them and it isn\u0026rsquo;t.\nChitragupta\u0026rsquo;s analytics tab calls these \u0026ldquo;Zombie Topic Candidates\u0026rdquo; and the logic is about as simple as it gets:\nconst ZOMBIE_THRESHOLD = 0.01; topics .filter((t) =\u0026gt; t.kafka_storage \u0026gt; 0 \u0026amp;\u0026amp; t.total_network \u0026lt; ZOMBIE_THRESHOLD) .sort((a, b) =\u0026gt; b.kafka_storage - a.kafka_storage); Non-zero storage cost (you\u0026rsquo;re paying to keep it), near-zero network cost (nobody is touching it). Sort by storage descending, and the most expensive zombies float to the top.\nFive topics on this page alone, $133 of storage cost between them, zero traffic. None of them should still exist.\nThis is twenty lines of frontend code, not a backend feature. 
It works because the topic attribution table already has per-topic per-product-type cost rows. Ask a new question with a filter and a sort, and the answer falls out. I\u0026rsquo;m a big proponent of this pattern: build the data layer right, and new features come comparatively cheap. More on how that data layer is structured in the next post.\nBy the way, this is not to say that every topic that shows up on this list is actually a zombie. Some of them might be legitimate topics that are just not being used at the moment. But it\u0026rsquo;s a good starting point to identify topics that are worth investigating further. Don\u0026rsquo;t go in and burn the place down just because a dashboard told you to.\nWhat\u0026rsquo;s next # Topic-level attribution ships in Chitragupta v2.1.0. The minimum config to turn it on, assuming you already have CCloud chargebacks running:\nplugin_settings: metrics: type: prometheus url: \u0026#34;${PROMETHEUS_URL}\u0026#34; topic_attribution: enabled: true missing_metrics_behavior: even_split One new block. The metrics block already existed for chargebacks and is mandatory for topic attribution to work. It points at any Prometheus that scrapes the Confluent Cloud Metrics API. topic_attribution is the new block that flips the overlay on. I default to even_split for missing_metrics_behavior because I\u0026rsquo;d rather fill a gap with a defensible guess than stare at a blank cell, but skip is there too if you\u0026rsquo;d rather see the holes. Pick your poison. Full field reference is in docs/configuration/ccloud-reference.md.\nThe design decisions in this post aren\u0026rsquo;t the glamorous part of the feature. \u0026ldquo;I added a retry counter\u0026rdquo; doesn\u0026rsquo;t show well in release notes. But the sentinel pattern, the remainder distribution, and the union-of-topics strategy are what separate a demo from something you can actually trust your finance team to look at.\nTry it. Tell me what breaks. 
Feel free to reach out.\nNext in the series: Star Schema for Operational Data, on how the storage layer holds all of this together and why topic attribution gets its own dimension and fact tables instead of sharing the chargeback schema.\n","date":"8 April 2026","externalUrl":null,"permalink":"/blog/topic-level-kafka-cost-attribution/","section":"Blog","summary":"\u0026ldquo;Which topic is costing us?\u0026rdquo; is the question every platform team gets, and nobody can answer it cleanly. SaaS providers bill you at the cluster level. Your finance team wants a line item per team. Your product managers want to know if their pipeline is the expensive one. You have a Kafka cluster, a bill, and no bridge between them.\n","title":"Topic-Level Kafka Cost Attribution with Prometheus","type":"blog"},{"content":"","date":"4 April 2026","externalUrl":null,"permalink":"/categories/general/","section":"Categories","summary":"","title":"General","type":"categories"},{"content":"I\u0026rsquo;ve spent 17 years building streaming platforms, integration systems, and open-source tooling. A lot of what I\u0026rsquo;ve learned lives in conference slides, GitHub repos, and conversations that are hard to find again. This blog is where I put it all in one place.\nWhat to expect # Kafka internals and production patterns. Not the getting-started guide. The stuff that actually matters when you\u0026rsquo;re running clusters at scale. Observability. What signals to pay attention to when your Kafka cluster looks fine on the surface but isn\u0026rsquo;t. FinOps and chargebacks. Who used what, how much it cost, and how to make that visible across teams. Tooling. I build tools (Chitragupta, Kafka Shepherd, broker sizing calculators) because the existing options didn\u0026rsquo;t solve the problem well enough. I\u0026rsquo;ll write about why they exist and how they work. Lessons learned. Mostly the hard way. Why now? 
# I\u0026rsquo;ve given talks at Current and Conf42, built open-source tools that people actually use, and I keep running into the same questions. Writing things down once makes more sense than explaining them over and over.\nThere\u0026rsquo;s also just the urge to share what I learn. That\u0026rsquo;s been a constant through my career, and a blog is the right format for it.\nFirst real post is on the way.\n","date":"4 April 2026","externalUrl":null,"permalink":"/blog/hello-world/","section":"Blog","summary":"I’ve spent 17 years building streaming platforms, integration systems, and open-source tooling. A lot of what I’ve learned lives in conference slides, GitHub repos, and conversations that are hard to find again. This blog is where I put it all in one place.\n","title":"Hello World","type":"blog"},{"content":"","date":"4 April 2026","externalUrl":null,"permalink":"/tags/meta/","section":"Tags","summary":"","title":"Meta","type":"tags"},{"content":"","date":"26 March 2026","externalUrl":null,"permalink":"/tags/observability/","section":"Tags","summary":"","title":"Observability","type":"tags"},{"content":"","date":"26 March 2026","externalUrl":null,"permalink":"/tags/open-source/","section":"Tags","summary":"","title":"Open-Source","type":"tags"},{"content":"","date":"26 March 2026","externalUrl":null,"permalink":"/projects/","section":"Projects","summary":"","title":"Projects","type":"projects"},{"content":"","date":"29 October 2025","externalUrl":null,"permalink":"/tags/conference/","section":"Tags","summary":"","title":"Conference","type":"tags"},{"content":"Presented at Confluent Current 2025 in New Orleans.\nCloud billing for Kafka is opaque by default. You get one invoice, shared across every team, with no clear way to tell who\u0026rsquo;s driving costs. 
This talk walks through how to go from that to transparent, identity-level cost attribution, covering the tooling, the data model, and the organizational patterns that make chargebacks and showbacks actually work.\nWatch the recording\n","date":"29 October 2025","externalUrl":null,"permalink":"/talks/current-2025-chargebacks/","section":"Talks \u0026 Publications","summary":"Presented at Confluent Current 2025 in New Orleans.\nCloud billing for Kafka is opaque by default. You get one invoice, shared across every team, with no clear way to tell who’s driving costs. This talk walks through how to go from that to transparent, identity-level cost attribution, covering the tooling, the data model, and the organizational patterns that make chargebacks and showbacks actually work.\n","title":"From 'Where's My Money?' to 'Here's Your Bill': Demystifying Kafka Chargebacks and Showbacks","type":"talks"},{"content":"","date":"29 October 2025","externalUrl":null,"permalink":"/categories/talks/","section":"Categories","summary":"","title":"Talks","type":"categories"},{"content":"","date":"29 October 2025","externalUrl":null,"permalink":"/talks/","section":"Talks \u0026 Publications","summary":"","title":"Talks \u0026 Publications","type":"talks"},{"content":"Presented at Conf42 Observability 2025.\nKafka doesn\u0026rsquo;t fail cleanly. It stalls, lags, and misfires beneath the surface. This talk cuts through the noise to show what signals actually matter, how to catch issues early, and how to make Kafka observable without drowning in metrics.\n","date":"5 June 2025","externalUrl":null,"permalink":"/talks/conf42-observability-2025/","section":"Talks \u0026 Publications","summary":"Presented at Conf42 Observability 2025.\nKafka doesn’t fail cleanly. It stalls, lags, and misfires beneath the surface. 
This talk cuts through the noise to show what signals actually matter, how to catch issues early, and how to make Kafka observable without drowning in metrics.\n","title":"Observability-First Kafka: Engineering Visibility at Scale","type":"talks"},{"content":"Featured on Confluent\u0026rsquo;s Life Is But A Stream podcast, February 2025.\nThe batch vs. streaming question comes up constantly with teams evaluating Kafka. This episode breaks down the architectural trade-offs, when each approach makes sense, and what the transition from batch to real-time actually looks like in practice.\n","date":"18 February 2025","externalUrl":null,"permalink":"/talks/confluent-podcast-batch-vs-streaming/","section":"Talks \u0026 Publications","summary":"Featured on Confluent’s Life Is But A Stream podcast, February 2025.\nThe batch vs. streaming question comes up constantly with teams evaluating Kafka. This episode breaks down the architectural trade-offs, when each approach makes sense, and what the transition from batch to real-time actually looks like in practice.\n","title":"Batch Processing vs. 
Real-Time Stream Processing with Apache Flink","type":"talks"},{"content":"","date":"18 February 2025","externalUrl":null,"permalink":"/tags/flink/","section":"Tags","summary":"","title":"Flink","type":"tags"},{"content":"","date":"18 February 2025","externalUrl":null,"permalink":"/tags/podcast/","section":"Tags","summary":"","title":"Podcast","type":"tags"},{"content":"","date":"18 February 2025","externalUrl":null,"permalink":"/tags/streaming/","section":"Tags","summary":"","title":"Streaming","type":"tags"},{"content":"","date":"1 May 2022","externalUrl":null,"permalink":"/tags/grafana/","section":"Tags","summary":"","title":"Grafana","type":"tags"},{"content":"","date":"29 November 2021","externalUrl":null,"permalink":"/tags/book/","section":"Tags","summary":"","title":"Book","type":"tags"},{"content":"","date":"29 November 2021","externalUrl":null,"permalink":"/tags/confluent/","section":"Tags","summary":"","title":"Confluent","type":"tags"},{"content":"Co-author and section reviewer.\nA comprehensive adoption guide for technical executives, platform owners, and project managers evaluating or rolling out Confluent. Covers architecture patterns, operational considerations, and the organizational side of moving to event-driven platforms.\nRead the guide (PDF) Blog announcement ","date":"29 November 2021","externalUrl":null,"permalink":"/talks/setting-data-in-motion-book/","section":"Talks \u0026 Publications","summary":"Co-author and section reviewer.\nA comprehensive adoption guide for technical executives, platform owners, and project managers evaluating or rolling out Confluent. Covers architecture patterns, operational considerations, and the organizational side of moving to event-driven platforms.\n","title":"Setting Data in Motion: The Definitive Guide to Adopting Confluent","type":"talks"},{"content":"Published on the Confluent Blog, March 2021.\nSelf-managing a Kafka cluster means wiring up your own monitoring. 
This post walks through how to export JMX data from Confluent clusters into Prometheus and Grafana with minimal setup. It became the reference blog series for connecting Confluent ecosystems to Prometheus-based monitoring stacks.\n","date":"29 March 2021","externalUrl":null,"permalink":"/talks/confluent-blog-prometheus-grafana/","section":"Talks \u0026 Publications","summary":"Published on the Confluent Blog, March 2021.\nSelf-managing a Kafka cluster means wiring up your own monitoring. This post walks through how to export JMX data from Confluent clusters into Prometheus and Grafana with minimal setup. It became the reference blog series for connecting Confluent ecosystems to Prometheus-based monitoring stacks.\n","title":"Monitor Kafka Clusters with Prometheus, Grafana, and Confluent","type":"talks"},{"content":"","date":"20 September 2019","externalUrl":null,"permalink":"/tags/capacity-planning/","section":"Tags","summary":"","title":"Capacity-Planning","type":"tags"},{"content":"Complex systems, practical tooling, and the urge to share what I learn. Years of building things that make distributed infrastructure less painful.\nWhat I do # I\u0026rsquo;m a Principal Customer Success Technical Architect at Confluent. I work with enterprise customers on Kafka architecture, design, and production operations.\nOn the side, I build open-source tools. Chitragupta handles infrastructure cost chargebacks. JMX Monitoring Stacks is the observability quickstart for Confluent Platform and Cloud. I\u0026rsquo;ve spoken at conferences like Current and Conf42, contributed to books, and generally spend a lot of time thinking about how to make distributed systems observable and accountable.\nBackground # Before Confluent, I spent years in the integration space working with TIBCO, MuleSoft, and other platforms. Everything from CRM migration interfaces to real-time monitoring systems. 
The common thread has always been the same: making complex things work reliably in production and building tools so others can do the same.\nGet in touch # GitHub LinkedIn ","externalUrl":null,"permalink":"/about/","section":"About Me","summary":"Complex systems, practical tooling, and the urge to share what I learn. Years of building things that make distributed infrastructure less painful.\nWhat I do # I’m a Principal Customer Success Technical Architect at Confluent. I work with enterprise customers on Kafka architecture, design, and production operations.\n","title":"About Me","type":"about"},{"content":"","externalUrl":null,"permalink":"/series/","section":"Series","summary":"","title":"Series","type":"series"}]