Confluent Cloud Configuration Reference

New to Confluent Cloud configuration?

Read the Configuration Guide first for a walkthrough of the decisions you'll make, then come back here for the full field reference.

ecosystem key

ecosystem: confluent_cloud

Full example

tenants:
  my-ccloud-org:
    ecosystem: confluent_cloud
    tenant_id: my-ccloud-org       # internal partition key (not the CCloud org ID)
    lookback_days: 200
    cutoff_days: 5
    retention_days: 250
    storage:
      connection_string: "sqlite:///data/ccloud.db"
    plugin_settings:
      ccloud_api:
        key: ${CCLOUD_API_KEY}
        secret: ${CCLOUD_API_SECRET}
      billing_api:
        days_per_query: 15
      metrics:
        type: prometheus
        url: https://api.telemetry.confluent.cloud
        auth_type: basic
        username: ${METRICS_API_KEY}
        password: ${METRICS_API_SECRET}
      flink:
        - region_id: us-east-1
          key: ${FLINK_API_KEY}
          secret: ${FLINK_API_SECRET}
      emitters:
        - type: csv
          aggregation: daily
          params:
            output_dir: ./output
      chargeback_granularity: daily
      topic_attribution:
        enabled: true
        exclude_topic_patterns:
          - "__consumer_offsets"
          - "_schemas"
          - "_confluent-*"
        missing_metrics_behavior: even_split
        retention_days: 90
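
The `${...}` values in the example are placeholders for secrets supplied from the environment. How the loader resolves them is not described here; the sketch below only illustrates the idea using Python's `string.Template`, and the helper name `expand_placeholders` is hypothetical.

```python
from string import Template


def expand_placeholders(value: str, env: dict) -> str:
    """Replace ${NAME} placeholders with values from `env`.

    Assumption: placeholders follow the ${NAME} form shown in the example.
    A missing variable raises KeyError, so a forgotten secret fails loudly
    instead of being sent to the API as a literal "${...}" string.
    """
    return Template(value).substitute(env)


# Illustrative environment; in practice this would be os.environ.
fake_env = {"CCLOUD_API_KEY": "abc123"}
resolved_key = expand_placeholders("${CCLOUD_API_KEY}", fake_env)
```

Keeping secrets out of the YAML file this way means the same config can be committed and reused across environments.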

TenantConfig fields

Field	Type	Default	Description
`ecosystem`	string	required	Must be `confluent_cloud`
`tenant_id`	string	required	Unique partition key for DB records. Can be any string (e.g. `prod`, `acme-corp`). This is not your Confluent Cloud Organization ID — it is an internal label used to isolate data across tenants in the database.
`lookback_days`	int	200	Days of billing history to fetch (max 364). Must be > `cutoff_days`.
`cutoff_days`	int	5	Skip dates within this many days of today (billing lag, max 30)
`retention_days`	int	250	Delete data older than this (max 730)
`allocation_retry_limit`	int	3	Max identity resolution retries before fallback (max 10)
`topic_attribution_retry_limit`	int	3	Max Prometheus fetch retries per cluster before producing sentinel rows (1–10). See Topic attribution retry behavior.
`gather_failure_threshold`	int	5	Consecutive gather failures before tenant suspension
`tenant_execution_timeout_seconds`	int	3600	Per-tenant run timeout (0 = no timeout)
`metrics_prefetch_workers`	int	4	Parallel metrics query threads (1–20)
`zero_gather_deletion_threshold`	int	-1	Mark resources deleted after N zero-gather cycles (-1 = disabled)
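
The bounds in the table interact: `lookback_days` has its own maximum and must also exceed `cutoff_days`. As a rough illustration, the checks can be written as follows. `TenantLimits` and `problems` are hypothetical names for this sketch, not the project's actual config class, and only three of the fields are covered.

```python
from dataclasses import dataclass


@dataclass
class TenantLimits:
    """Hypothetical stand-in for three tenant-level fields from the table."""
    lookback_days: int = 200
    cutoff_days: int = 5
    retention_days: int = 250

    def problems(self) -> list[str]:
        """Return human-readable violations of the documented bounds."""
        found = []
        if self.lookback_days > 364:
            found.append("lookback_days exceeds the maximum of 364")
        if self.cutoff_days > 30:
            found.append("cutoff_days exceeds the maximum of 30")
        if self.retention_days > 730:
            found.append("retention_days exceeds the maximum of 730")
        if self.lookback_days <= self.cutoff_days:
            found.append("lookback_days must be greater than cutoff_days")
        return found


# The documented defaults satisfy every bound.
default_problems = TenantLimits().problems()
```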

plugin_settings fields (CCloud)

Field	Type	Default	Description
`ccloud_api.key`	string	required	CCloud API key
`ccloud_api.secret`	secret	required	CCloud API secret
`billing_api.days_per_query`	int	15	Days per billing API request (max 30)
`metrics.url`	string	optional	Prometheus/Telemetry API URL
`metrics.auth_type`	enum	`none`	`basic`, `bearer`, or `none`
`metrics.username`	string	optional	For `auth_type: basic`
`metrics.password`	secret	optional	For `auth_type: basic`
`metrics.bearer_token`	secret	optional	For `auth_type: bearer`
`flink`	list	optional	Per-region Flink API credentials
`chargeback_granularity`	enum	`daily`	`hourly`, `daily`, or `monthly`
`metrics_step_seconds`	int	3600	Prometheus query step (lower = finer granularity)
`min_refresh_gap_seconds`	int	1800	Minimum time between pipeline runs for this tenant

Handled product types

Each product type from the CCloud billing API is routed to a handler that knows how to resolve identities and allocate costs for that service. The allocation strategy reflects the nature of the cost: usage-driven costs are split by measured consumption, and shared costs are split evenly.

Handler	Product types	Allocation strategy	Why
`kafka`	`KAFKA_NUM_CKU`, `KAFKA_NUM_CKUS`	Hybrid: 70% usage ratio (bytes), 30% even split	CKUs are the main Kafka compute cost. Part of the cost is driven by traffic volume (usage), part is base infrastructure overhead (shared). The 70/30 default is configurable via `allocator_params`.
`kafka`	`KAFKA_NETWORK_READ`, `KAFKA_NETWORK_WRITE`	Usage ratio (bytes in/out per principal)	Network transfer is directly attributable to the principal that produced or consumed the data. Requires Telemetry API metrics.
`kafka`	`KAFKA_BASE`, `KAFKA_PARTITION`, `KAFKA_STORAGE`	Even split	Base fees, partition counts, and storage are cluster-level costs with no per-principal usage metric.
`schema_registry`	`SCHEMA_REGISTRY`, `GOVERNANCE_BASE`, `NUM_RULES`	Even split	Schema Registry is a shared service — all principals benefit equally from schema validation.
`connector`	`CONNECT_CAPACITY`, `CONNECT_NUM_TASKS`, `CONNECT_THROUGHPUT`, `CUSTOM_CONNECT_*`	Even split per connector	Connectors are typically owned by teams. Costs are split among identities active on the connector's resource.
`ksqldb`	`KSQL_NUM_CSU`, `KSQL_NUM_CSUS`	Even split	ksqlDB compute units are application-level — split across active identities.
`flink`	`FLINK_NUM_CFU`, `FLINK_NUM_CFUS`	Usage ratio by statement owner CFU consumption	Flink CFU costs are directly traceable to the user who created the SQL statement. Uses a 4-tier chain: statement owner → active identities → period identities → resource.
`org_wide`	`AUDIT_LOG_READ`, `SUPPORT`	Even split across tenant, then to UNALLOCATED	Org-wide costs have no resource or principal — they apply to the whole organization.
`default`	`TABLEFLOW_*`	Shared (to resource)	New product types without a dedicated handler fall back to resource-level allocation.
`default`	`CLUSTER_LINKING_*`	Usage (to resource)	Cluster linking costs are attributed to the linked resource.

Unknown product types are allocated to UNALLOCATED. Check the allocation_detail field on chargeback rows to understand which fallback tier was used.

See How Costs Work for the complete allocation model including the fallback chain and composite CKU allocation.

Allocator params

Override default allocation ratios for Kafka CKU costs:

allocator_params:
  kafka_cku_usage_ratio: 0.70   # fraction allocated by bytes (default 0.70)
  kafka_cku_shared_ratio: 0.30  # fraction allocated evenly (default 0.30)

kafka_cku_usage_ratio and kafka_cku_shared_ratio must sum to 1.0 (tolerance: 0.0001). Startup fails if they don't.
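
The tolerance matters because binary floating point cannot represent most decimal fractions exactly, so an exact equality test on the sum is fragile. A minimal sketch of such a check, assuming an absolute tolerance of 0.0001 (the function name `check_cku_ratios` is hypothetical, not the project's validator):

```python
import math


def check_cku_ratios(usage_ratio: float = 0.70, shared_ratio: float = 0.30) -> None:
    """Raise ValueError unless the two ratios sum to 1.0 within 0.0001."""
    total = usage_ratio + shared_ratio
    # abs_tol mirrors the documented tolerance; rel_tol is disabled so the
    # comparison is purely absolute.
    if not math.isclose(total, 1.0, rel_tol=0.0, abs_tol=1e-4):
        raise ValueError(
            f"kafka_cku_usage_ratio + kafka_cku_shared_ratio = {total}, expected 1.0"
        )


check_cku_ratios(0.70, 0.30)  # passes
check_cku_ratios(0.90, 0.10)  # passes
```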

How to think about the ratio

The usage portion is allocated proportionally to bytes_in + bytes_out per principal. The shared portion is split evenly across all active identities.

High usage ratio (0.90/0.10): Heavy producers/consumers pay proportionally more. Good when your cluster is right-sized and traffic volume drives cost.
Balanced (0.70/0.30): Default. Acknowledges that the cluster has a base cost regardless of traffic.
Higher shared ratio (0.50/0.50): Spreads cost more evenly. Good when the cluster is over-provisioned and most cost is fixed overhead.

If metrics are unavailable for a billing window, the usage portion falls back to an even split. At 1.0/0.0, that means the entire CKU cost is split evenly whenever Telemetry API data is missing. See How Costs Work for a worked example.
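
To make the arithmetic concrete, here is a small sketch of the hybrid split described above. It assumes the active identities are exactly the keys of the input mapping and that each value is already `bytes_in + bytes_out` for that principal. `allocate_cku` is a hypothetical helper, not the project's allocator.

```python
def allocate_cku(cost, bytes_by_principal, usage_ratio=0.70, shared_ratio=0.30):
    """Split a CKU cost: usage portion by bytes, shared portion evenly."""
    principals = list(bytes_by_principal)
    if not principals:
        return {}
    total_bytes = sum(bytes_by_principal.values())
    even_share = 1.0 / len(principals)
    allocation = {}
    for principal in principals:
        if total_bytes > 0:
            usage_share = bytes_by_principal[principal] / total_bytes
        else:
            # No metrics for the window: the usage portion is split evenly too.
            usage_share = even_share
        allocation[principal] = cost * (
            usage_ratio * usage_share + shared_ratio * even_share
        )
    return allocation


with_metrics = allocate_cku(100.0, {"team-a": 900, "team-b": 100})
without_metrics = allocate_cku(100.0, {"team-a": 0, "team-b": 0})
```

With the default 0.70/0.30 ratios and a 900/100 byte split, a cost of 100 comes out at roughly 78 for `team-a` and 22 for `team-b`. With no metrics, both receive 50.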

Topic attribution

Topic attribution is an optional overlay stage that breaks Kafka cluster costs down to individual topics using Prometheus metrics. It runs after chargeback calculation and writes results to a separate star-schema table.

Prerequisite: Topic-level CCloud metrics (received_bytes, sent_bytes, retained_bytes per topic) must be scraped into the Prometheus instance configured under plugin_settings.metrics.

topic_attribution:
  enabled: true                           # off by default
  exclude_topic_patterns:
    - "__consumer_offsets"                # default exclusions
    - "_schemas"
    - "_confluent-*"
  missing_metrics_behavior: even_split   # even_split | skip
  retention_days: 90                     # 1–365, independent of tenant retention_days
  cost_mapping_overrides:                # override per product type
    KAFKA_PARTITION: even_split
    KAFKA_BASE: disabled
  metric_name_overrides:                 # override Prometheus metric names
    topic_bytes_in: custom_received_bytes
  emitters: []                           # same format as top-level emitters

Requires metrics to be configured

topic_attribution.enabled: true requires a metrics section in plugin_settings. Config validation rejects the combination of enabled: true and no metrics source — startup fails with a ValidationError rather than silently producing even-split attribution from zero data.

When Chitragupta starts in API mode without full plugin validation (e.g., using a raw config that hasn't been fully parsed as CCloudPluginConfig), this misconfiguration is surfaced at API query time instead of startup. In that case, GET /api/v1/tenants and GET /api/v1/readiness return topic_attribution_status: "config_error" with a human-readable topic_attribution_error message. The frontend displays a red "Config error" badge in the sidebar and a red alert on the Topic Attribution page. Correct the config and restart to resolve.

`topic_attribution` fields

Field	Type	Default	Description
`enabled`	bool	`false`	Enable the topic overlay stage
`exclude_topic_patterns`	list[string]	`["__consumer_offsets", "_schemas", "_confluent-*"]`	Glob patterns for topics to skip. Matched with `fnmatch`.
`missing_metrics_behavior`	enum	`even_split`	What to do when metrics are all-zero or unavailable: `even_split` distributes evenly; `skip` omits the cluster from output.
`retention_days`	int	`90`	Days to keep topic attribution rows (1–365). Independent of the tenant-level `retention_days`.
`cost_mapping_overrides`	dict[string, string]	`{}`	Override the attribution method per CCloud product type. Valid methods: `bytes_ratio`, `retained_bytes_ratio`, `even_split`, `disabled`.
`metric_name_overrides`	dict[string, string]	`{}`	Override Prometheus metric names. Valid keys: `topic_bytes_in`, `topic_bytes_out`, `topic_retained_bytes`.
`emitters`	list	`[]`	Emitter specs for topic attribution output. Same format as top-level `emitters`. For `type: csv`, `output_dir` defaults to `/tmp/topic_attribution` and `filename_template` defaults to `topic_attr_{tenant_id}_{date}.csv` — both are overridable per-spec.
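
Because `exclude_topic_patterns` uses glob matching rather than substring or regex matching, a pattern must match the whole topic name. The sketch below shows the effect of the default patterns using Python's standard `fnmatch` module. It uses `fnmatchcase` so the result is the same on every platform; whether the pipeline calls `fnmatch` or `fnmatchcase` is an assumption, and `is_excluded` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

DEFAULT_EXCLUDES = ["__consumer_offsets", "_schemas", "_confluent-*"]


def is_excluded(topic, patterns=DEFAULT_EXCLUDES):
    """Return True if the topic name matches any glob pattern in full."""
    return any(fnmatchcase(topic, pattern) for pattern in patterns)


topics = [
    "orders",
    "_schemas",
    "_confluent-metrics",
    "__consumer_offsets",
    "payments_schemas",  # kept: "_schemas" must match the entire name
]
kept = [topic for topic in topics if not is_excluded(topic)]
```

To exclude every topic ending in `_schemas`, the pattern would need a leading wildcard, for example `*_schemas`.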

Default cost mappings

Product type	Attribution method	Metric used
`KAFKA_NETWORK_WRITE`	bytes_ratio	`topic_bytes_in`
`KAFKA_NETWORK_READ`	bytes_ratio	`topic_bytes_out`
`KAFKA_STORAGE`	retained_bytes_ratio	`topic_retained_bytes`
`KAFKA_PARTITION`	even_split	—
`KAFKA_BASE`	even_split	—
`KAFKA_NUM_CKU` / `KAFKA_NUM_CKUS`	bytes_ratio	`topic_bytes_in` + `topic_bytes_out`

For bytes_ratio product types, if Prometheus returns all-zero values, the missing_metrics_behavior setting determines the fallback.
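
The interaction between `bytes_ratio` and `missing_metrics_behavior` can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's code: `attribute_by_bytes` is a hypothetical helper, and the input maps each non-excluded topic to its byte count for the billing window.

```python
def attribute_by_bytes(cost, bytes_by_topic, missing_metrics_behavior="even_split"):
    """Split a cluster cost across topics in proportion to bytes."""
    if missing_metrics_behavior not in ("even_split", "skip"):
        raise ValueError(f"unknown behavior: {missing_metrics_behavior}")
    total = sum(bytes_by_topic.values())
    if total > 0:
        return {topic: cost * b / total for topic, b in bytes_by_topic.items()}
    # All-zero or missing metrics: apply the configured fallback.
    if missing_metrics_behavior == "skip" or not bytes_by_topic:
        return {}  # the cluster is omitted from the output
    share = cost / len(bytes_by_topic)
    return {topic: share for topic in bytes_by_topic}


by_ratio = attribute_by_bytes(60.0, {"orders": 300, "payments": 100})
fallback = attribute_by_bytes(60.0, {"orders": 0, "payments": 0})
skipped = attribute_by_bytes(60.0, {"orders": 0, "payments": 0}, "skip")
```

Here `by_ratio` gives `orders` three quarters of the cost, `fallback` gives each topic half, and `skipped` is empty.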

Topic attribution retry behavior

When Prometheus is unreachable for a cluster, the pipeline leaves that date pending and retries on the next run. The per-billing-line attempt count is stored in billing.topic_attribution_attempts (migration 016).

Once all billing lines for a cluster reach topic_attribution_retry_limit attempts without a successful metrics fetch, the pipeline produces sentinel rows instead of retrying indefinitely:

topic_name: __UNATTRIBUTED__
attribution_method: ATTRIBUTION_FAILED
amount: full cluster cost (no money is silently lost)

The date is marked calculated only when every cluster is resolved, whether with real attribution rows, empty metrics, or sentinel rows. Clusters still below the retry limit return pending and do not block other clusters from being marked.
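
The retry flow can be pictured as a small state machine. The sketch below is a simplified model with several assumptions: the attempt counter is incremented on each failed fetch, one sentinel row is emitted per cluster, and the status names other than `pending` are invented for illustration. `BillingLine`, `resolve_cluster`, and `date_is_calculated` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class BillingLine:
    cost: float
    attempts: int = 0  # mirrors billing.topic_attribution_attempts


def resolve_cluster(lines, metrics_available, retry_limit=3):
    """Return (status, sentinel_rows) for one cluster on one date."""
    if metrics_available:
        return "attributed", []  # real attribution rows would be built here
    for line in lines:
        line.attempts += 1
    if all(line.attempts >= retry_limit for line in lines):
        sentinel = {
            "topic_name": "__UNATTRIBUTED__",
            "attribution_method": "ATTRIBUTION_FAILED",
            "amount": sum(line.cost for line in lines),  # full cluster cost
        }
        return "failed", [sentinel]
    return "pending", []


def date_is_calculated(statuses):
    """A date is done only when no cluster is still pending."""
    return all(status != "pending" for status in statuses)
```

With the default limit of 3, a cluster whose Prometheus fetch fails on three consecutive runs stays `pending` twice and then yields a sentinel row carrying the summed cost of its billing lines.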

To identify sentinel rows in the database or API:

SELECT * FROM topic_attribution_facts
JOIN topic_attribution_dimensions USING (dimension_id)
WHERE attribution_method = 'ATTRIBUTION_FAILED';

Emitters

emitters:
  - type: csv
    aggregation: daily       # rows aggregated to daily before writing
    params:
      output_dir: /data/csv
      filename_template: "{tenant_id}_{date}.csv"

aggregation options: null (as-is), hourly, daily, monthly.
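
As a rough picture of what `aggregation: daily` and `filename_template` do together, the sketch below rolls hourly rows up to one row per day and identity, then renders a file name. The row shape, the helper `aggregate_daily`, and the use of `str.format` for the template are assumptions made for illustration; the emitter's real row schema is not described here.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_daily(rows):
    """Sum (timestamp, identity, amount) rows per calendar day and identity."""
    totals = defaultdict(float)
    for timestamp, identity, amount in rows:
        totals[(timestamp.date().isoformat(), identity)] += amount
    return dict(totals)


hourly_rows = [
    (datetime(2024, 5, 1, 9), "team-a", 1.25),
    (datetime(2024, 5, 1, 10), "team-a", 2.5),
    (datetime(2024, 5, 2, 9), "team-a", 4.0),
]
daily = aggregate_daily(hourly_rows)

# Placeholders match the template shown in the emitter example.
filename = "{tenant_id}_{date}.csv".format(tenant_id="my-ccloud-org", date="2024-05-01")
```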