Advanced Scenarios

Multiple tenants

Each tenant must have a unique storage.connection_string. Shared databases are rejected at startup.

tenants:
  team-alpha:
    storage:
      connection_string: "sqlite:///data/alpha.db"
    ...
  team-beta:
    storage:
      connection_string: "sqlite:///data/beta.db"
    ...
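
The uniqueness rule can be checked before deployment with a few lines of Python. This is a minimal sketch, assuming the tenants section has been loaded into a plain dict that mirrors the YAML above; the function name `find_shared_connection_strings` is hypothetical and not part of the engine.

```python
from collections import defaultdict


def find_shared_connection_strings(tenants: dict) -> dict:
    """Return {connection_string: [tenant names]} for strings used by more than one tenant."""
    by_conn = defaultdict(list)
    for name, cfg in tenants.items():
        by_conn[cfg["storage"]["connection_string"]].append(name)
    return {conn: names for conn, names in by_conn.items() if len(names) > 1}


tenants = {
    "team-alpha": {"storage": {"connection_string": "sqlite:///data/alpha.db"}},
    "team-beta": {"storage": {"connection_string": "sqlite:///data/beta.db"}},
}
conflicts = find_shared_connection_strings(tenants)  # empty dict: no shared databases
```

A non-empty result names the tenants that share a database, which is the information the startup error is described as reporting.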

Custom granularity durations

Define or override the length of a chargeback period. Values are in hours (minimum 1):

plugin_settings:
  granularity_durations:
    "4h": 4
    "weekly": 168
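
To illustrate how such a mapping could be interpreted, here is a small sketch that converts each entry to a `timedelta` and enforces the one-hour minimum. It assumes the values are hours, which matches `"weekly": 168`; the function name is hypothetical and the engine's own parsing may differ.

```python
from datetime import timedelta


def parse_granularity_durations(raw: dict) -> dict:
    """Convert {name: hours} into {name: timedelta}, rejecting values under 1 hour."""
    parsed = {}
    for name, hours in raw.items():
        if hours < 1:
            raise ValueError(f"granularity {name!r}: duration must be at least 1 hour, got {hours!r}")
        parsed[name] = timedelta(hours=hours)
    return parsed


durations = parse_granularity_durations({"4h": 4, "weekly": 168})
```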

Allocator overrides

Replace a built-in allocator for a specific product type:

plugin_settings:
  allocator_overrides:
    KAFKA_NETWORK_READ: mymodule.custom_allocator

The value is a dotted import path to a callable matching the CostAllocator protocol.
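
For readers unfamiliar with dotted import paths, the sketch below shows one common way to resolve such a string to a callable using the standard library. It is an illustration only, not the engine's loader; `resolve_dotted_path` is a hypothetical name, and `json.dumps` stands in for `mymodule.custom_allocator` because `mymodule` is a placeholder.

```python
import importlib


def resolve_dotted_path(path: str):
    """Import 'package.module.attr' and return the attribute, which must be callable."""
    module_name, _, attr = path.rpartition(".")
    if not module_name:
        raise ValueError(f"{path!r} is not a dotted import path")
    obj = getattr(importlib.import_module(module_name), attr)
    if not callable(obj):
        raise TypeError(f"{path!r} does not resolve to a callable")
    return obj


# Demonstrated with a standard-library callable in place of a custom allocator.
dumps = resolve_dotted_path("json.dumps")
```

In practice this means the module in the override must be importable from the process running the pipeline, and the final path segment must name the callable itself.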

Identity resolution overrides

Replace identity resolution for a specific product type:

plugin_settings:
  identity_resolution_overrides:
    KAFKA_NETWORK_READ: mymodule.custom_resolver

Validation constraints

These cross-field constraints are enforced at startup:

| Constraint | Error if violated |
| --- | --- |
| lookback_days must be > cutoff_days | lookback_days must be > cutoff_days |
| CKU ratios must sum to 1.0 (CCloud) | kafka_cku_usage_ratio + kafka_cku_shared_ratio must equal 1.0 |
| Each tenant must have a unique storage.connection_string | Names the conflicting tenants |
| discovery_query required when identity_source.source is prometheus or both | discovery_query required |
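
The per-tenant constraints above can be sketched as a pre-flight check. This is a rough illustration under stated assumptions: the config is flattened into a single dict, `identity_source` holds the value of identity_source.source, and the float tolerance on the ratio sum is a choice of this sketch. The real field nesting and comparison may differ.

```python
import math


def check_cross_field_constraints(cfg: dict) -> list:
    """Return a list of error messages; an empty list means the config passes."""
    errors = []
    if cfg["lookback_days"] <= cfg["cutoff_days"]:
        errors.append("lookback_days must be > cutoff_days")
    ratio_sum = cfg["kafka_cku_usage_ratio"] + cfg["kafka_cku_shared_ratio"]
    # Compare with a tolerance so pairs such as 0.7 + 0.3 survive float rounding.
    if not math.isclose(ratio_sum, 1.0, abs_tol=1e-9):
        errors.append("kafka_cku_usage_ratio + kafka_cku_shared_ratio must equal 1.0")
    if cfg["identity_source"] in ("prometheus", "both") and not cfg.get("discovery_query"):
        errors.append("discovery_query required")
    return errors


good = {
    "lookback_days": 200,
    "cutoff_days": 5,
    "kafka_cku_usage_ratio": 0.7,
    "kafka_cku_shared_ratio": 0.3,
    "identity_source": "prometheus",
    "discovery_query": "up",  # placeholder query
}
errors = check_cross_field_constraints(good)
```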

Tuning parameters

These TenantConfig fields have sensible defaults but can be overridden. See the Configuration Guide — Pipeline Tuning for guidance on when and how to adjust these.

| Field | Type | Default | Description | When to change |
| --- | --- | --- | --- | --- |
| metrics_prefetch_workers | int | 4 | Parallel threads for metrics queries (1–20) | Increase for 100+ resources with fast Prometheus. Decrease if Prometheus is rate-limited. |
| zero_gather_deletion_threshold | int | -1 | Mark resources deleted after N consecutive zero-gather cycles (-1 = disabled) | Enable (e.g., 3) if you want automatic cleanup of decommissioned resources. Leave disabled if gather cycles are unreliable. |
| gather_failure_threshold | int | 5 | Consecutive gather failures before tenant is permanently suspended | Increase if transient API errors are common (rate limiting, network blips). Decrease to fail fast on bad credentials. |
| tenant_execution_timeout_seconds | int | 3600 | Per-tenant pipeline run timeout in seconds (0 = no timeout) | Increase for large backfills (200+ lookback days on first run). Decrease for alerting on stuck pipelines. |
| allocation_retry_limit | int | 3 | Max identity resolution retries before allocating to UNALLOCATED (1–10) | Increase if identity data arrives with a delay (eventual consistency). Rarely needs changing. |
| topic_attribution_retry_limit | int | 3 | Max Prometheus fetch retries per cluster before producing sentinel rows (1–10) | Increase if your Prometheus endpoint has intermittent outages longer than your pipeline run interval. Decrease to resolve dead clusters faster. |

Topic attribution overrides

Cost mapping overrides

Override the attribution method for specific CCloud product types. Valid methods: bytes_ratio, retained_bytes_ratio, even_split, disabled.

topic_attribution:
  enabled: true
  cost_mapping_overrides:
    KAFKA_PARTITION: even_split    # override: split partition costs evenly instead of by bytes
    KAFKA_BASE: disabled           # exclude base costs from topic attribution entirely

See the default cost mappings for the full list of product types and their default methods.
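
To make the difference between the methods concrete, the sketch below shows one plausible reading of their names: even_split divides a cost equally among topics, while the two ratio methods divide it in proportion to a per-topic byte count. These semantics are inferred from the names, not taken from the engine, and `split_cost` and the topic names are hypothetical.

```python
def split_cost(cost: float, topic_weights: dict, method: str) -> dict:
    """Split one cost line across topics. Semantics are assumed from the method names."""
    if method == "disabled" or not topic_weights:
        return {}
    if method == "even_split":
        share = cost / len(topic_weights)
        return {topic: share for topic in topic_weights}
    if method in ("bytes_ratio", "retained_bytes_ratio"):
        total = sum(topic_weights.values())
        if total == 0:
            # No bytes recorded: fall back to an even split (an assumption of this sketch).
            return split_cost(cost, topic_weights, "even_split")
        return {topic: cost * weight / total for topic, weight in topic_weights.items()}
    raise ValueError(f"unknown attribution method: {method!r}")


usage = {"orders": 30, "payments": 10}  # bytes per topic, illustrative values
by_bytes = split_cost(100.0, usage, "bytes_ratio")
evenly = split_cost(100.0, usage, "even_split")
```

Under this reading, overriding KAFKA_PARTITION to even_split would charge a quiet topic the same partition cost as a busy one.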

Metric name overrides

If your Prometheus instance uses non-standard metric names for CCloud topic-level metrics:

topic_attribution:
  enabled: true
  metric_name_overrides:
    topic_bytes_in: my_custom_received_bytes
    topic_bytes_out: my_custom_sent_bytes
    topic_retained_bytes: my_custom_retained_bytes

Valid keys: topic_bytes_in, topic_bytes_out, topic_retained_bytes.
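
A sketch of how overrides could be merged with defaults while rejecting keys outside the valid set. The default metric names here are placeholders invented for the example, and `apply_metric_name_overrides` is a hypothetical helper; consult the engine for the real defaults.

```python
VALID_KEYS = {"topic_bytes_in", "topic_bytes_out", "topic_retained_bytes"}

# Placeholder defaults for illustration only; not the engine's actual metric names.
DEFAULTS = {
    "topic_bytes_in": "default_bytes_in_metric",
    "topic_bytes_out": "default_bytes_out_metric",
    "topic_retained_bytes": "default_retained_bytes_metric",
}


def apply_metric_name_overrides(defaults: dict, overrides: dict) -> dict:
    """Return defaults updated with overrides; raise on keys outside VALID_KEYS."""
    unknown = set(overrides) - VALID_KEYS
    if unknown:
        raise ValueError(f"unknown metric_name_overrides keys: {sorted(unknown)}")
    return {**defaults, **overrides}


names = apply_metric_name_overrides(DEFAULTS, {"topic_bytes_in": "my_custom_received_bytes"})
```

Keys that are not overridden keep their default, so a partial override block is enough when only one metric is renamed.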

API server configuration

api:
  host: 0.0.0.0
  port: 8080                    # 1–65535
  request_timeout_seconds: 30   # 1–300, returns HTTP 504 on timeout
  enable_cors: true
  cors_origins:
    - "https://your-dashboard.example.com"

Metrics authentication

Basic auth

metrics:
  url: https://prometheus.example.com
  auth_type: basic
  username: ${PROM_USER}
  password: ${PROM_PASS}

Bearer token

metrics:
  url: https://prometheus.example.com
  auth_type: bearer
  bearer_token: ${PROM_TOKEN}

Validation rules: basic requires both username and password; bearer requires bearer_token; none rejects any credentials. Don't mix auth_type with credential fields that belong to a different type.
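
The rules above can be expressed as a small validator. This is a sketch, not the engine's code: `validate_metrics_auth` is a hypothetical name, treating a missing auth_type as none is an assumption, and rejecting mixed credentials outright is this sketch's interpretation of the advice not to mix them.

```python
def validate_metrics_auth(metrics: dict) -> None:
    """Raise ValueError if the credential fields do not match auth_type."""
    auth_type = metrics.get("auth_type", "none")  # default is an assumption of this sketch
    username = metrics.get("username")
    password = metrics.get("password")
    token = metrics.get("bearer_token")

    if auth_type == "basic":
        if not (username and password):
            raise ValueError("auth_type 'basic' requires both username and password")
        if token:
            raise ValueError("auth_type 'basic' must not set bearer_token")
    elif auth_type == "bearer":
        if not token:
            raise ValueError("auth_type 'bearer' requires bearer_token")
        if username or password:
            raise ValueError("auth_type 'bearer' must not set username or password")
    elif auth_type == "none":
        if username or password or token:
            raise ValueError("auth_type 'none' rejects any credentials")
    else:
        raise ValueError(f"unknown auth_type: {auth_type!r}")


# Passes silently: basic auth with both fields present.
validate_metrics_auth({"auth_type": "basic", "username": "u", "password": "p"})
```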

Custom plugins path

plugins_path: /opt/custom_plugins

A relative path is resolved against the current working directory; an absolute path is used as-is. Each subdirectory is treated as a plugin package. Requirements:

  • Each plugin subdirectory must contain __init__.py
  • __init__.py must expose register() returning (ecosystem_name: str, factory: Callable)
  • No sys.path modification needed — file-based import is used automatically for external paths

If a plugin fails to load, a warning with full traceback is logged (including the reason, e.g. missing __init__.py) and the engine continues loading remaining plugins.
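
A minimal plugin skeleton that satisfies the requirements listed above might look like the following. The directory name, the ecosystem name, the `MyEcosystemPlugin` class, and the factory's call signature are all hypothetical; only the shape of `register()` comes from this page.

```python
# /opt/custom_plugins/my_ecosystem/__init__.py  (hypothetical layout)
from typing import Any, Callable, Tuple


class MyEcosystemPlugin:
    """Placeholder plugin object; the real interface is defined by the engine."""

    def __init__(self, config: Any = None) -> None:
        self.config = config


def register() -> Tuple[str, Callable[..., Any]]:
    """Return the ecosystem name and a factory that builds the plugin."""
    return "my_ecosystem", MyEcosystemPlugin
```

Here the class itself serves as the factory, since calling it returns a new plugin instance. A plain function returning the instance would work equally well.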

Emitter aggregation

Emitters receive rows aggregated to the requested period:

emitters:
  - type: csv
    aggregation: monthly     # one file per month
    params:
      output_dir: ./monthly-output
  - type: csv
    aggregation: daily       # one file per day (same data, different granularity)
    params:
      output_dir: ./daily-output

Emitters may not request finer granularity than chargeback_granularity produces.
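
The granularity rule can be sketched as an ordering check. The ordered list below is an assumption: daily and monthly appear on this page, while hourly is assumed for illustration, and custom granularities are not handled. `check_emitter_aggregation` is a hypothetical helper.

```python
# Finest to coarsest. "hourly" is assumed; custom granularities are not covered here.
GRANULARITY_ORDER = ["hourly", "daily", "monthly"]


def check_emitter_aggregation(chargeback_granularity: str, emitters: list) -> list:
    """Return an error message for each emitter asking for finer data than is produced."""
    base = GRANULARITY_ORDER.index(chargeback_granularity)
    errors = []
    for i, emitter in enumerate(emitters):
        agg = emitter.get("aggregation")
        if agg is not None and GRANULARITY_ORDER.index(agg) < base:
            errors.append(
                f"emitter {i} ({emitter.get('type')}): aggregation {agg!r} is finer "
                f"than chargeback_granularity {chargeback_granularity!r}"
            )
    return errors


emitters = [
    {"type": "csv", "aggregation": "monthly"},
    {"type": "csv", "aggregation": "daily"},
]
ok = check_emitter_aggregation("daily", emitters)
too_fine = check_emitter_aggregation("monthly", emitters)
```

With daily chargeback data both emitters are valid; with monthly data the daily emitter is flagged, because daily rows cannot be recovered from monthly totals.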

Per-module log levels

logging:
  level: INFO     # CRITICAL | ERROR | WARNING | INFO | DEBUG
  per_module_levels:
    core.metrics.prometheus: DEBUG
    plugins.confluent_cloud: DEBUG
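
The effect of this block can be approximated with Python's standard logging module. This sketch assumes the keys under per_module_levels correspond to Python logger names, which is the usual convention but is not confirmed on this page; `apply_log_levels` is a hypothetical helper.

```python
import logging


def apply_log_levels(cfg: dict) -> None:
    """Set the root level, then override the level of each named logger."""
    logging.getLogger().setLevel(cfg.get("level", "INFO"))
    for module, level in cfg.get("per_module_levels", {}).items():
        logging.getLogger(module).setLevel(level)


apply_log_levels(
    {
        "level": "INFO",
        "per_module_levels": {
            "core.metrics.prometheus": "DEBUG",
            "plugins.confluent_cloud": "DEBUG",
        },
    }
)
```

Because Python loggers are hierarchical, a DEBUG setting on plugins.confluent_cloud also applies to its child loggers, while unrelated modules keep the root INFO level.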