
Monitoring & Observability

Cosmictron exposes a Prometheus scrape endpoint at :9090/metrics (configurable).

| Metric | Type | Description |
|---|---|---|
| cosmictron_reducer_calls_total | Counter | Total reducer invocations, labeled by module, reducer, status |
| cosmictron_reducer_duration_seconds | Histogram | Reducer execution latency |
| cosmictron_reducer_fuel_used | Histogram | Fuel consumed per reducer call |
| cosmictron_wal_writes_total | Counter | WAL write operations |
| cosmictron_wal_bytes_written_total | Counter | Bytes written to WAL |
| cosmictron_wal_fsync_duration_seconds | Histogram | fsync latency |
| cosmictron_subscription_count | Gauge | Active WebSocket subscriptions |
| cosmictron_subscription_delta_rate | Gauge | Rows/s delivered to subscribers |
| cosmictron_storage_pages_total | Gauge | Total pages in table store |
| cosmictron_storage_bytes_used | Gauge | Bytes used in data directory |
| cosmictron_auth_requests_total | Counter | Auth requests by type and status |
| cosmictron_connections_active | Gauge | Active WebSocket connections |
| cosmictron_compliance_signatures_total | Counter | Events signed (if signing enabled) |
| cosmictron_compliance_tsa_requests_total | Counter | TSA timestamp requests (if enabled) |

scrape_configs:
  - job_name: cosmictron
    static_configs:
      - targets: ['cosmictron-host:9090']
    scrape_interval: 15s
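
To sanity-check the endpoint without a full Prometheus setup, the scraped text can be parsed directly. The following is a minimal sketch, assuming the endpoint returns the standard Prometheus text exposition format. The sample payload, its label values (including `status="ok"`), and the numbers are made up for illustration, and `parse_metrics` is a hypothetical helper, not part of Cosmictron.

```python
import re
from collections import defaultdict

# Hypothetical sample of what :9090/metrics might return; values are invented.
SAMPLE_METRICS = """\
# HELP cosmictron_reducer_calls_total Total reducer invocations
# TYPE cosmictron_reducer_calls_total counter
cosmictron_reducer_calls_total{module="my-agent",reducer="add",status="ok"} 1200
cosmictron_reducer_calls_total{module="my-agent",reducer="add",status="error"} 30
cosmictron_subscription_count 47
"""

# metric_name, optional {labels}, then the sample value
SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# one key="value" pair inside the braces (escaped characters allowed in the value)
LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return {metric_name: [(labels_dict, value), ...]} for every sample line."""
    samples = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # ignore anything that is not a plain sample line
        name, raw_labels, value = match.groups()
        labels = dict(LABEL.findall(raw_labels or ""))
        samples[name].append((labels, float(value)))
    return dict(samples)


metrics = parse_metrics(SAMPLE_METRICS)
error_calls = sum(
    value
    for labels, value in metrics["cosmictron_reducer_calls_total"]
    if labels.get("status") == "error"
)
print(f"reducer calls with status=error: {error_calls}")
```

In practice the text would come from an HTTP GET against the metrics endpoint rather than from an inline string.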

Cosmictron emits traces for all reducer calls, queries, and subscription events via OTLP.

[telemetry]
otlp_endpoint = "http://otel-collector:4317"

Trace attributes:

  • cosmictron.reducer — reducer name
  • cosmictron.module — module name
  • cosmictron.identity — caller identity hash
  • cosmictron.tx_id — transaction ID
  • db.system = cosmictron

Log format: JSON (production) or pretty (development).

[telemetry]
log_level = "info" # trace | debug | info | warn | error
log_format = "json" # "json" | "pretty"

Environment override:

COSMICTRON_LOG=debug,cosmictron_wal=trace

Standard Rust env-filter format is supported for fine-grained control per module.
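
To make the directive syntax concrete, here is a simplified sketch of how a value such as `debug,cosmictron_wal=trace` breaks down into a default level plus per-module overrides. It covers only the two forms used above (a bare level and `target=level`); the real Rust filter syntax has more features, and `parse_log_filter` is a hypothetical helper written for illustration, not Cosmictron's parser.

```python
LEVELS = ("trace", "debug", "info", "warn", "error")


def parse_log_filter(spec):
    """Split a filter string into (default_level, {target: level}).

    Only handles bare levels and target=level directives.
    """
    default_level = None
    per_target = {}
    for directive in spec.split(","):
        directive = directive.strip()
        if not directive:
            continue
        if "=" in directive:
            # target=level: override the level for one module
            target, level = directive.split("=", 1)
        else:
            # bare level: applies to everything without an override
            target, level = None, directive
        level = level.strip().lower()
        if level not in LEVELS:
            raise ValueError(f"unknown log level: {level!r}")
        if target is None:
            default_level = level
        else:
            per_target[target.strip()] = level
    return default_level, per_target


default_level, overrides = parse_log_filter("debug,cosmictron_wal=trace")
print(default_level, overrides)
```

Read this way, the example sets everything to `debug` and raises only the `cosmictron_wal` module to `trace`.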

| Endpoint | Description |
|---|---|
| GET /v1/health | Basic liveness: returns 200 OK if the process is running |
| GET /v1/health/ready | Readiness: returns 200 OK if WAL is initialized and storage is healthy |
| GET /v1/health/detailed | Full diagnostics: WAL status, module health, subscription count |

Example response from /v1/health/detailed:

{
  "status": "healthy",
  "wal": { "status": "ok", "pending_segments": 0 },
  "storage": { "status": "ok", "pages": 120450, "bytes_used": 985040896 },
  "modules": [
    { "name": "my-agent", "status": "active", "reducers": 5 }
  ],
  "subscriptions": { "active": 47 },
  "uptime_secs": 86400
}

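
A monitoring script could reduce this payload to a list of problems. The sketch below assumes that any value other than `"healthy"`, `"ok"`, or `"active"` signals trouble; the example response only shows those values, so the other states (such as `"degraded"`) are assumptions, and `summarize_health` is a hypothetical helper.

```python
import json

# The example response shown above, inlined so the sketch is self-contained.
EXAMPLE_RESPONSE = """
{
  "status": "healthy",
  "wal": { "status": "ok", "pending_segments": 0 },
  "storage": { "status": "ok", "pages": 120450, "bytes_used": 985040896 },
  "modules": [
    { "name": "my-agent", "status": "active", "reducers": 5 }
  ],
  "subscriptions": { "active": 47 },
  "uptime_secs": 86400
}
"""


def summarize_health(payload):
    """Return a list of human-readable problems found in the payload."""
    problems = []
    if payload.get("status") != "healthy":
        problems.append(f"overall status is {payload.get('status')!r}")
    for section in ("wal", "storage"):
        status = payload.get(section, {}).get("status")
        if status != "ok":
            problems.append(f"{section} status is {status!r}")
    for module in payload.get("modules", []):
        if module.get("status") != "active":
            problems.append(
                f"module {module.get('name')!r} is {module.get('status')!r}"
            )
    return problems


problems = summarize_health(json.loads(EXAMPLE_RESPONSE))
print("healthy" if not problems else problems)
```

The payload itself can be fetched from `/v1/health/detailed` with any HTTP client; the host and port depend on the deployment.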
| Alert | Condition | Severity |
|---|---|---|
| High reducer error rate | rate(cosmictron_reducer_calls_total{status="error"}[5m]) > 0.05 | Warning |
| WAL fsync latency (p99) | histogram_quantile(0.99, rate(cosmictron_wal_fsync_duration_seconds_bucket[5m])) > 0.5 | Warning |
| Storage near full | cosmictron_storage_bytes_used / disk_total > 0.85 | Critical |
| No liveness | up{job="cosmictron"} == 0 | Critical |
| TSA failures | rate(cosmictron_compliance_tsa_requests_total{status="error"}[15m]) > 0 | Warning |
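
The `rate()` conditions compare a per-second average, so a threshold of 0.05 on the reducer alert means roughly 0.05 errors per second over the window, not a 5% error ratio. The following simplified sketch shows that calculation on made-up counter samples. Prometheus additionally extrapolates to the window boundaries, which this hypothetical `simple_rate` helper omits.

```python
def simple_rate(samples):
    """Approximate per-second increase of a counter.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    Returns None when there is not enough data to compute a rate.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            increase += current - previous
        else:
            # A drop means the counter was reset (for example after a
            # restart), so the new value counts as growth from zero.
            increase += current
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Invented samples: error counter scraped at 0 s, 150 s, and 300 s.
error_samples = [(0, 100), (150, 110), (300, 130)]
errors_per_second = simple_rate(error_samples)
alert_firing = errors_per_second is not None and errors_per_second > 0.05
print(errors_per_second, alert_firing)
```

With these samples the counter grows by 30 over 300 seconds, which is above the 0.05 threshold.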