Distributed Orchestration
Current state
Cosmictron is currently a single-node runtime. All state, execution, and subscriptions live within one binary. This design provides:
- Deterministic event ordering without distributed consensus
- No network latency for in-process reducer calls
- Simple operational model — one process, one storage directory
A single well-provisioned Cosmictron instance handles thousands of concurrent sessions, which is enough for the majority of workloads, including large-scale AI agent deployments.
Cross-node orchestration (shipped)
While the storage engine is single-node, cross-node agent coordination is supported via:
Scoped isolation lifecycle
Each module runs in an isolated scope. Isolation boundaries can be managed programmatically:
    // Pause a module scope (e.g., for maintenance)
    IsolationScope::pause("module-name")?;
    // Resume
    IsolationScope::resume("module-name")?;

Rollout and restore
Controlled rollout of new module versions:
    cosmictron-cli deploy my-agent.wasm --name my-agent --rollout 10%   # 10% of new sessions use the new version
    cosmictron-cli deploy my-agent.wasm --name my-agent --rollout 100%  # Full rollout
    cosmictron-cli rollback my-agent                                    # Restore previous version

External agent coordination
Cosmictron exposes a stable WebSocket + HTTP surface. Multiple Cosmictron instances can coordinate by reading from each other’s HTTP REST endpoints, using the event log as the coordination primitive.
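The event-log pattern described above can be sketched as a cursor that one instance keeps over a peer's log. This is a minimal illustration under stated assumptions, not Cosmictron's actual API: the `Event` shape, its `seq` field, and the `Follower` type are hypothetical, and the HTTP fetch is replaced by in-memory slices so the example stays self-contained.

```rust
/// Hypothetical event shape. The real event log schema is not specified here.
#[derive(Debug, Clone, PartialEq)]
struct Event {
    seq: u64,        // assumed: strictly increasing per source instance
    payload: String, // assumed: opaque event body
}

/// Tracks how far a follower instance has read a peer's event log.
struct Follower {
    next_seq: u64,        // first sequence number not yet applied
    applied: Vec<String>, // stand-in for local state derived from events
}

impl Follower {
    fn new() -> Self {
        Follower { next_seq: 0, applied: Vec::new() }
    }

    /// Applies a fetched batch and returns how many events were applied.
    /// Events below the cursor are skipped, so an overlapping re-fetch is
    /// harmless. Processing stops at the first gap in sequence numbers.
    fn apply_batch(&mut self, batch: &[Event]) -> usize {
        let mut count = 0;
        for ev in batch {
            if ev.seq < self.next_seq {
                continue; // duplicate from an overlapping fetch
            }
            if ev.seq > self.next_seq {
                break; // gap: wait until the missing events arrive
            }
            self.applied.push(ev.payload.clone());
            self.next_seq += 1;
            count += 1;
        }
        count
    }
}

fn main() {
    let log = vec![
        Event { seq: 0, payload: "task-created".into() },
        Event { seq: 1, payload: "task-claimed".into() },
        Event { seq: 2, payload: "task-done".into() },
    ];
    let mut follower = Follower::new();
    follower.apply_batch(&log[0..2]);
    follower.apply_batch(&log[1..3]); // overlapping re-fetch
    println!(
        "applied {} events, cursor at {}",
        follower.applied.len(),
        follower.next_seq
    );
}
```

Keeping the cursor on the reader's side means a follower can retry or re-fetch freely, because applying the same batch twice has no additional effect.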
Multi-region (coming soon)
Multi-region active-active deployment with automatic failover is planned. The design involves:
- Region-local primaries — writes committed locally first
- Async cross-region replication — event log replicated to other regions
- Conflict resolution — CRDT-based for agent state, strict ordering for financial events
There is no committed timeline for multi-region. It will not ship before a stable single-node product is proven in production.
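To make the CRDT-based conflict resolution idea concrete, here is a sketch of one of the simplest CRDTs, a grow-only counter. The planned design does not say which CRDTs will be used, so the choice of a G-Counter, the `GCounter` type, and the region names are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Grow-only counter: one slot per region, merged by element-wise maximum.
#[derive(Debug, Clone, PartialEq, Default)]
struct GCounter {
    counts: BTreeMap<String, u64>,
}

impl GCounter {
    /// Records `n` local increments on behalf of `region`.
    fn increment(&mut self, region: &str, n: u64) {
        *self.counts.entry(region.to_string()).or_insert(0) += n;
    }

    /// Merges state received from another replica. Taking the maximum per
    /// slot makes the merge commutative, associative, and idempotent.
    fn merge(&mut self, other: &GCounter) {
        for (region, &count) in &other.counts {
            let slot = self.counts.entry(region.clone()).or_insert(0);
            *slot = (*slot).max(count);
        }
    }

    /// Total across all regions.
    fn value(&self) -> u64 {
        self.counts.values().sum()
    }
}

fn main() {
    // Hypothetical region names.
    let mut us_east = GCounter::default();
    let mut eu_west = GCounter::default();
    us_east.increment("us-east", 3);
    eu_west.increment("eu-west", 2);

    // Async replication delivers each region's state to the other.
    let snapshot = us_east.clone();
    us_east.merge(&eu_west);
    eu_west.merge(&snapshot);

    assert_eq!(us_east, eu_west); // both replicas converge
    println!("converged value: {}", us_east.value());
}
```

Because the merge does not depend on delivery order or on how many times a state is delivered, it fits asynchronous replication. Data that needs strict ordering, such as the financial events mentioned above, cannot be handled this way.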
Distributed Key Generation (DKG) (coming soon)
The current FROST threshold signing implementation requires key shares to be pre-distributed out-of-band. A built-in Distributed Key Generation (DKG) protocol that generates threshold keys without a trusted dealer is planned.
DKG is a prerequisite for fully trustless multi-party signing in multi-region deployments.
Horizontal scaling today
Until multi-region ships, the options for scaling are:
- Vertical scaling — Cosmictron is single-binary and scales well vertically. A 32-core, 128 GB instance handles very large workloads.
- Module sharding — deploy multiple Cosmictron instances, each owning a shard of sessions by ID prefix, with a lightweight routing layer (nginx, Cloudflare) in front.
- Read replicas — serve read-only PgWire queries from a snapshot-based replica (manual setup today).
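The routing decision behind the module sharding option can be sketched as a prefix lookup. In practice this logic would live in the routing layer's own configuration. The shard table, upstream hostnames, port, and the assumption that session IDs begin with a lowercase hex character are all hypothetical.

```rust
/// Hypothetical shard table entry: one Cosmictron instance and the session ID
/// prefixes it owns.
struct Shard {
    prefixes: &'static [&'static str],
    upstream: &'static str,
}

/// Assumed layout: two instances splitting the hex prefix space in half.
const SHARDS: &[Shard] = &[
    Shard {
        prefixes: &["0", "1", "2", "3", "4", "5", "6", "7"],
        upstream: "http://cosmictron-a.internal:8080",
    },
    Shard {
        prefixes: &["8", "9", "a", "b", "c", "d", "e", "f"],
        upstream: "http://cosmictron-b.internal:8080",
    },
];

/// Returns the upstream that owns `session_id`, or `None` if no shard claims
/// its prefix (for example, an empty or malformed ID).
fn route(session_id: &str) -> Option<&'static str> {
    SHARDS
        .iter()
        .find(|shard| shard.prefixes.iter().any(|p| session_id.starts_with(*p)))
        .map(|shard| shard.upstream)
}

fn main() {
    for id in ["3f9a-session", "c01d-session", "zzz"] {
        match route(id) {
            Some(upstream) => println!("{id} -> {upstream}"),
            None => println!("{id} -> no shard owns this prefix"),
        }
    }
}
```

Routing on a fixed prefix keeps every request for a given session on the same instance, which preserves the single-node ordering guarantees within each shard.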