Horizontal Scaling Architecture
One image. Every jurisdiction. 190+ services per node, 20 validators across 13 compliance zones, zero single points of failure. Every node is byte-for-byte identical - from Zurich to Singapore to Dallas. Add a jurisdiction by deploying the same image.
Current Deployment
DevNet (Hetzner)
A single Hetzner CPX62 instance in Nuremberg (nbg1) runs all 190+ microservices, the portal, and the CI/CD pipeline. The JILHQ fleet controller runs on a dedicated server.
MainNet (Hetzner)
20 validators across 13 jurisdictions on 4 continents. 14-of-20 (70% BFT) consensus with P2P settlement via RedPanda. Each jurisdiction runs the full service stack.
Expansion (Planned)
Planned growth preserves the 14-of-20 (70% BFT) consensus, with each new jurisdiction deploying the same full service stack. No single nation or datacenter can halt the network.
One Image. Every Jurisdiction.
JIL Sovereign deploys the same container stack to every node in every jurisdiction. There is no "primary" server. There is no "secondary" server. Each node runs the full 190+ service stack, its own PostgreSQL instance, its own Redis cache, its own validator, and its own RedPanda broker. A node in Zurich is byte-for-byte identical to a node in Singapore, Abu Dhabi, or Dallas.
This means: spin up a new jurisdiction by deploying the same image. No special configuration. No hand-wiring. The node joins the RedPanda cluster, catches up on events, and starts serving traffic.
Identical Deployment
Same Docker images, same compose file, same config. Region is an environment variable, not a code change.
Session Affinity
Where a request starts is where it finishes. No mid-flight handoffs. No cross-server state lookups during processing.
Event-Driven Sync
RedPanda propagates every state change to every node. PostgreSQL consumes events with source_id tracking. Eventually consistent, always available.
Jurisdiction Autonomy
Each node enforces its local compliance rules. FINMA in Zurich. MAS in Singapore. FCA in London. Compliance is local, consensus is global.
20+ Nodes Across 13 Jurisdictions
Each node is a full-stack deployment running the entire platform. Nodes are grouped by jurisdiction for compliance purposes, but any node can process any request. Cloudflare routes users to the nearest healthy node.
What Runs on Every Node
Each node is a self-contained deployment of the entire platform. Nothing depends on another node being available to process requests. The only cross-node communication is event propagation via RedPanda.
RedPanda: The Global Event Bus
RedPanda is the backbone of cross-node synchronization. Every state change on any node is published as an event to RedPanda. Every other node's PostgreSQL instance consumes those events and applies them locally. The source_id on each event tells every consumer which node originated the change - so a node never re-applies its own writes.
Event Flow: Originating node writes to local PostgreSQL → emits event with source_id → RedPanda replicates across brokers → delivers to all consumers → each consuming node checks source_id ≠ self → applies the state change locally.
Event Schema: Every state change emitted to RedPanda follows an envelope containing a globally unique event_id, the originating source_id (node name), topic, wall-clock timestamp, monotonic sequence number, and the payload with transaction details.
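As a concrete illustration of that envelope, here is a minimal Python sketch. The field names follow the schema described above; the exact wire format, types, and `to_json` helper are assumptions for illustration.

```python
import json
import time
import uuid
from dataclasses import dataclass, field, asdict

@dataclass
class EventEnvelope:
    """Envelope for every state change published to RedPanda.
    Field names follow the schema above; wire format is illustrative."""
    source_id: str    # originating node, e.g. "SG-MAS-01"
    topic: str        # e.g. "jil.events"
    sequence: int     # monotonic per-node sequence number
    payload: dict     # transaction details
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: float = field(default_factory=time.time)  # wall-clock time

    def to_json(self) -> str:
        return json.dumps(asdict(self))

env = EventEnvelope(source_id="SG-MAS-01", topic="jil.events",
                    sequence=42, payload={"kind": "dex.trade", "amount": "100"})
```

The `event_id` gives global uniqueness for deduplication, while the per-node `sequence` lets consumers detect gaps in the stream.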
PostgreSQL as Consumer: Each node's PostgreSQL instance subscribes to RedPanda topics via a dedicated consumer service. When an event arrives, the consumer checks the source_id field: if it matches the local node, the event is skipped (the write already happened locally). Otherwise, the state change is applied. This prevents duplicate writes and ensures every node converges to the same state.
Session Affinity: Start Here, Finish Here
When a user initiates a service request - a DEX trade, a bridge transfer, a settlement - the node that receives the request is the only node that processes it. There is no mid-flight handoff to another server. The originating node executes the full request lifecycle, writes results to its local database, then emits the completed event to RedPanda for all other nodes to consume.
A request that starts on Node A will always complete on Node A. Other nodes learn about the result asynchronously via RedPanda. This eliminates distributed locking, cross-node latency, and split-brain scenarios during request processing.
Session Affinity Flow - DEX Trade Example:
1. User submits DEX order to Node SG-MAS-01 → 2. Validate + match locally → 3. Execute trade in local DB → 4. Return result to user → 5. Emit jil.events to RedPanda → RedPanda propagates event with source_id: SG-MAS-01 → All other nodes consume event, check source_id ≠ self, apply to local PostgreSQL → State converged.
This model means there is zero cross-node coordination during request processing. Latency is purely local. The user gets their response from the nearest node in milliseconds. The rest of the network catches up in the background.
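The producer side of that lifecycle can be sketched as follows. The dict `db` and list `outbox` are stand-ins for local PostgreSQL and the RedPanda producer; the trade logic is reduced to a stub:

```python
def handle_request(order: dict, node_id: str, db: dict, outbox: list) -> dict:
    """Session-affinity lifecycle sketch: the receiving node validates,
    executes, and commits locally, then queues the completed event for
    RedPanda. No other node is consulted during processing."""
    # Steps 1-3: validate, match, and execute entirely on this node.
    trade = {"order_id": order["id"], "status": "filled"}
    db[order["id"]] = trade
    # Step 5: emit the completed event tagged with this node's source_id,
    # so every other node converges asynchronously.
    outbox.append({"source_id": node_id, "topic": "jil.events", "payload": trade})
    # Step 4: the caller gets the local result immediately.
    return trade

local_db, outbox = {}, []
result = handle_request({"id": "ord-7"}, "SG-MAS-01", local_db, outbox)
```

Note the ordering: the local commit happens before the event is emitted, so the emitting node is always at least as up to date as its consumers.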
Why This Architecture Has No Bottlenecks
No Shared Database
Each node has its own PostgreSQL. No connection pool contention. No cross-region latency on reads. Writes are local, replicated via events.
No Distributed Locks
Session affinity means one node owns each request. No distributed mutex, no two-phase commits, no cross-node coordination during processing.
No Central Router
Cloudflare geo-routes to nearest node. No load balancer bottleneck. Each node handles its own traffic independently.
Linear Horizontal Scale
Adding a node adds capacity. 20 nodes handle 20x traffic. 50 nodes handle 50x. No diminishing returns from coordination overhead.
Fault Isolation
If a node goes down, Cloudflare routes traffic to the next nearest node. No cascade failures. The rest of the network is unaffected.
Compliance Locality
Each node enforces its jurisdiction's rules natively. No cross-border data movement during request processing. Regulators audit their local node.
N nodes = N x (single-node throughput). No coordination tax. No leader election. No consensus bottleneck on application requests. Validator consensus is only for ledger finality, not for service requests.
Consensus Layer vs Application Layer
A critical distinction: validator consensus (14-of-20 BFT) is only required for ledger finality - confirming blocks and cross-chain bridge operations. Application-layer requests (DEX trades, wallet operations, compliance checks, document vault) are processed locally with session affinity and propagated via events. This means application throughput scales linearly with nodes, while consensus throughput is governed by the validator quorum.
Consensus Layer
14-of-20 Validator Quorum
Block finalization (1.5s), cross-chain bridge attestation, governance parameter changes, emergency halt/resume.
Bounded by quorum speed
Application Layer
Session-Affinity Per Node
DEX trading, RFQ, AMM, wallet operations, transfers, compliance checks, KYC/AML, document vault, proof layer.
Scales linearly with nodes
Adding a New Node
Deploying a new jurisdiction takes one command. The new node pulls the same container images, connects to the RedPanda cluster, and replays events to build its local state. Once caught up, Cloudflare begins routing traffic to it.
Deploy a new node in Toronto, Canada: Set JIL_NODE_ID=CA-TOR-01, JIL_REGION=ca-toronto, JIL_JURISDICTION=CA/OSC, and REDPANDA_SEEDS to the existing cluster. Run docker compose up. The node connects to RedPanda, replays events, builds local state. PostgreSQL catches up from the event stream with source_id filtering. The validator joins the consensus network via peer discovery. Cloudflare health check passes and traffic begins routing.
Time to operational: minutes, not weeks. The node is identical to every other node. The only unique values are JIL_NODE_ID, JIL_REGION, and JIL_JURISDICTION. Every image is pulled from JILHQ's secure registry after verifying cryptographic signatures.
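A pre-flight check for those per-node settings could look like the sketch below. The variable names come from the example above; the validation helper itself is hypothetical:

```python
import os

# The only settings that differ between nodes (per the example above).
REQUIRED = ("JIL_NODE_ID", "JIL_REGION", "JIL_JURISDICTION", "REDPANDA_SEEDS")

def node_identity(env=os.environ) -> dict:
    """Collect the per-node settings before `docker compose up`.
    Everything else in the stack is identical across jurisdictions."""
    missing = [k for k in REQUIRED if not env.get(k)]
    if missing:
        raise RuntimeError(f"missing required settings: {missing}")
    return {k: env[k] for k in REQUIRED}

cfg = node_identity({
    "JIL_NODE_ID": "CA-TOR-01",
    "JIL_REGION": "ca-toronto",
    "JIL_JURISDICTION": "CA/OSC",
    "REDPANDA_SEEDS": "node-a:9092,node-b:9092",
})
```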
JILHQ: The Fleet Command Center
JILHQ is the central control plane that governs the entire node fleet. It does not process user traffic. It does not participate in consensus. Its sole purpose is to manage, authorize, and govern every node in the network. No node can join the network, pull images, or start services without JILHQ's authorization.
JILHQ manages the fleet. Nodes serve the users. JILHQ never touches user data, transactions, or wallets. It controls who can run the software and what version they run - nothing more.
Image Registry
Signed container images. Devnet, testnet, mainnet tracks. Ed25519 signatures.
Certificate Authority
mTLS certificates for node auth. Revoke = instant lockout.
Fleet Management
Start, stop, upgrade, rollback, deprecate. Rolling upgrades.
Alert Engine
60s evaluation loop. Auto-resolve. HMAC-signed webhooks to Slack/PagerDuty.
Audit Ledger
Append-only log of every fleet action. Full chain of custody.
Fleet Dashboard
Single pane of glass. Node status, alerts, images, security. Zero SSH.
Signed Images & Secure Registry
Every container image deployed to any node must be cryptographically signed by JILHQ before a node will run it. Nodes verify signatures at pull time using the JIL root public key. An unsigned or tampered image is rejected immediately. This ensures that every node in every jurisdiction is running code that JILHQ has explicitly authorized.
Image Signing Pipeline: Developer pushes code to repo, CI/CD builds image, image tagged with commit hash. JILHQ Signer verifies build provenance, signs image with HSM key, stores in private registry. Node pulls image from registry, verifies signature vs root key, runs only if valid.
Image Signature Envelope: Every image in the JILHQ registry carries a manifest containing the image name, SHA256 digest, signing key identifier, Ed25519 signature, timestamp, track (devnet/testnet/mainnet), commit hash, and multi-party approval list.
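The digest check a node performs at pull time can be sketched with the standard library. Only the SHA256 recomputation is shown; verifying the Ed25519 signature over the manifest against the JIL root public key would require a crypto library (e.g. PyNaCl) and is omitted here. The manifest fields follow the envelope above:

```python
import hashlib

def verify_image_digest(image_bytes: bytes, manifest: dict) -> bool:
    """Recompute the pulled image's SHA256 and compare it to the signed
    manifest. A mismatch means the image was tampered with or corrupted.
    (The Ed25519 signature check over the manifest is omitted - stdlib-only.)"""
    digest = "sha256:" + hashlib.sha256(image_bytes).hexdigest()
    return digest == manifest["digest"]

blob = b"container-image-bytes"
manifest = {
    "name": "wallet-api",
    "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
    "track": "mainnet",
}
ok = verify_image_digest(blob, manifest)          # matches
tampered = verify_image_digest(b"evil", manifest) # mismatch
```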
Devnet
Automatic deploy on merge. Unstable, fast iteration. Internal testing only.
Testnet
Requires 1 approval. Load tested, integration tested. Partner & validator preview.
Mainnet
Requires 2 approvals. HSM-signed, audit-logged. Rolling deploy to nodes.
Fleet Management: Start, Stop, Upgrade, Revoke
JILHQ exposes a Fleet Management API that provides full lifecycle control over every node in the network. Every operation is authenticated via mTLS, authorized via RBAC, and logged to the immutable audit ledger.
Start / Stop
Bring a node online or take it offline gracefully. Cloudflare health checks automatically route traffic away from stopped nodes.
Rolling Upgrade
Push a new image version to nodes one-by-one or by jurisdiction. Each node pulls the new signed image, restarts, and rejoins. Zero downtime.
Rollback
Instant rollback to any previous signed image. JILHQ pins the target version and nodes revert on next health cycle. No manual SSH.
Revoke
Permanently revoke a node's certificate. The node is immediately cut off from the registry, RedPanda cluster, and peer network. Nuclear option.
Health Monitoring
Every node reports health, version, and sync status to JILHQ. Dashboard shows fleet-wide view. Alerts on drift, lag, or failures.
Config Push
Push compliance zone configs, fee schedules, and parameter updates to specific nodes or jurisdictions without a full image redeploy.
Fleet Management API examples: GET /v1/fleet/nodes (list all nodes and status), POST /v1/fleet/nodes (start a new node), POST /v1/fleet/nodes/:id/upgrade (upgrade a specific node), POST /v1/fleet/jurisdictions/:zone/upgrade (upgrade all nodes in a jurisdiction), POST /v1/fleet/nodes/:id/stop (emergency stop), DELETE /v1/fleet/nodes/:id (revoke permanently), POST /v1/fleet/jurisdictions/:zone/config (push config update), GET /v1/fleet/audit (view audit log).
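As a sketch of how an operator tool might call one of those endpoints, the snippet below builds (but does not send) an upgrade request with the standard library. The base URL and request body shape are assumptions; in production the call would go over the operator's mTLS channel:

```python
import json
import urllib.request

JILHQ_API = "https://jilhq.example.internal"  # hypothetical base URL

def upgrade_request(node_id: str, image: str) -> urllib.request.Request:
    """Build the upgrade call for one node, following the endpoint list
    above. The request object is constructed but never sent here."""
    body = json.dumps({"image": image}).encode()
    return urllib.request.Request(
        url=f"{JILHQ_API}/v1/fleet/nodes/{node_id}/upgrade",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = upgrade_request("CH-ZUG-01", "wallet-api:v95.3")
```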
Node Authorization & mTLS
No node can participate in the JIL network without authorization from JILHQ. Each node receives a unique mTLS certificate from JILHQ's Certificate Authority. This certificate is required for three things: pulling images from the registry, connecting to the RedPanda cluster, and joining the validator peer network.
Certificate Lifecycle: 1. Request: node sends hardware attestation + jurisdiction + operator identity to JILHQ → 2. Vetting: operator KYC validated, jurisdiction confirmed, hardware meets requirements → 3. Issuance: unique cert with node_id, jurisdiction, and expiration, signed by the JIL CA root key → 4. Usage: image pulls, RedPanda connections, and validator peer-to-peer all require a valid cert → 5. Revocation: JILHQ adds the cert to the CRL; the node cannot pull images, connect to RedPanda, or join peers. Immediate, irreversible.
A node's mTLS certificate is the single credential that unlocks the image registry, the RedPanda cluster, and the validator network. Revoke the cert and the node is completely severed from all three - no partial access possible.
Immutable Audit Ledger
Every action taken by JILHQ is logged to an append-only audit ledger. Who pushed an image, who approved a promotion, who started a node, who revoked a certificate - every operation has a permanent, tamper-evident record. This is the chain of custody for the entire fleet.
Sample audit entries: image.push (ci-pipeline pushes wallet-api:v95.3 to devnet), image.promote (ops-lead promotes from devnet to testnet, 1 approval), image.promote (security-lead promotes from testnet to mainnet, 2 approvals), node.upgrade (ops-lead upgrades CH-ZUG-01 from v95.2 to v95.3 with rolling strategy), cert.revoke (security-lead revokes ROGUE-01 for unauthorized modification).
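One common way to make such a ledger tamper-evident is hash chaining, where each entry commits to the hash of the previous one. The sketch below shows the idea; it is illustrative, not JILHQ's actual storage format:

```python
import hashlib
import json

class AuditLedger:
    """Append-only, tamper-evident log sketch: each entry commits to the
    previous entry's hash, so rewriting history breaks the chain."""

    def __init__(self):
        self.entries = []

    def append(self, action: str, actor: str, detail: str) -> str:
        prev = self.entries[-1]["hash"] if self.entries else "genesis"
        record = {"action": action, "actor": actor, "detail": detail, "prev": prev}
        # Hash is computed over the record *before* the hash field is added.
        record["hash"] = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()).hexdigest()
        self.entries.append(record)
        return record["hash"]

    def verify(self) -> bool:
        prev = "genesis"
        for e in self.entries:
            body = {k: e[k] for k in ("action", "actor", "detail", "prev")}
            expected = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True

ledger = AuditLedger()
ledger.append("image.push", "ci-pipeline", "wallet-api:v95.3 -> devnet")
ledger.append("cert.revoke", "security-lead", "ROGUE-01")
```

Editing any historical entry invalidates every hash that follows it, which is what makes the chain of custody verifiable after the fact.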
Fleet Operations Dashboard
The Fleet Operations Dashboard is a dedicated web interface that provides a single pane of glass over the entire JIL node fleet. It connects exclusively to JILHQ's APIs and presents real-time status across all 20+ nodes, 13 jurisdictions, and 190+ services per node. Operations staff never need to SSH into individual nodes - everything is managed from this dashboard.
Fleet Overview
Real-time status map of all nodes across 13 jurisdictions. Node count, healthy/degraded/offline breakdown, fleet-wide version distribution, and sync lag at a glance.
Node Detail
Drill into any node to see container-level health, CPU/memory/disk usage, running image versions, certificate expiry, and last heartbeat. Start, stop, upgrade, or deprecate from one screen.
Image Registry
Full image registry browser showing all container images across devnet/testnet/mainnet tracks. Sign images, promote across tracks, view signature chains, and trigger fleet-wide upgrades.
Alert Management
View all active alerts, acknowledge incidents, configure alert rules and thresholds. Webhook integration for Slack, PagerDuty, or any endpoint with HMAC-signed delivery.
Security View
Certificate expiry grid across all nodes, unsigned image detection, mTLS status, and security audit results aggregated from every node's automated security scans.
Broadcast Commands
Send fleet-wide commands: rolling upgrades, restart services, enter maintenance mode, or emergency stop. Target all nodes, a jurisdiction, or specific nodes.
The Fleet Dashboard communicates only with JILHQ APIs. No direct node access is ever needed. This means operations staff can manage a 20-node global fleet from a single browser tab with full audit logging of every action.
Alert System & Webhook Delivery
JILHQ runs a continuous background alert evaluation loop every 60 seconds. It checks fleet-wide conditions against configurable rules and fires alerts when thresholds are breached. Alerts are automatically resolved when conditions clear. Every alert is stored in PostgreSQL and optionally delivered to external systems via HMAC-SHA256 signed webhooks.
Alert Evaluation Pipeline: Node heartbeats (every 60s each node posts health data to JILHQ) → JILHQ Evaluator (background loop checks all rules against fleet state every 60s) → Alert Fired (stored in PostgreSQL, shown in dashboard, webhook delivered) → Webhook (HMAC-SHA256 signed, Slack, PagerDuty, 3 retries with backoff).
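The HMAC-SHA256 signing step for webhook delivery can be sketched with the standard library. The header name and payload shape are illustrative; the essential parts are signing the raw body and comparing in constant time on the receiving side:

```python
import hashlib
import hmac
import json

def sign_webhook(secret: bytes, payload: dict):
    """Produce the body and hex HMAC-SHA256 signature for a delivery.
    The signature would travel in a header alongside the body."""
    body = json.dumps(payload, sort_keys=True).encode()
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return body, signature

def verify_webhook(secret: bytes, body: bytes, signature: str) -> bool:
    """Receiver side: recompute the HMAC over the raw body and compare
    in constant time to resist timing attacks."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

secret = b"shared-webhook-secret"
body, sig = sign_webhook(secret, {"alert": "heartbeat_stale", "node": "US-DAL-01"})
```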
Heartbeat Stale
A node hasn't reported a heartbeat in 5+ minutes. Indicates node is offline, frozen, or network-partitioned. Auto-resolves when heartbeats resume.
Certificate Expiring
A node's mTLS certificate will expire within 7 days (critical) or 30 days (warning). Prevents surprise lockouts when certs expire.
Image Unsigned
A container image in the registry has no valid Ed25519 signature. No node will pull unsigned images, but this alerts operators to signing gaps.
Service Down
A node's health check reports one or more critical services as unhealthy. Triggers investigation before the node is automatically routed around.
Disk Critical
A node's disk usage exceeds 90%. At this threshold, services may fail to write logs or state. Daily maintenance scripts handle cleanup, but this catches edge cases.
Auto-Resolve
When a condition clears (heartbeat resumes, cert renewed, disk freed), the alert is automatically resolved. No manual acknowledgment needed for transient issues.
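Two of the rules above - stale heartbeats (5+ minutes) and certificate expiry (7-day critical / 30-day warning) - can be sketched as a single evaluation pass. The fleet-state shape and function names are illustrative:

```python
def evaluate_alerts(fleet: list, now: float) -> list:
    """One pass of the 60-second evaluation loop. Returning no tuple for
    a node is what lets a previously fired alert auto-resolve.
    Timestamps are in seconds."""
    DAY = 86400.0
    alerts = []
    for node in fleet:
        if now - node["last_heartbeat"] > 5 * 60:
            alerts.append((node["id"], "heartbeat_stale", "critical"))
        days_left = (node["cert_expiry"] - now) / DAY
        if days_left <= 7:
            alerts.append((node["id"], "cert_expiring", "critical"))
        elif days_left <= 30:
            alerts.append((node["id"], "cert_expiring", "warning"))
    return alerts

now = 1_000_000.0
fleet = [
    {"id": "CH-ZUG-01", "last_heartbeat": now - 30,  "cert_expiry": now + 90 * 86400},
    {"id": "US-DAL-01", "last_heartbeat": now - 600, "cert_expiry": now + 10 * 86400},
]
fired = evaluate_alerts(fleet, now)
```

Running the same pure evaluation every cycle means no per-alert state machine is needed: an alert exists exactly as long as its condition holds.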
Self-Maintenance Agents
Every node in the fleet runs four automated maintenance agents via cron. These agents handle health monitoring, daily cleanup, security auditing, and image updates - without any human intervention. Each agent reports its findings back to JILHQ for centralized visibility and alerting.
Health Check Agent
Every 60 Seconds - Checks all container health states, auto-restarts unhealthy containers, reports CPU/memory/disk to JILHQ, posts heartbeat with container counts.
Maintenance Agent
Daily at 2:00 AM UTC - Docker image & volume prune, log rotation (100 MB threshold), disk usage check (85% warning), temp file cleanup (7-day max).
Security Audit Agent
Every 6 Hours - Verifies image signatures vs JILHQ, checks mTLS certificate expiry, scans for root-running containers, validates firewall & SSH config, detects unexpected open ports.
Image Update Agent
Every 4 Hours - Queries JILHQ for latest images, compares running digests vs registry, pulls & verifies new signed images, rolling restart per container, reports update results to audit log.
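The disk check run by the maintenance agent can be sketched against the thresholds stated above (85% warning, 90% critical from the Disk Critical alert). The function name is illustrative, and the cleanup actions themselves are out of scope:

```python
import shutil

def disk_alert_level(pct_used: float) -> str:
    """Classify disk usage against the agent's thresholds:
    85% -> warning, 90% -> critical."""
    if pct_used >= 90.0:
        return "critical"
    if pct_used >= 85.0:
        return "warning"
    return "ok"

# The agent would feed in the live figure, e.g.:
usage = shutil.disk_usage("/")
level = disk_alert_level(100.0 * usage.used / usage.total)
```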
One-Command Node Setup: New nodes are bootstrapped with a single command (install-cron.sh) that installs the configuration file, creates log directories, and registers all four cron agents. The installer is idempotent - safe to run multiple times on the same node.
With these four agents running on every node, the fleet is largely self-maintaining. Unhealthy containers are auto-restarted, disk space is reclaimed nightly, security posture is audited 4 times a day, and image updates are applied automatically. Human intervention is only needed for escalations that reach the JILHQ alert system.
Full System Architecture
Fleet Operations Dashboard (Overview, Nodes, Images, Alerts, Security) communicates via API calls to JILHQ - Control Plane (Registry, CA, Fleet API, Alerts, Audit, Signer), which pushes signed images, mTLS certs, and alerts down to nodes. Cloudflare CDN geo-routes traffic to the nearest healthy node.
Node fleet: CH-ZUG, AE-ADGM, SG-MAS, US-DAL, and 16 more nodes - each running the full stack with 190+ services, PostgreSQL, Validator, RedPanda, and 4 self-maintenance agents. All nodes are connected peer-to-peer via the RedPanda Event Mesh (source_id tagging, PostgreSQL consumers, eventually consistent).
At the bottom layer: 14-of-20 Validator Quorum (CometBFT) handles block finality & bridge attestation with 1.5s block time, 70% BFT threshold, across 13 jurisdictions.
20 nodes. 13 jurisdictions. 190+ services. Zero single points of failure.
JIL Sovereign is the only settlement platform built for true horizontal scaling - identical deployments across every jurisdiction, session-affinity request handling, and RedPanda-powered event synchronization.