SentinelAI Fleet Inspector
Autonomous validator monitoring, threat scoring, and quorum-protected auto-remediation for institutional settlement infrastructure.
Autonomous monitoring that bridges the gap between manual operations and unconstrained automation.
SentinelAI is JIL Sovereign's always-on fleet guardian - an autonomous monitoring and remediation engine built into JILHQ that continuously watches every validator in the network. Every 60 seconds, SentinelAI evaluates 20 configurable rules across four categories, computes per-node threat scores, and auto-executes safe remediation while escalating high-risk actions for human approval.
SentinelAI bridges the gap between fully manual operations (too slow for production) and unconstrained automation (too dangerous for consensus networks). It enforces quorum-aware, rate-limited auto-remediation with human approval gates for high-impact actions - guaranteeing that automated fixes never compromise network liveness.
Inspection Pipeline
Heartbeat
Agents collect 5 metric sources every 60s
Evaluate
20 rules checked against latest metrics
Score
Weighted threat score per node (0-100)
Decide
Auto-execute, escalate, or observe
Act
HMAC-signed command to validator agent
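The Act step can be illustrated with a minimal signing sketch. The source says only that commands are HMAC-signed; the hash function (SHA-256), the envelope fields, and the function names below are assumptions made for illustration, not the actual JILHQ wire format.

```python
import hashlib
import hmac
import json


def _canonical(payload: dict) -> bytes:
    # Sorted keys and fixed separators so signer and verifier hash identical bytes.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def sign_command(secret: bytes, node_id: str, action: str, issued_at: int) -> dict:
    """Build a command envelope with an HMAC-SHA256 signature (hypothetical format)."""
    payload = {"node_id": node_id, "action": action, "issued_at": issued_at}
    signature = hmac.new(secret, _canonical(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_command(secret: bytes, envelope: dict) -> bool:
    """Recompute the signature on the agent side and compare in constant time."""
    expected = hmac.new(secret, _canonical(envelope["payload"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["signature"])
```

An agent that verifies this way rejects any command whose payload was altered in transit or signed with a different key.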
Nine capabilities. One autonomous guardian.
Every feature is designed to maximize fleet uptime while mathematically guaranteeing that automated actions never compromise consensus.
20-Rule Threat Engine
Security, performance, availability, and fleet health rules evaluated every 60 seconds with configurable thresholds and per-rule cooldowns.
Real-Time Threat Scoring
Weighted threat scores (0-100) per node with health inversion, trend detection (spike/rising/falling/stable), and fleet-wide aggregation.
Quorum Protection
Every auto-action is gated: if executing would drop healthy nodes below consensus threshold, the action is blocked and escalated to operators.
Auto-Remediation
Safe actions (refresh, cycle, pause) execute automatically. High-risk actions (reboot, go_offline) require human approval. Rate-limited to 5/hr fleet-wide.
Observation Windows
Non-emergency rules require 3 consecutive triggering cycles (~3 min) before firing. Prevents transient spikes from causing unnecessary remediation.
Enhanced Heartbeats
5 metric sources per node (RedPanda, settlement, system, consensus, security) with 3s fail-open timeouts. ~2-5KB per heartbeat, transmitted via Kafka.
Per-Node Drill-Down
API endpoints expose per-node metrics, contributing rules, score history, and active recommendations for operator dashboards.
Security Emergency Override
SEC_DIGEST_MISMATCH (image tampering) fires immediately, bypasses observation window, and overrides quorum protection. Compromised nodes are paused instantly.
Fleet-Level Analysis
Zone settlement imbalance detection, version drift tracking, and fleet-wide pattern correlation across all compliance zones.
Four rule categories. Twenty rules. Weighted threat scoring with quorum protection.
Rule Categories
| Category | Rules | Threat Points | Auto-Actions |
|---|---|---|---|
| Security (6) | Digest mismatch, config drift, unauthorized access, stale images, key expiry, peer drop | 10 - 25 | pause, refresh |
| Performance (6) | Settlement lag, settlement errors, slow processing, retry depth, consensus behind, throughput drop | 8 - 15 | cycle |
| Availability (5) | Container down, disk critical, memory high, RedPanda bad, heartbeat gone | 15 - 20 | cycle |
| Fleet (3) | Version drift, settlement stopped, zone imbalance | 5 - 12 | refresh |
Threat Scoring Model
| Metric | Formula | Range |
|---|---|---|
| Threat Score | SUM(rule.points * confidence / 100), clamped 0-100 | 0 - 100 |
| Health Score | max(0, 100 - threat * 1.2) | 0 - 100 |
| Risk Level | critical (>=70), high (>=40), medium (>=15), low (<15) | 4 levels |
| Trend | spike (delta >20), rising (>5), falling (<-5), stable | 4 states |
| Fleet Health | AVG(node health scores) | 0 - 100 |
| Fleet Threat | MAX(node threat scores) | 0 - 100 |
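The formulas in the table can be sketched in a few lines of Python. The thresholds and the 1.2 health multiplier come from the table; the class and function names are illustrative, and the trend delta is assumed to be the change from the previous cycle's score, which the source does not spell out.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    rule_id: str
    points: float      # rule weight (threat points)
    confidence: float  # detection confidence, 0-100


def threat_score(detections) -> float:
    # SUM(rule.points * confidence / 100), clamped to 0-100.
    raw = sum(d.points * d.confidence / 100 for d in detections)
    return max(0.0, min(100.0, raw))


def health_score(threat: float) -> float:
    return max(0.0, 100.0 - threat * 1.2)


def risk_level(threat: float) -> str:
    if threat >= 70:
        return "critical"
    if threat >= 40:
        return "high"
    if threat >= 15:
        return "medium"
    return "low"


def trend(current: float, previous: float) -> str:
    # Assumption: delta is measured against the previous inspection cycle.
    delta = current - previous
    if delta > 20:
        return "spike"
    if delta > 5:
        return "rising"
    if delta < -5:
        return "falling"
    return "stable"


def fleet_summary(node_threats: dict) -> dict:
    # Fleet health is the average node health; fleet threat is the worst node.
    health = [health_score(t) for t in node_threats.values()]
    return {
        "fleet_health": sum(health) / len(health),
        "fleet_threat": max(node_threats.values()),
    }
```

As a worked case with hypothetical weights, a node with a 25-point detection at 100% confidence and a 15-point detection at 80% confidence scores 25 + 12 = 37, which maps to a health score of about 55.6 and a risk level of medium.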
Quorum Protection Gate
Before any auto-action: if healthy_count <= max(7, ceil(total_validators * 0.7)) AND rule != SEC_DIGEST_MISMATCH, the action is BLOCKED and escalated to a human operator. Otherwise, the action is executed with a full audit trail.
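A minimal sketch of this gate follows. The condition is taken directly from the rule above; the function names and return strings are illustrative. The ceiling is computed with integer arithmetic to avoid floating-point rounding on `total * 0.7`, which is a choice made for this sketch.

```python
EMERGENCY_RULE = "SEC_DIGEST_MISMATCH"


def quorum_threshold(total_validators: int) -> int:
    # ceil(total * 0.7) via integer math: ceil(7 * total / 10).
    return max(7, (total_validators * 7 + 9) // 10)


def gate_action(rule_id: str, healthy_count: int, total_validators: int) -> str:
    """Return 'blocked_escalate' or 'execute' for a proposed auto-action."""
    at_or_below = healthy_count <= quorum_threshold(total_validators)
    if at_or_below and rule_id != EMERGENCY_RULE:
        return "blocked_escalate"
    return "execute"
```

For a 20-validator fleet the threshold is 14, so an ordinary rule is blocked when 14 or fewer validators are healthy and allowed at 15, while SEC_DIGEST_MISMATCH passes the gate regardless.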
Rate Limiting
Fleet-wide: maximum 5 auto-actions per hour. Per-node: maximum 2 auto-actions per hour. Per-rule cooldown: configurable (default 30 minutes). Observation window: 3 consecutive triggering cycles before non-emergency rules fire.
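These limits can be sketched as a sliding-window limiter. The caps and the default cooldown come from the text above; the class name, the one-hour sliding window, and the choice to track cooldowns per (node, rule) pair are assumptions, since the source does not say whether a cooldown is scoped per node or fleet-wide.

```python
from collections import deque

HOUR_S = 3600.0


class ActionLimiter:
    def __init__(self, fleet_max=5, node_max=2, cooldown_s=30 * 60):
        self.fleet_max = fleet_max
        self.node_max = node_max
        self.cooldown_s = cooldown_s
        self._fleet = deque()    # timestamps of all auto-actions
        self._per_node = {}      # node_id -> deque of timestamps
        self._last_fired = {}    # (node_id, rule_id) -> last timestamp

    @staticmethod
    def _prune(q: deque, now: float) -> None:
        # Drop entries that are an hour old or older.
        while q and now - q[0] >= HOUR_S:
            q.popleft()

    def allow(self, node_id: str, rule_id: str, now: float) -> bool:
        """Record and permit the action only if every limit has headroom."""
        node_q = self._per_node.setdefault(node_id, deque())
        self._prune(self._fleet, now)
        self._prune(node_q, now)
        last = self._last_fired.get((node_id, rule_id))
        if last is not None and now - last < self.cooldown_s:
            return False
        if len(self._fleet) >= self.fleet_max or len(node_q) >= self.node_max:
            return False
        self._fleet.append(now)
        node_q.append(now)
        self._last_fired[(node_id, rule_id)] = now
        return True
```

Denied actions are not recorded, so a blocked request does not extend a cooldown or consume quota.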
70-80% cost reduction. 15-60x faster incident response.
Without SentinelAI
24/7 NOC team required (3 shifts x 2 operators = 6 FTEs). Estimated cost: $600K-$900K/yr in staffing alone. MTTD: 5-30 minutes. MTTR: 30-120 minutes. Human error risk during 3 AM incident response.
With SentinelAI
Single on-call engineer for escalations only. Estimated cost: $150K-$200K/yr (1 senior SRE + pager). MTTD: at most 60 seconds (one inspection cycle). MTTR: <2 minutes for auto-actionable issues. Consistent, auditable response with zero fatigue.
SentinelAI reduces fleet operations cost by 70-80% while improving incident response time by 15-60x. For a 20-validator mainnet, the annual savings exceed $450K compared to a traditional NOC model.
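The headline figures follow from the ranges quoted above. The short calculation below pairs the low estimates with each other and the high estimates with each other, which is an assumption of this sketch; the MTTR comparison uses 2 minutes as the upper bound of "<2 minutes".

```python
noc_cost = (600_000, 900_000)        # $/yr without SentinelAI (low, high)
sentinel_cost = (150_000, 200_000)   # $/yr with SentinelAI (low, high)
mttr_before_min = (30, 120)          # minutes, manual response
mttr_after_min = 2                   # minutes, upper bound for auto-actionable issues

# Annual savings and fractional cost reduction for each like-for-like pair.
savings = [before - after for before, after in zip(noc_cost, sentinel_cost)]
reductions = [(before - after) / before for before, after in zip(noc_cost, sentinel_cost)]

# Speed-up in time to repair.
speedups = [before / mttr_after_min for before in mttr_before_min]

print(savings)                                # [450000, 700000]
print([round(r * 100) for r in reductions])   # [75, 78]
print(speedups)                               # [15.0, 60.0]
```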
No competing network has quorum-aware autonomous remediation.
SentinelAI provides capabilities that no other blockchain network offers - from full-stack monitoring to quorum-constrained auto-remediation.
| Competitor | Monitoring | Auto-Fix | Quorum-Aware | JIL Advantage |
|---|---|---|---|---|
| Bitcoin | Hashrate / mempool | None | No | Block-STM parallel execution vs sequential processing; real-time validator health, not just PoW metrics |
| XRP Ledger | UNL voting | None | No | 20 rules across four categories, beyond simple UNL trust; ZK bridge proofs (Groth16) vs basic multi-sig |
| Binance (BNB Chain) | Validator health / slash logs | None | No | Autonomous remediation driven by 20 rules across four categories; TLA+ formal verification with runtime checking |
| Cosmos Hub | Block signing stats | None | No | Remediates before slashing; ZK bridge proofs (Groth16) vs simple multi-sig bridges |
| Ethereum (SSV) | Cluster health | Cluster rotation | Partial | Block-STM parallel execution vs sequential EVM; TLA+ formal verification with runtime invariant checking |
| Solana (Jito) | MEV metrics | None | No | Full stack: infra + consensus + settlement; TLA+ formal verification vs unverified runtime |
| Polkadot | Telemetry | None | No | Closes the loop with automated remediation; ZK bridge proofs (Groth16) vs relay chain verification |
| AWS/K8s | CloudWatch/probes | Scale/restart | No | Understands BFT consensus constraints; Block-STM parallel execution + formal verification built-in |
SLA enforcement. Audit trails. Insurance underwriting. Regulatory readiness.
SLA Enforcement
SentinelAI enables contractual uptime SLAs (99.9%+) by guaranteeing sub-2-minute remediation for common failure modes.
Audit Trail
Every detection, decision, and action is recorded in the database with timestamps, rule IDs, confidence scores, and execution results - satisfying compliance requirements for institutional custodians.
Insurance Underwriting
Autonomous monitoring with provable quorum protection reduces operational risk, enabling more favorable insurance underwriting terms.
Regulatory Readiness
Per-jurisdiction zone monitoring (13 compliance zones) demonstrates proactive supervisory controls to regulators and standard-setters including BaFin, FINMA, MAS, FinCEN, FCA, JFSA, FSRA, ESMA, CVM, and FATF.
Revenue Impact: Higher settlement throughput from faster issue detection prevents settlement queue backups, maintaining the 3-5 bps fee revenue stream. Institutional clients require demonstrable operational controls - SentinelAI is a differentiating feature in competitive evaluations. Autonomous fleet management justifies premium tier pricing for institutional custody clients.
Five architectural innovations that separate SentinelAI from generic monitoring.
Quorum-Constrained Auto-Remediation
Unlike Kubernetes or AWS ASG, SentinelAI blocks any automated action that would reduce healthy validators below the BFT consensus threshold: max(7, ceil(total * 0.7)). The only exception is the SEC_DIGEST_MISMATCH security emergency override. The gate is enforced at the code level, not as advisory guidance.
Multi-Dimensional Threat Scoring
Each node receives a composite threat score from up to 20 independent rules, each contributing weighted points scaled by detection confidence. A node can have multiple low-confidence detections that collectively indicate a problem.
Observation Windows with Emergency Override
The 3-cycle observation window prevents transient spikes from triggering remediation, reducing false positives by an estimated 60-80%. SEC_DIGEST_MISMATCH bypasses this safeguard and fires immediately.
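One way to picture the window is a per-node, per-rule streak counter that resets whenever a cycle does not trigger. This is a sketch under that assumption; the class and method names are hypothetical, and the real engine may track state differently.

```python
from collections import defaultdict

EMERGENCY_RULES = {"SEC_DIGEST_MISMATCH"}


class ObservationWindow:
    def __init__(self, window: int = 3):
        self.window = window
        # (node_id, rule_id) -> number of consecutive triggering cycles
        self._streak = defaultdict(int)

    def observe(self, node_id: str, rule_id: str, triggered: bool) -> bool:
        """Return True when the rule should fire on this cycle."""
        key = (node_id, rule_id)
        if not triggered:
            self._streak[key] = 0   # any clean cycle resets the streak
            return False
        if rule_id in EMERGENCY_RULES:
            return True             # emergency rules skip the window
        self._streak[key] += 1
        return self._streak[key] >= self.window
```

With a 60-second cycle, a non-emergency rule fires on its third consecutive triggering cycle, and a single clean cycle in between restarts the count. Repeat firing on later cycles would be held back by the per-rule cooldown rather than by the window itself.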
Fail-Open Metric Collection
Each of the 5 metric sources collects independently with a 3-second timeout. If one source fails, the remaining 4 still report - ensuring inspector visibility even during partial node failures.
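A fail-open collector of this kind can be sketched with `asyncio`, running every source concurrently under its own timeout. The function names and the use of `None` to mark a missing source are assumptions for illustration; only the independent collection and the 3-second timeout come from the text.

```python
import asyncio


async def collect_one(name, collector, timeout_s=3.0):
    """Run one metric collector; report None instead of failing the heartbeat."""
    try:
        return name, await asyncio.wait_for(collector(), timeout=timeout_s)
    except Exception:
        # Fail open: a timeout or an error in one source must not block the others.
        return name, None


async def build_heartbeat(collectors: dict, timeout_s: float = 3.0) -> dict:
    """Collect all sources concurrently and return {source_name: metrics_or_None}."""
    results = await asyncio.gather(
        *(collect_one(name, fn, timeout_s) for name, fn in collectors.items())
    )
    return dict(results)
```

Because the sources run concurrently, the whole heartbeat takes roughly one timeout period in the worst case rather than the sum of all five.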
Fleet-Level Pattern Detection
Beyond per-node evaluation, SentinelAI evaluates fleet-level rules that compare metrics across nodes. Zone settlement imbalance detection identifies geographic outages that per-node rules would miss.
Deeply integrated with JILHQ, validator agents, and operator dashboards.
JILHQ (Port 8054)
SentinelAI runs within JILHQ, sharing authentication, database, and fleet control infrastructure. 7 API endpoints expose inspector status, per-node details, recommendations, and rule configuration.
Validator Update Agent (v4.0.0)
Enhanced heartbeat protocol collects 5 metric categories every 60s. Real digest verification, actual image pull timestamps, and HMAC failure tracking feed accurate data to the inspector.
Ops Dashboard
Four dashboard tiles (Services, Infrastructure, RedPanda, Alerts) consume inspector data. Per-validator breakdown tables show real-time fleet health at a glance.
Settlement Consumer
Per-zone settlement metrics (consumed, processed, failed, retry depth, avg processing time) feed 4 performance rules. Zone-level throughput tracking enables cross-zone health comparison.
A system for autonomous monitoring and remediation of a distributed blockchain validator fleet, comprising: a configurable rule engine evaluating a plurality of rules across security, performance, availability, and fleet health categories on a periodic inspection cycle; a threat scoring model computing per-node threat scores as the weighted sum of triggered rule points scaled by confidence, and deriving health scores, risk levels, and trend classifications from the threat scores; a quorum protection mechanism that prevents any automated remediation action from reducing the number of healthy validators below the greater of a fixed minimum or a percentage ceiling of total validators; rate limiting of automated actions at both the fleet level and per-node level with per-rule cooldown periods; an observation window requiring multiple consecutive triggering cycles before firing non-emergency recommendations; and a tiered auto-action policy wherein low-risk remediation commands are auto-executed while high-impact commands require human approval, with a designated security exception rule that overrides quorum protection for critical image integrity violations.
20 rules. 60-second cycles. Zero consensus compromise.
SentinelAI is live on MainNet - continuously protecting 20+ validators across 13 jurisdictions with autonomous threat detection, quorum-aware remediation, and full audit trails.