Three stages of agentic AI over the JIL detection layer.
JIL operates a three-stage analytical pipeline above JIL L1, the Tier 1 detection layer. AVA organizes Tier 1 findings into customer-readable form on Amazon Bedrock managed inference. AVA Pro is the agentic litigation-preparation layer: multi-agent orchestration that combines Bedrock Agents with self-hosted SageMaker LoRA-adapter inference for the specialized roles. AVA Pro+ is the final lock-down phase that produces filing-ready CREB(TM) bundles for customer counsel or referral agencies. Federal and CMS workloads run in AWS GovCloud (US-West) for FedRAMP High coverage; commercial MCO workloads run in AWS commercial regions under FedRAMP Moderate. All three stages process Protected Health Information under the AWS Business Associate Addendum. Every architectural decision flows from that fact.
What we mean by Agentic AI.
"Agentic AI" is one of the most overloaded terms in 2026. Vendors use it for everything from chatbots to autoscaling scripts. Here is what it means inside JIL Sovereign, precisely.
Agents are role-specialized AI workers that act in sequence over a goal, each producing a structured artifact the next consumes, with every step auditable.
Three properties, all required:
- Role-specialized. Not one mega-prompt. Each agent has a single narrow job (find evidence; map it to a statute; red-team the case; quantify damages; draft the memo; audit the chain). The prompt is locked + hashed; the output schema is rigid.
- Sequential, coordinated. Agents run in a fixed canonical order. Each agent's output becomes the next agent's input. The orchestrator is plain code, not another LLM - no LLM-coordinating-LLMs hand-waving. Failure or low confidence on any agent halts the pipeline.
- Auditable. Every agent invocation produces a SHA-256-hashed reasoning chain, anchored to JIL's CourtChain audit ledger before the next agent starts. The whole pipeline is bit-identically replayable two years later.
If a system is missing any of these three, it's not what JIL means by Agentic AI - it's an LLM with a label.
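The anchoring discipline behind the third property can be sketched in a few lines of Python. This is an illustrative sketch, not JIL's actual API: `chain_hash`, `anchor_pipeline`, and the `GENESIS` sentinel are hypothetical names; only the SHA-256 hash-chaining pattern itself comes from the text.

```python
import hashlib
import json

def chain_hash(prev_hash: str, agent_role: str, output: dict) -> str:
    """Hash an agent's structured output together with the prior link,
    so tampering with any step breaks every downstream hash."""
    payload = json.dumps(
        {"prev": prev_hash, "role": agent_role, "output": output},
        sort_keys=True,  # canonical serialization -> deterministic hash
    )
    return hashlib.sha256(payload.encode()).hexdigest()

def anchor_pipeline(results: list[tuple[str, dict]]) -> list[str]:
    """Anchor each agent result in order; each link covers all prior links."""
    chain, prev = [], "GENESIS"
    for role, output in results:
        prev = chain_hash(prev, role, output)
        chain.append(prev)
    return chain
```

Because serialization is canonical, re-running the same inputs reproduces the same chain — which is the property the two-year replay claim relies on.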
Agentic AI is not a chatbot, not RAG, not "AI tools".
To draw the line cleanly:
- Not a chatbot. No human in the loop sending free-form messages mid-pipeline. Each agent runs once per case, takes a structured input, returns a structured output, and exits.
- Not just RAG. Retrieval-augmented generation is one technique an agent uses (the Theorist retrieves controlling case law from the legal corpus). But "RAG" alone - one LLM + one vector store - is not agentic. Agentic adds role specialization + cross-checking + audit anchoring on top.
- Not "AI tools" or "AI features". Agentic means the AI drives the workflow - it decides which legal theories to propose, which defenses to anticipate, which damages methodology to recommend. A human counsel reviews; the AI doesn't wait for instructions at each step.
- Not autonomous. The Auditor agent is a hard gate. If it flags a "blocker" gap, the bundle does not seal. A human reviewer must clear the gap before AVA Pro+ produces the filing-ready CREB.
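The "not autonomous" point reduces to a hard go/no-go check. A minimal sketch, assuming a three-level severity scale (`minor` / `moderate` / `blocker`) that is illustrative rather than JIL's actual taxonomy:

```python
SEVERITY_RANK = {"minor": 0, "moderate": 1, "blocker": 2}  # illustrative scale

def auditor_gate(gap_severities: list[str]) -> bool:
    """Hard gate: the bundle may seal only if no flagged gap rises above
    'minor'. A 'blocker' (or anything above minor) holds the bundle for
    human review before AVA Pro+ produces the filing-ready CREB."""
    return all(SEVERITY_RANK[s] <= SEVERITY_RANK["minor"] for s in gap_severities)
```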
Models JIL uses today (Phase 2, live)
| Agent role | Model | Provider / hosting | Why this model | Stage status |
|---|---|---|---|---|
| Investigator | Anthropic Claude Sonnet 4.6 | Amazon Bedrock us.anthropic.claude-sonnet-4-6 | Heavyweight reasoning over claim sweep + pattern recognition; strong at structured extraction across long inputs | LIVE on Bedrock |
| Theorist | Anthropic Claude Sonnet 4.6 | Amazon Bedrock us.anthropic.claude-sonnet-4-6 | Statutory-element decomposition, controlling-authority citation, no-hallucination reasoning | LIVE on Bedrock |
| Adversary | Llama 4 / DeepSeek R1 class with LoRA | Amazon SageMaker (G6 / P5 / P4 GPU) + LoRA adapter per case theory (FCA, AKS / Stark, ERISA, RICO, MFCU) | Red-team / defense anticipation needs adapter-driven specialization beyond what general-purpose models can offer | Phase 3 (stub today) |
| Damages Quantifier | Llama 4 / DeepSeek R1 class with LoRA + SageMaker XGBoost | Amazon SageMaker GPU + SageMaker built-in XGBoost | Multi-methodology damages calculations need both reasoning (LLM) and quantitative anomaly detection (XGBoost) | Phase 3 (stub today) |
| Drafter | Anthropic Claude Haiku 4.5 | Amazon Bedrock us.anthropic.claude-haiku-4-5-20251001-v1:0 | Long-form generation (memo + CREB structure); roughly 3x cheaper per output token than Sonnet | LIVE on Bedrock |
| Auditor | Anthropic Claude Haiku 4.5 | Amazon Bedrock us.anthropic.claude-haiku-4-5-20251001-v1:0 | Structured gap-finding + chain-integrity walk; cost-efficient for the audit pass | LIVE on Bedrock |
Supporting AWS services (Phase 2/3)
Cross-region inference profiles
Bedrock invocations route through the us.* cross-region inference profile, distributing capacity across us-east-1, us-east-2, and us-west-2. If one region throttles, the next call lands elsewhere without code change. PHI residency stays US-only by published region scope + IAM Service Control Policy.
Bedrock Knowledge Bases + OpenSearch Serverless
Theorist and Auditor agents pull controlling case law from a managed RAG store backed by Amazon Bedrock Knowledge Bases over Amazon OpenSearch Serverless, with embeddings from Amazon Titan Text Embeddings v2. Corpus: Harvard CAP, CourtListener federal opinions, PACER bulk filings, CMS regulations, HHS OIG advisory opinions, DOJ FCA settlement library, state MFCU precedent.
Step Functions, EventBridge, SQS, CloudTrail, KMS
AWS Step Functions orchestrates the six-agent state machine. EventBridge fans agent-completed events to downstream consumers (CourtChain anchor, billing, customer notifications). SQS queues batch case intake. CloudTrail logs every Bedrock invocation (input + output hashes only, never content) to S3 with Object Lock immutability. AWS KMS / CloudHSM hold the per-tenant PHI tokenization vault keys.
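In Amazon States Language, the Task-then-anchor pattern described above might look like the following sketch. State names, function names, and event fields are placeholders, and only the first two agent roles are shown; the remaining roles would repeat the same pattern.

```json
{
  "Comment": "Illustrative sketch of the agent sequence; names and ARNs are placeholders",
  "StartAt": "Investigator",
  "States": {
    "Investigator": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": { "FunctionName": "ava-pro-investigator", "Payload.$": "$" },
      "Next": "AnchorInvestigator"
    },
    "AnchorInvestigator": {
      "Type": "Task",
      "Resource": "arn:aws:states:::events:putEvents",
      "Parameters": {
        "Entries": [{
          "Source": "ava-pro",
          "DetailType": "agent-completed",
          "Detail.$": "$"
        }]
      },
      "Next": "Theorist"
    },
    "Theorist": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": { "FunctionName": "ava-pro-theorist", "Payload.$": "$" },
      "End": true
    }
  }
}
```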
One pipeline. Three lock-in points. One audit chain.
Each stage carries a distinct analytical scope, AWS compute target, and PHI handling regime. The customer chooses where to stop.
AVA · on Amazon Bedrock
AVA sits above Tier 1's 148 deterministic rule checks across 15 categories: it consumes flagged claims from JIL L1, groups related flags into coherent finding clusters, and produces customer-readable summaries. Inference runs on Amazon Bedrock managed APIs - no GPU provisioning, per-token pricing aligned to query volume. JIL retains the right to swap or add Bedrock-hosted models without architectural change; the system prompt + JSON output contract is portable across Anthropic Claude families. Bedrock contractually does not retain customer prompts or use them for model improvement.
AVA Pro · Six specialized agents
Six specialized agent roles, each with a distinct prompt + JSON output contract. Today four are wired to Amazon Bedrock via cross-region inference profiles for capacity-based resilience: Investigator + Theorist on Anthropic Claude Sonnet 4.6 (heavyweight reasoning, legal-element decomposition, statutory mapping); Drafter + Auditor on Anthropic Claude Haiku 4.5 (long-form generation, evidentiary-gap structured analysis, ~3x cheaper per output token). The remaining two roles - Adversary (red-team / defense anticipation) and Damages Quantifier (multi-methodology damages with statistical confidence intervals) - run on Amazon SageMaker self-hosted inference with LoRA adapters per case theory (FCA, Anti-Kickback / Stark, ERISA, civil RICO, state insurance fraud, MFCU referral) and land in Phase 3.
AVA Pro+ · Filing-ready CREB(TM)
Final lock-down. AVA Pro+ inherits AVA Pro's analytical scope and produces filing-ready CREB(TM) packages for customer counsel or referral agencies. Every agent reasoning chain is hashed and anchored to CourtChain via AWS CloudTrail + CloudWatch event hooks; the resulting evidence bundle is FRE 902(14) self-authenticating. Damages calculations carry methodology variance + 95% confidence intervals (SageMaker XGBoost for time-series anomaly base rates). The Auditor agent walks the prior five agents' chain integrity before the bundle is sealed; if any gap rises above "minor" severity, the bundle is held for human review.
Six agents, one chain, every reasoning step hashed.
Most "AI in healthcare" today means a single LLM behind a chat box. AVA Pro is the opposite. Each role is a separately-prompted agent with its own output contract, its own model assignment, its own evidence trail. The orchestrator runs them in canonical order, anchors each result to CourtChain before the next agent starts, and stops at the Auditor's go/no-go.
Specialization beats prompt-stuffing
One mega-prompt asking a single LLM to "find fraud, propose theories, anticipate defenses, calculate damages, draft the memo, audit the chain" produces hand-wavy aggregates. Six narrow prompts, each constrained to a single output schema, produce auditable structured artifacts that downstream tooling can verify. Each agent's output is the next agent's input - no agent sees the customer's raw claim universe, only the prior agent's structured findings + the legal corpus.
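The plain-code orchestration described here can be sketched as a loop over agent functions, where each agent sees only the prior agent's structured artifact. The function names, artifact shape, and the 0.8 confidence threshold are all hypothetical; the halt-on-low-confidence behavior comes from the text.

```python
def run_pipeline(agents, initial_findings, corpus, min_confidence=0.8):
    """Plain-code orchestrator sketch: each agent consumes the prior agent's
    structured artifact (never the raw claim universe) plus the legal corpus.
    Low confidence halts the run rather than passing weak output downstream."""
    artifact = initial_findings
    for agent in agents:
        artifact = agent(artifact, corpus)
        if artifact["confidence"] < min_confidence:
            raise RuntimeError(f"pipeline halted at {artifact['role']}")
    return artifact
```

Note the orchestrator itself contains no model call: the coordination logic is ordinary control flow, which is what makes it deterministic and testable.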
Temperature 0, fixed prompts, replayable
Every agent invocation runs at temperature: 0 with a SHA-256-pinned system prompt. The ModelCard inside each AgentResult records the exact model id, the prompt hash, the AWS region, the input/output token counts. Two years from now, with the same model id available on Bedrock, the same prompt hash, and the same tokenized input, the call replays bit-identically. That replay capability is what makes the FRE 902(14) self-authentication claim hold up against a defense expert.
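A replay check over the ModelCard fields named above might look like this. The `ModelCard` shape and `verify_replay` helper are illustrative, not JIL's actual schema; the point is that the pinned prompt hash and model id must match before a bit-identical replay can even be attempted.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelCard:
    model_id: str       # exact Bedrock model id used
    prompt_sha256: str  # hash of the pinned system prompt
    region: str
    input_tokens: int
    output_tokens: int

def verify_replay(card: ModelCard, system_prompt: str, model_id: str) -> bool:
    """Confirm the pinned prompt and model id match the original invocation
    before replaying at temperature 0 with the same tokenized input."""
    prompt_hash = hashlib.sha256(system_prompt.encode()).hexdigest()
    return card.prompt_sha256 == prompt_hash and card.model_id == model_id
```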
Right tool, right cost, per agent
The Investigator and Theorist do most of the legal-reasoning work. They get Claude Sonnet 4.6 - the heavier model, ~$3 / M input + $15 / M output. The Drafter and Auditor do generation and structured-gap analysis. They get Claude Haiku 4.5 - one-third the cost, sufficient quality for the task. Per-case spend lands at roughly $0.14 across all four Bedrock invocations. We can swap a single role to a different model without touching the rest of the pipeline.
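The per-case figure follows from the rates quoted above. In this sketch the rates come from the text (Sonnet ~$3/M input, $15/M output; Haiku at one-third), while the per-agent token counts are purely hypothetical numbers chosen to illustrate the arithmetic.

```python
# Rates quoted in the text, USD per million tokens.
RATES = {
    "sonnet": {"in": 3.00, "out": 15.00},
    "haiku":  {"in": 1.00, "out": 5.00},   # one-third of Sonnet
}

# Hypothetical per-case token counts (assumptions, not measured values).
CALLS = [
    ("investigator", "sonnet", 8_000, 2_000),
    ("theorist",     "sonnet", 6_000, 2_000),
    ("drafter",      "haiku",  4_000, 6_000),
    ("auditor",      "haiku",  5_000, 1_000),
]

def per_case_cost(calls=CALLS) -> float:
    """Sum input and output token cost across the four Bedrock invocations."""
    return sum(
        tin * RATES[model]["in"] / 1e6 + tout * RATES[model]["out"] / 1e6
        for _, model, tin, tout in calls
    )
```

Under these assumed token counts the total comes out near the quoted ~$0.14 per case.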
Each customer carries its own posture
Inference mode, AWS region, model overrides, KMS key, BAA timestamp, and cost cap are all per-tenant - not process-global. A federal customer routes to AWS GovCloud (US-West) automatically; a commercial MCO routes to us-east-1. A pre-BAA POC customer is forced into stub mode by the orchestrator's BAA gate (the code refuses to invoke Bedrock if baa_executed_at is null). Each case has a hard cost ceiling beyond which the orchestrator aborts and surfaces a partial CREB with the cost-cap marker.
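The BAA gate and cost ceiling described above amount to a small per-tenant decision function. `TenantProfile` and `inference_mode` are illustrative names and a simplified shape, not JIL's actual configuration schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TenantProfile:
    region: str                     # e.g. "us-gov-west-1" or "us-east-1"
    baa_executed_at: Optional[str]  # ISO timestamp, or None pre-BAA
    cost_cap_usd: float

def inference_mode(profile: TenantProfile, spend_so_far: float) -> str:
    """Per-tenant gate: no BAA means stub mode (Bedrock is never invoked);
    exceeding the cost cap aborts and surfaces a partial CREB."""
    if profile.baa_executed_at is None:
        return "stub"
    if spend_so_far >= profile.cost_cap_usd:
        return "abort"
    return "bedrock"
```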
Bedrock routes across us-east-1 + us-east-2 + us-west-2
JIL invokes Bedrock through the us.* cross-region inference profile - a managed router that distributes capacity across three US regions. If one region throttles, the next call lands somewhere else without code change. The IAM policy authorizes invocation in each of the three regions plus the inference profile itself. PHI residency stays US-only by Service Control Policy + the inference profile's published region scope.
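An IAM policy of the shape described might look like the sketch below. The model-id suffixes and `ACCOUNT_ID` are placeholders, and exact Bedrock ARN formats should be confirmed against AWS's IAM documentation; this only illustrates the three-region-plus-profile authorization pattern.

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["bedrock:InvokeModel"],
    "Resource": [
      "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-sonnet-4-6",
      "arn:aws:bedrock:us-east-2::foundation-model/anthropic.claude-sonnet-4-6",
      "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-sonnet-4-6",
      "arn:aws:bedrock:us-east-1:ACCOUNT_ID:inference-profile/us.anthropic.claude-sonnet-4-6"
    ]
  }]
}
```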
One anchor, two regulatory regimes
Each agent run's output hash is anchored to JIL's CourtChain (the L1 audit ledger). The same anchor satisfies HIPAA Security Rule audit-log integrity (45 CFR ยง164.312(b)) and FRE 902(14) self-authentication for evidentiary admissibility. CloudTrail logs land in S3 with Object Lock so they're immutable before the chain hash is computed. One mechanism, two compliance frameworks - no parallel audit infrastructure to maintain.
The ava-pro service reports healthy, with cross-region Bedrock invocation verified end-to-end against AWS account 884135110852 under the executed AWS Business Associate Addendum. The per-vendor profile system is live; the default profile stays in stub mode (zero AWS spend) until a customer profile flips to aws-jil-tenant with a BAA timestamp. Phase 3 work - SageMaker LoRA endpoints for the Adversary + Damages Quantifier, full legal-corpus RAG via Bedrock Knowledge Bases, and AWS GovCloud deployment for federal/CMS workloads - is tracked in the engineering roadmap.
PHI is on a tightly bounded path. Everything else flows.
JIL's runtime touches three classes of data - PHI (HIPAA Safe Harbor identifiers), PII (employee credentials, contact records), and non-PHI operational data (pattern signatures, model weights, legal corpus, CourtChain anchors). The architectural goal is to keep PHI on a tightly bounded path inside the JIL AWS tenant and let everything else flow freely through the analytical layer.
Minimum necessary, tokenized at boundary
AVA and AVA Pro process only the PHI required for the analytical task. PHI identifiers are replaced with format-preserving tokens (FF3-1 / FPE-AES per identifier class) at the JIL ingest boundary. Tokenization runs in AWS Lambda or ECS within the customer-tenant VPC. The crosswalk lives in AWS KMS with customer master keys (CMK), with AWS CloudHSM (FIPS 140-2 Level 3 validated) for the highest-sensitivity workloads. Re-identification happens only at the customer-facing CREB output boundary.
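To illustrate what format preservation means at the boundary, here is a deliberately simplified stand-in: it maps each digit through a keyed pseudorandom stream so length and digit positions survive. This is NOT FF3-1 / FPE-AES and must not be used for real PHI; the production path uses NIST-approved FPE with the crosswalk held under KMS, as described above.

```python
import hmac
import hashlib

def tokenize_id(identifier: str, key: bytes) -> str:
    """Toy format-preserving tokenization: digits map to digits, separators
    pass through, and the same key + input always yields the same token.
    A real deployment uses FF3-1 / FPE-AES, not this sketch."""
    stream = hmac.new(key, identifier.encode(), hashlib.sha256).digest()
    return "".join(
        str((int(ch) + stream[i % len(stream)]) % 10) if ch.isdigit() else ch
        for i, ch in enumerate(identifier)
    )
```

Determinism matters here: the same identifier tokenizes identically across claims, so the analytical layer can still join records without ever seeing the raw value.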
No PHI in model training
Foundation models are inference-only on customer data. No customer PHI is used for fine-tuning, weight updates, RLHF, or any training process. AWS Bedrock contractually does not retain or use customer prompts for model improvement. Continued pre-training and supervised fine-tuning happen on the (PHI-free) legal corpus only, on SageMaker training jobs.
Ephemeral inference
Bedrock invocations are stateless by contract. SageMaker KV caches are cleared between sessions. No persistent storage of customer prompts in inference infrastructure. Logs capture input and output hashes only via CloudTrail + CloudWatch - never content. Every PHI access is logged and anchored to CourtChain.
No external API egress beyond AWS
All inference happens within JIL's AWS account boundary, under the AWS Business Associate Addendum. No data flows to OpenAI, Google, or any non-AWS LLM service. Bedrock keeps customer data within the JIL AWS tenant and does not share with third-party model providers. VPC endpoints + VPC-only routing for Bedrock and SageMaker invocations.
PHI residency
PHI stays within US AWS regions. Commercial workloads in us-east-1, us-east-2, us-west-2. Federal and CMS workloads in AWS GovCloud (US-West). No cross-border PHI movement. AWS Region service constraints enforced at the IAM and SCP layer (preventive, not detective).
Immutable audit, dual-purpose anchoring
Every PHI access is logged via AWS CloudTrail and CloudWatch Logs, then anchored to CourtChain. The same anchor that supports FRE 902(14) admissibility also satisfies HIPAA Security Rule audit log integrity (45 CFR 164.312(b)). One mechanism, two regulatory regimes. CloudTrail logs are themselves immutable (S3 Object Lock + bucket policies that deny delete) before the CourtChain hash is computed.
Three deployment modes. Customer chooses where the PHI sits.
PHI residency is the lever; everything else (analytics quality, audit chain, FRE 902(14) admissibility) is identical across modes.
Standard - JIL AWS Tenant
JIL operates AVA / AVA Pro / AVA Pro+ on AWS under JIL's BAA umbrella. Customer signs BAA with JIL; JIL holds the AWS Business Associate Addendum. Customer claim data flows to the JIL ingestion endpoint via mTLS / API Gateway with mutual TLS. Commercial in us-east-1 / us-east-2 / us-west-2; federal/CMS in AWS GovCloud (US-West). Suitable for most MCO and payer customers.
Customer AWS Account Deployment
JIL software stack deploys into the customer's own AWS account via CloudFormation / Terraform module. PHI never leaves customer AWS environment. JIL provides operator access under customer-managed BAA + AWS Resource Access Manager. Customer pays for compute directly. Suitable for customers with data-sovereignty mandates that prohibit offsite PHI.
Customer-Edge Tokenization
Customer deploys the JIL tokenization edge appliance (AWS Outposts or on-prem container) in their environment. PHI is tokenized inside the customer perimeter; the tokenized payload flows to JIL AWS-hosted analytics via PrivateLink. Crosswalk stays in customer environment. Suitable for customers who want JIL-managed analytics but cannot allow raw PHI to leave their perimeter.
Pre-built responses for every common questionnaire.
Customer security reviews typically resolve in 4 to 8 weeks once a questionnaire is submitted. The shift to AWS Bedrock + SageMaker materially shortens the review window because most infrastructure-layer controls are inheritable from AWS's existing certifications.
Pre-built response library
- Whistic
- OneTrust Vendorpedia
- ProcessUnity
- KY3P (S&P Global)
- SIG Lite and SIG Core
- HITRUST MyCSF
Response library updated quarterly. Customer-specific overlays handled at intake.
What you get on day one of review
- SOC 2 Type II report (target; AWS carve-out for infra layer)
- HITRUST CSF certification (target i1 then r2; AWS inheritance)
- HIPAA Security Rule attestation
- AWS Business Associate Addendum evidence
- FedRAMP High inheritance documentation (Bedrock, SageMaker, GovCloud)
- Annual penetration test report (third-party)
- Architecture review materials, data flow diagrams
- BAA template and signature workflow
Begin a security review.
Customer security teams open conversations with JIL by requesting a briefing. We will share the full reference architecture under NDA and walk through the BAA structure, deployment-mode trade-offs, AWS service inventory, and PHI-handling principles in depth.