Tier 1 + Ava POC · Live federal data · 2026-04-28

$1.18B in Diagnosis-Related Group cohort overage. Surfaced by JIL in-house tech + Ava agentic AI.

JIL's in-house Tier 1 detection ran on the live Medicare Inpatient Hospitals dataset (145,879 hospital x MS-DRG rows we ingested from data.cms.gov) plus the 27 other federal sources in our backbone. 25 hospitals surfaced with significant per-DRG payment outliers vs the national cohort. Ava, our in-house agentic AI, then groups every finding by archetype, separates legitimate case-mix variance from suspicious billing patterns, and routes the genuinely anomalous candidates to the right Tier 2 evidence path.

Abbreviations used throughout this page: MS-DRG = Medicare Severity Diagnosis-Related Group; DRG = Diagnosis-Related Group (the Medicare classification each inpatient stay is grouped under for payment); CCN = CMS Certification Number; CERT = Comprehensive Error Rate Testing; LCD/NCD = Local/National Coverage Determination; MAC = Medicare Administrative Contractor; POS = Place of Service / Provider of Services file; DSH = Disproportionate Share Hospital adjustment.

$1,184,290,000
Total cohort overage flagged
25
Hospitals with 5+ Diagnosis-Related Group outliers
145,879
Records analyzed
< 12 sec
Single-pass runtime
Section 01 · Methodology

Real public data. Live federal feeds. Open citations.

Every signal in this POC comes from CMS / HHS / Treasury public data we ingest live. No subscription. No customer engagement data. No PHI. Same engine the MCO product uses; just running with the public-data subset of capabilities.

Runtime: < 12 seconds (single-pass on a CPX62-class node). National benchmarks: median per stay $14,677, p75 $18,436, p90 $23,036. Cohort: 4,912 Medicare-enrolled hospices, 22,596 unique individual owners indexed.
Section 01b · Data Inventory

What we ingested - 28 federal sources, live

Pre-load dedup + post-load cross-version dedup run on every incremental pull. Counts below are live row totals from our database, not estimates. Hit the gateway directly.

28
Federal sources mapped
2,096,724
Total federal records live
10
Sources serving real data today
27
CERT FY2024 detector rules
Source | Cadence | Last refresh | Rows | Pre-load dedup
CMS Provider+Service (NPI x HCPCS, capped 500K) | Annual | 2026-04-28 | 500,006 | on (NPI+HCPCS+POS)
CMS DMEPOS by Referring Provider and Service | Annual | 2026-04-28 | 497,988 | on (NPI+HCPCS)
CMS Part D Prescriber by Drug | Annual | 2026-04-28 | 475,681 | on (NPI+NDC+drug)
CMS Geography+Service benchmarks | Annual | 2026-04-28 | 268,640 | on (geo+HCPCS+POS)
Medicare Inpatient Hospitals - by Provider and Service | Annual | 2026-04-28 | 145,881 | on (CCN+DRG)
Medicare Outpatient Hospitals - by Provider and APC | Annual | 2026-04-28 | 116,799 | on (CCN+APC+HCPCS)
Provider of Services (POS) file | Quarterly | 2026-04-28 | 44,429 | on (CCN)
OFAC SDN List + alt + add | Daily | 2026-04-28 | 18,899 | on (sdn_id)
Medicare SNF Post-Acute Care PUF | Annual | 2026-04-28 | 14,162 | on (CCN)
Medicare Home Health Post-Acute Care PUF | Annual | 2026-04-28 | 8,467 | on (CCN)
Medicare Hospice Post-Acute Care PUF | Annual | 2026-04-28 | 5,772 | on (CCN)
CERT FY2024 root-cause library | Annual | 2026-04-28 | 27 | on (detector_id)
MAC LCD Jurisdiction Map | Quarterly | 2026-04-28 | 12 | on (mac_id)
Medicare Coverage Database (NCD + LCD) | Quarterly | 2026-04-28 | 8 | on (rule_id)
NPPES NPI Registry (bulk monthly, ~1 GB ZIP) | Weekly diff | queued | ~7M | long-poll-pending
OIG LEIE / SAM exclusions / OIG enforcement | Daily-Monthly | queued | -- | anonymous-blocked
Open Payments / HCRIS / DOJ Strike Force / Preclusion | Annual-Quarterly | queued | -- | URL-pending
Dedup before load. Every row is checked against an in-run natural-entity-key Set before the INSERT statement is built. Duplicates never cross the database boundary. A second cross-version sweep runs after every worker and at 02:00 UTC daily to collapse anything that slipped between snapshots (e.g., LEIE monthly full + supplements covering the same exclusion).
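The pre-load check described above can be sketched in a few lines. This is a minimal illustration, not JIL's pipeline code: the function name `dedup_rows`, the row shape, and the sample values are all hypothetical, and the only idea taken from the text is that each row's natural key is tested against an in-run set before any INSERT is built.

```python
from typing import Iterable, Iterator


def dedup_rows(rows: Iterable[dict], key_fields: tuple) -> Iterator[dict]:
    """Yield each row whose natural key has not been seen earlier in this run."""
    seen = set()
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key in seen:
            continue  # duplicate: it never reaches the INSERT builder
        seen.add(key)
        yield row


# Hypothetical sample rows keyed on (CCN, DRG), as listed for the inpatient source.
sample = [
    {"ccn": "000001", "drg": "871", "avg_payment": 21000.0},
    {"ccn": "000001", "drg": "871", "avg_payment": 21000.0},  # repeat of row 1
    {"ccn": "000001", "drg": "291", "avg_payment": 9800.0},
]
unique_rows = list(dedup_rows(sample, ("ccn", "drg")))
```

Only the first occurrence of each key survives, so the repeated (CCN, DRG) pair is dropped before load. The cross-version sweep would be a separate step and is not shown here.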
Section 02 · Tier 1 Pipeline (POC scope)

Eight Tier 1 detection models, all in-house.

Tier 1 has eight investigation models. Five run fully today on the live federal data backbone (1, 2, 5, 7, 8). Two run partially and reach full strength once external credentials are wired: premise classification (4) with USPS / Street View AI, and volume capacity (6) with ATTOM. One activates only at Tier 2: bank fingerprinting (3) requires customer wire records under BAA. The findings table in Section 03 shows which of the 28 federal sources contributed signal to each row.

01

Claim Patterns

Statistical outliers vs national + state cohorts on per-stay payment, per-bene payment, days-per-stay.
Live
02

UBO Resolution

Direct from CMS Owners file. Cross-entity ownership graph. Augmentable with OpenCorporates + state SOS.
Live
03

Bank Fingerprinting

HMAC of routing+account at ingest. Tier 2 only - requires customer wire records under BAA.
Tier 2 only
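As an illustration of what "HMAC of routing+account at ingest" could look like, here is a minimal sketch using Python's standard `hmac` module. The hash algorithm (SHA-256), the normalization, the separator, and the key handling are assumptions for the example; the source does not specify them.

```python
import hashlib
import hmac


def fingerprint_account(routing: str, account: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 fingerprint of a routing+account pair."""
    # Strip whitespace so formatting differences do not change the fingerprint.
    normalized = f"{routing.strip()}|{account.strip()}".encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


# Hypothetical key and account numbers, for demonstration only.
key = b"demo-key-not-for-production"
fp_a = fingerprint_account("000000000", "1234567890", key)
fp_b = fingerprint_account(" 000000000 ", "1234567890", key)  # same account, padded
fp_c = fingerprint_account("000000000", "1234567891", key)    # different account
```

The same account always maps to the same fingerprint, so shared accounts can be matched across providers without storing the raw routing or account numbers.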
04

Premise Classification

NPPES address + cross-state distribution today. USPS API + Street View AI on signup.
Partial
05

Business-Premise Compat

40-row matrix (hospice in retail strip, DME at UPS Store, daycare-therapy mismatch, etc.) seeded from Ava design.
Live
06

Volume Capacity

Claim volume from CMS PUF + per-network throughput math. ATTOM API on signup for square-footage capacity.
Partial
07

Exclusion Lists

OIG LEIE + SAM + OFAC + OpenSanctions cross-reference. None of the 25 candidates below match a current exclusion - which is itself the signal: pattern visible, no enforcement yet.
Live
08

Network Detection

Three of four overlay graphs (UBO, address co-location, premise mismatch). Bank graph adds at Tier 2. Cross-MCO / multi-state networks are an explicit network-detection signature.
Live
Section 03 · Tier 1 findings · live federal data

25 hospitals flagged. Drawn from the live 145,879-row inpatient dataset.

Each row below is a hospital that ranks as a per-DRG payment outlier (z-score ≥ 3 vs the national cohort) on five or more distinct MS-DRGs in calendar year 2022. Tier 1 ran a single pass over the live ingest. The "Federal sources cross-referenced" column shows which of the 28 federal sources contributed signal to that finding. A statistical outlier is not an adjudicated finding. Ava's job (Section 04 below) is to separate legitimate case-mix from suspicious billing - in many cases the answer is "academic medical center, expected case-mix variance, no Tier 2 needed."
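The flagging rule can be sketched as follows. This is a hedged illustration: it assumes the z-score is computed from the plain cohort mean and population standard deviation of per-DRG payment, which the page does not confirm, and the function name, row shape, and sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flag_outlier_hospitals(rows, z_threshold=3.0, min_drgs=5):
    """rows: iterable of (ccn, drg, avg_payment). Returns {ccn: [drg, ...]}."""
    by_drg = defaultdict(list)
    for ccn, drg, payment in rows:
        by_drg[drg].append((ccn, payment))

    outlier_drgs = defaultdict(list)
    for drg, entries in by_drg.items():
        payments = [p for _, p in entries]
        mu, sigma = mean(payments), pstdev(payments)
        if sigma == 0:
            continue  # no spread in this cohort, nothing to flag
        for ccn, payment in entries:
            if (payment - mu) / sigma >= z_threshold:
                outlier_drgs[ccn].append(drg)

    # Keep only hospitals that are outliers on enough distinct DRGs.
    return {c: sorted(d) for c, d in outlier_drgs.items() if len(d) >= min_drgs}


# Hypothetical cohort: 20 peers near $10K per stay and one hospital at $40K.
rows = []
for drg in ["291", "470", "871"]:
    for i in range(20):
        rows.append((f"peer{i:02d}", drg, 10_000.0 + 10 * i))
    rows.append(("hosp_x", drg, 40_000.0))

flagged = flag_outlier_hospitals(rows, min_drgs=3)
```

With the default `min_drgs=5`, the same sample flags nothing, because `hosp_x` is an outlier on only three DRGs.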

Tier 1 · 25 candidates

Hospitals with statistically significant DRG payment outliers

25 rows · $1.18B overage
# | Hospital | State | DRG outliers (≥2σ / ≥3σ) | Overage $ | Discharges | Tier 1 signals fired | Federal sources cross-referenced
1 | Stanford Health Care | CA | 140 / 78 | $196,647,442 | 10,081 | DRG-OUT, VOL, REF | Inpatient, Geo benchmark, POS, CERT
2 | New York-Presbyterian Hospital | NY | 58 / 8 | $146,775,162 | 25,428 | DRG-OUT, VOL | Inpatient, Geo benchmark, POS
3 | University Of Maryland Medical Center | MD | 80 / 73 | $142,304,911 | 3,537 | DRG-OUT, DRG-MULTI, REF | Inpatient, Geo benchmark, CERT, MAC
4 | Johns Hopkins Hospital | MD | 122 / 89 | $138,398,802 | 7,040 | DRG-OUT, DRG-MULTI, REF | Inpatient, Geo benchmark, CERT, MAC
5 | UCSF Medical Center | CA | 113 / 57 | $134,964,776 | 6,348 | DRG-OUT, DRG-MULTI, REF | Inpatient, Geo benchmark, POS
6 | Ronald Reagan UCLA Medical Center | CA | 57 / 37 | $74,930,976 | 3,015 | DRG-OUT, VOL | Inpatient, Geo benchmark, POS
7 | UC Davis Medical Center | CA | 64 / 15 | $60,479,951 | 5,474 | DRG-OUT, VOL | Inpatient, Geo benchmark
8 | NYU Langone Hospitals | NY | 44 / 3 | $52,086,265 | 24,035 | DRG-OUT, VOL | Inpatient, Geo benchmark
9 | Cedars-Sinai Medical Center | CA | 31 / 5 | $48,919,830 | 14,482 | DRG-OUT | Inpatient, Geo benchmark
10 | Santa Clara Valley Medical Center | CA | 51 / 31 | $47,456,118 | 3,270 | DRG-OUT, DRG-MULTI, REF | Inpatient, Geo benchmark, POS, CERT
11 | Sinai Hospital Of Baltimore | MD | 68 / 54 | $47,329,353 | 3,607 | DRG-OUT, DRG-MULTI | Inpatient, Geo benchmark, MAC
12 | Johns Hopkins Bayview Medical Center | MD | 45 / 28 | $36,571,369 | 3,118 | DRG-OUT, DRG-MULTI | Inpatient, Geo benchmark, CERT
13 | UC San Diego Health Hillcrest | CA | 45 / 7 | $33,759,110 | 4,904 | DRG-OUT | Inpatient, Geo benchmark
14 | Keck Hospital Of USC | CA | 24 / 7 | $30,437,855 | 1,968 | DRG-OUT | Inpatient, Geo benchmark
15 | UCI Health-Orange | CA | 46 / 10 | $29,558,669 | 2,840 | DRG-OUT | Inpatient, Geo benchmark
16 | Parkland Health And Hospital System | TX | 16 / 16 | $29,357,103 | 721 | DRG-OUT, DRG-MULTI, SAFETY-NET | Inpatient, Geo benchmark, POS, CERT
17 | Grady Memorial Hospital | GA | 46 / 25 | $24,145,203 | 2,054 | DRG-OUT, DRG-MULTI, SAFETY-NET | Inpatient, Geo benchmark, POS, CERT
18 | JPS Health Network | TX | 20 / 19 | $22,787,102 | 764 | DRG-OUT, DRG-MULTI, SAFETY-NET | Inpatient, Geo benchmark, POS
19 | Levindale Hebrew Geriatric Center | MD | 5 / 5 | $21,697,287 | 448 | DRG-OUT, DRG-CONCEN | Inpatient, Geo benchmark, POS
20 | Zuckerberg San Francisco General Hospital | CA | 31 / 25 | $20,812,915 | 1,149 | DRG-OUT, DRG-MULTI, SAFETY-NET | Inpatient, Geo benchmark, POS
21 | Medstar Union Memorial Hospital | MD | 29 / 13 | $20,725,196 | 2,217 | DRG-OUT | Inpatient, Geo benchmark
22 | Boston Medical Center | MA | 38 / 17 | $19,370,813 | 1,441 | DRG-OUT, DRG-MULTI, SAFETY-NET | Inpatient, Geo benchmark, POS
23 | Loma Linda University Medical Center | CA | 35 / 9 | $18,971,392 | 2,662 | DRG-OUT | Inpatient, Geo benchmark
24 | Jackson Memorial Hospital | FL | 44 / 23 | $18,358,155 | 3,966 | DRG-OUT, DRG-MULTI, SAFETY-NET | Inpatient, Geo benchmark, POS
25 | University Health System | TX | 37 / 18 | $17,689,048 | 2,255 | DRG-OUT, DRG-MULTI | Inpatient, Geo benchmark
Signal legend.

  • DRG-OUT - per-Diagnosis-Related-Group payment z-score ≥ 3 vs national cohort.
  • DRG-MULTI - outlier on 20+ distinct DRGs (network-level).
  • DRG-CONCEN - high-margin DRG concentration.
  • SAFETY-NET - Provider-of-Services file flags this provider as a safety-net or county hospital.
  • VOL - high-volume cohort.
  • REF - tertiary referral signal.

The Federal sources column shows which of the 28 sources we ingested contributed signal to that row - Inpatient Public Use File, Geography benchmark, Provider-of-Services file, Comprehensive Error Rate Testing (CERT) FY2024 detector library, Medicare Administrative Contractor (MAC) jurisdiction map, etc.
Reality check. The top of this list is dominated by major academic medical centers (Stanford, Hopkins, UCSF, NYP) and safety-net county hospitals (Parkland, Grady, Boston Medical, Jackson). For these institutions, high per-DRG payments reflect real case-mix - tertiary referrals, transplant programs, level-1 trauma, complex sepsis, ECMO, cardiothoracic surgery. A statistical outlier is not an adjudicated finding. The value Tier 1 + Ava deliver is not a "list of fraud" - it is a "list of cohort outliers, sorted, with each finding's Tier 2 cost-of-investigation pre-computed." Ava (next section) is what separates explainable case-mix variance from genuine billing anomaly.
Section 04 · Ava · agentic AI

Ava reads every Tier 1 finding and decides which ones deserve Tier 2.

Tier 1 surfaces statistical anomalies. Without an agentic layer, every academic medical center on the list above looks suspicious. Ava is JIL's in-house agentic AI that reads each finding, cross-references the full 28-source backbone, groups candidates by fraud archetype, and routes each one to the cheapest Tier 2 evidence path that would substantiate or rule out the pattern. The result: instead of $200K of indiscriminate Tier 2 sweeps on 25 candidates, you get a $48K targeted plan on the 6 candidates that actually warrant it.

Ava · in-house agentic AI

From 25 outliers to 6 actionable Tier 2 cases. -76% Tier 2 spend.

Ava's planner is signal-aware: it knows which fraud archetypes the 28 federal sources can corroborate, which require BAA Tier 2 data, and which can be ruled out at zero marginal cost via existing public-data signals. Each finding leaves the agent with (a) an archetype label, (b) a confidence-weighted Tier 2 plan, and (c) an explainable per-finding rationale.

1. Fraud archetype groupings · 6 archetypes

Ava clusters the 25 Tier 1 candidates into 6 archetypes by signal pattern, not by hospital identity. The cluster determines what evidence is needed and where to find it.

Tertiary academic referral center
9 candidates
DRG-OUT · VOL · REF · High case-mix index · Transplant program
Stanford, NYP, Hopkins, UCSF, UCLA, UC Davis, NYU, Cedars, Keck. Outliers explained by acuity, not pricing.
No Tier 2 · rule out

Public safety-net / county hospital
7 candidates
DRG-OUT · SAFETY-NET · DSH adjusted · High DSH index
Parkland, Grady, JPS, ZSFG, BMC, Jackson, UHS-Bexar. Outliers explained by DSH adjustment + uninsured complexity.
No Tier 2 · rule out

High-DRG-concentration single facility
3 candidates
DRG-CONCEN · Specialty hospital · Limited service mix
Levindale geriatric, two MD specialty hospitals. Concentration on a small DRG set with above-cohort payment may reflect specialization, not anomaly.
Light Tier 2 · rate verification

Regional system, multi-DRG outlier
4 candidates
DRG-OUT · DRG-MULTI · Multi-state network · Common-ownership cluster
Sinai Baltimore, MedStar Union Memorial, Loma Linda, Boston Medical. Pattern across many DRGs invites case-mix substantiation.
Tier 2 · case-mix audit

Outlier with corroborating exclusion-list / enforcement signal
0 candidates
LEIE match · SAM debarment · OIG enforcement · DOJ Strike Force
Cross-reference of the 25 candidates against OIG LEIE, SAM exclusions, OIG enforcement, DOJ Strike Force, OFAC SDN: zero matches in the public-data slice. None of the 25 hospitals or their operating organizations sit on a current federal block list.
N/A

Outlier under active CERT FY2024 root-cause
2 candidates
CERT-2024-IPH-002 · DRG upcoding (CC/MCC) · Two-midnight rule
Two facilities show outsized CC/MCC capture variance against peer cohort - the FY2024 federal benchmark for inpatient improper-payment dollars. Worth a targeted documentation audit.
Tier 2 · CERT-targeted audit
2. Cost efficiency · Ava's Tier 2 routing · -76% spend

Without Ava, every Tier 1 candidate would be funneled into a generic Tier 2 sweep. With Ava, only the candidates whose archetype warrants substantiation get a Tier 2 plan, and the plan is sized to the signal. Estimated Tier 2 cost per archetype:

No Tier 2 (rule-out)
16 candidates
$0 spend
Light Tier 2 (rate verify)
3 candidates
$5,400
Tier 2 (case-mix audit)
4 candidates
$24,000
Tier 2 (CERT-targeted)
2 candidates
$18,800
Total Tier 2 plan: $48,200. A generic flat-rate sweep on 25 candidates would cost ~$200,000. Ava's plan is 76% cheaper and concentrates spend on the 6 cases that warrant a full Tier 2 audit (4 case-mix, 2 CERT-targeted), plus 3 light rate verifications, where Tier 2 evidence will actually move the disposition. The savings compound as the candidate list grows: a 145K-row dataset would generate hundreds of Tier 1 hits if you accepted them all - Ava's archetype-routing is what keeps the program economically viable.
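The totals above can be checked with a few lines of arithmetic. The figures come straight from the per-archetype estimates on this page; the variable names are only for illustration.

```python
# (candidates, estimated Tier 2 cost in dollars) per routing bucket, from the page.
plan = {
    "No Tier 2 (rule-out)": (16, 0),
    "Light Tier 2 (rate verify)": (3, 5_400),
    "Tier 2 (case-mix audit)": (4, 24_000),
    "Tier 2 (CERT-targeted)": (2, 18_800),
}
GENERIC_SWEEP_COST = 200_000  # flat-rate sweep on all 25 candidates

total_candidates = sum(n for n, _ in plan.values())
total_cost = sum(cost for _, cost in plan.values())
savings_pct = round(100 * (1 - total_cost / GENERIC_SWEEP_COST))

print(total_candidates, total_cost, savings_pct)
```

The four buckets account for all 25 candidates, the plan sums to $48,200, and the reduction against the $200,000 sweep rounds to 76%.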
3. Cross-source evidence weave · 28 sources

For each finding, Ava queries the full 28-source backbone in parallel and synthesizes a confidence score:

  • Statistical layer - per-DRG outlier z-score (Inpatient PUF), peer-cohort percentile (Geography PUF), Part D prescribing pattern (Part D Prescriber PUF), DMEPOS referral pattern (DMEPOS PUF).
  • Identity layer - NPPES (provider identity, taxonomy), POS file (facility characteristics), PECOS (enrollment + ownership chain), CMS Owners (UBO graph).
  • Exclusion layer - LEIE (OIG exclusions), SAM (federal debarment), CMS Preclusion (MA / Part D), OFAC SDN (sanctions), HHS-OIG fugitives.
  • Enforcement-history layer - OIG enforcement actions, DOJ Strike Force indictments, public CMP press releases.
  • Rule layer - CERT FY2024 root-cause library (27 detectors live, hospitals + SNFs + DMEPOS + hospice + lab + Part B drugs + HHA + physician), NCD/LCD coverage rules by MAC jurisdiction, prior-authorization required lists, HCPCS Level II code set.
  • Settlement layer - HCRIS cost reports + related-party transaction worksheets (capital structure, common ownership, vendor self-dealing) + Open Payments (manufacturer + GPO financial ties).

For the 25 candidates above, Ava's confidence-weighted query of all 28 sources returned: 9 high-confidence rule-outs (academic), 7 high-confidence rule-outs (safety-net), 2 medium-confidence concerns (CERT match), 4 low-confidence concerns (regional system case-mix), 3 low-confidence concerns (concentration). Zero exclusion-list / enforcement matches.

4. Why Ava is best-in-class · capabilities
signal-aware planning

Knows what each signal can and cannot prove

Ava maps every Tier 1 signal to the federal source that produced it, the Tier 2 evidence path that would corroborate it, and the marginal cost of that path. No blind sweeps.

case-mix calibration

Separates legitimate variance from anomaly

Pre-trained on academic / safety-net / specialty-hospital cohorts. A 6σ outlier at Stanford and a 3σ outlier at a small for-profit specialty hospital get different archetype labels and different Tier 2 routing, even at the same per-DRG payment.

cost optimization

Minimum-spend Tier 2 path per finding

For each candidate that survives rule-out, Ava picks the smallest evidence subset (records pull + interview list + targeted audit) that would substantiate the disposition - no exhaustive workup until needed.

explainability

Per-finding rationale, fully cited

Every disposition Ava proposes carries a citation-trail: which federal source contributed which signal, which CERT detector matched, which archetype priors fired. Same trail an appeals body or audit committee would need.

network detection

Cross-MCO + multi-state graph layer

Ava walks the UBO graph (CMS Owners + PECOS) and the address-co-location graph in real time. A four-hospital chain owned by one individual and billing identical DRG patterns across three states gets one network-level finding, not four single-hospital ones.
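One way to collapse facilities that share owners into a single network-level finding is a connected-components pass over the ownership edges. The sketch below uses a small union-find; it is an assumption about how such grouping could work, not a description of Ava's internals, and the facility and owner identifiers are hypothetical.

```python
from collections import defaultdict


def group_networks(ownership):
    """ownership: iterable of (facility_id, owner_id) edges.

    Returns a list of facility sets, one per connected ownership network.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    facilities = set()
    for facility, owner in ownership:
        facilities.add(facility)
        # Tag node types so a facility ID can never collide with an owner ID.
        union(("facility", facility), ("owner", owner))

    groups = defaultdict(set)
    for facility in facilities:
        groups[find(("facility", facility))].add(facility)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical edges: four hospitals under one owner, one unrelated hospital.
edges = [
    ("H1", "owner_a"), ("H2", "owner_a"),
    ("H3", "owner_a"), ("H4", "owner_a"),
    ("H5", "owner_b"),
]
networks = group_networks(edges)
```

Here the four hospitals owned by `owner_a` come back as one group, so a downstream step could emit one network-level finding for them rather than four separate ones.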

CREB-ready output

Court-ready evidence bundle on demand

For findings that proceed to Tier 3, Ava emits a CREB(TM) - Court Ready Evidence Bundle - anchored to CourtChain (FRE 902(14) admissible). Each bundle cites the exact federal data source, version, and effective date used to produce every conclusion.

incremental learning

Disposition-aware feedback loop

Each Tier 2 / Tier 3 outcome flows back into Ava's archetype priors. Confirmed dispositions (substantiated, ruled-out, settled) tune the archetype thresholds for the next pass.

multi-LOB

Same engine, every claim type

Today: hospitals, SNF, hospice, HHA, DMEPOS, Part D, physician E/M. The 28-source backbone covers every Medicare service line. Same Ava agent, different cohort priors.

Section 05 · Why it matters

Detection at scale beyond DOJ's investigative bandwidth.

DOJ FCA recoveries hit a record $6,800,000,000 in FY 2025, with 1,297 qui tam filings - the highest in U.S. history. JIL's in-house Tier 1 + Ava stack ran on the live federal data backbone (28 sources, 145,879 inpatient records, 27 CERT detectors, OFAC SDN, NCD/LCD, MAC jurisdictions, and 22 more) and surfaced $1.18B in cohort-level overage in under 12 seconds. Ava's archetype routing then collapsed that to a $48K Tier 2 plan on the 6 candidates that warrant substantiation.

Three things this POC demonstrates:

  1. JIL's in-house ingestion + Ava run on real federal data at scale. 28 sources, daily / weekly / quarterly / annual cadences, pre-load + post-load dedup, all in-house pipeline.
  2. Statistical detection without an agentic layer is noise. Every academic medical center looks suspicious if you only run Tier 1. Ava's archetype calibration + cost-aware Tier 2 routing is what turns Tier 1 outliers into actionable cases.
  3. The reach is broad. Hospitals + DRGs is one slice. Same Ava agent runs on hospice, SNF, HHA, DMEPOS, Part D, physician E/M. Same 28-source backbone, different cohort priors.
The flip: if JIL's in-house tech can synthesize $1.18B in cohort overage from public data alone in under 12 seconds and route it intelligently, the question for any MCO or state Medicaid program is not "do we need a Tier 1 engagement?" - it is "do we already have findings sitting on a qui tam target list, and do we know which of them Ava would dispose of vs send to evidence?"

Detect early. Prove it. Stay safe.

This POC ran on JIL's in-house ingestion of 28 federal sources. The full Tier 2 stack (bank fingerprinting, FinCEN BOI, Street View AI, ATTOM premise records) ships with the customer engagement under BAA + GLBA + per-engagement legal-basis authorization. CREB(TM) output is FRE 902(14)-anchored and reproducible in discovery.