Platform · data sources

Every data source the JIL platform ingests, by LOB and refresh cadence.

JIL is a verification network. The integrity of every CREB we seal traces back to the data we ingested, when we ingested it, where the source is, and which line of business (LOB) consumed it. This page lists every source in three classes: federal public datasets (no contract required), commercial subscriptions (used when a customer engagement requires depth a public source cannot reach), and customer-supplied records (under a BAA or GLBA basis). Replay-grade transparency, not vague positioning.

52 · Total catalogued sources
38 · Free public + sanctions feeds
9 · Commercial subscription paths
8 · LOBs served on the same backbone
Legend:
LIVE: ingested in production
WIRED: code in place; pending DUA, key, or live data
PENDING: not yet implemented
(unlabeled marker): commercial subscription required
Section 01 · Public federal datasets

Public, free, no contract required.

Every row below draws on a free public data publisher, federal in all but two cases (UN Comtrade and Etherscan). No subscription, no DUA, no per-record licensing. JIL ingests the source file, hashes it for replay, indexes it into PostgreSQL, and runs the LOB-specific check pack. These are the sources behind the eight live POC pages and the CMS attestation backbone.
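
The hash-for-replay step can be sketched as follows. This is a minimal illustration, not JIL's ingest code: the `IngestRecord` fields, the function name, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import io
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import BinaryIO


@dataclass(frozen=True)
class IngestRecord:
    """Hypothetical per-file record; field names are illustrative."""
    source_name: str
    sha256: str
    size_bytes: int
    ingested_at: str  # ISO 8601 timestamp in UTC


def record_ingest(source_name: str, stream: BinaryIO,
                  chunk_size: int = 1 << 20) -> IngestRecord:
    """Hash a source file in chunks so large bulk files never sit in memory."""
    digest = hashlib.sha256()
    size = 0
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
        size += len(chunk)
    return IngestRecord(
        source_name=source_name,
        sha256=digest.hexdigest(),
        size_bytes=size,
        ingested_at=datetime.now(timezone.utc).isoformat(),
    )


# Usage with an in-memory stand-in for a downloaded source file.
demo = record_ingest("demo-source", io.BytesIO(b"col_a|col_b\n1|2\n"))
print(demo.sha256, demo.size_bytes)
```

Hashing the raw bytes before any parsing means the stored digest identifies the publisher's file itself, independent of how later code versions parse it.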

Source | Provider | Refresh | Format | LOB(s) | Status | Live POC
Fails-to-Deliver Register | SEC FOIA | Monthly (a/b half-files) | pipe-delimited | capmarkets | LIVE | capmarkets-poc (339K rows)
USAspending.gov API | Treasury / OMB | Real-time / daily | JSON REST | grants, federal-investigator | LIVE | grants-poc (1K awards / $2.96T)
DOL OFLC LCA Disclosure | DOL ETA | Quarterly | XLSX | h1b | LIVE | h1b-poc (337K real LCAs)
UN Comtrade API | UN Statistics Division | Annual / Quarterly | JSON REST | trade-finance | WIRED | trade-finance-poc (rate-limited; synthetic backstop)
USCIS Regional Centers | USCIS | Ad-hoc | HTML / PDF | eb5, federal-investigator | WIRED | eb5-poc (anon-blocked; synthetic backstop)
USCIS Data Hub processing-times | USCIS | Quarterly | JSON REST | eb5 | WIRED | eb5-poc
NHTSA FARS (Fatality Analysis Reporting System) | NHTSA | Annual | CSV (zipped) | pc | PENDING | pc-poc (build queued)
BLS Occupational Injuries (SOII) | Bureau of Labor Statistics | Annual | CSV / API | wc | PENDING | wc-poc (build queued)
CMS Medicare Inpatient by Provider+Service | CMS | Annual | CSV | MCO, federal-investigator | LIVE | ava-poc (145K rows / $90.94B)
CMS Outpatient by Provider+APC | CMS | Annual | CSV | MCO | LIVE | ava-poc (117K rows)
CMS DMEPOS by Referring Provider | CMS | Annual | CSV | MCO | LIVE | ava-poc (498K rows)
CMS Part D Prescriber by Drug | CMS | Annual | CSV | MCO | LIVE | ava-poc (476K rows)
CMS Provider of Services (POS) file | CMS | Quarterly | CSV | MCO, federal-investigator | LIVE | ava-poc (44K rows)
CMS Hospice utilization | CMS | Annual | CSV | MCO | LIVE | ava-poc (5,772 rows)
NPPES (NPI Registry) | CMS | Weekly | CSV bulk | MCO, all KYC | LIVE | ava-poc
CERT FY2024 detector library | CMS | Annual | internal seed | MCO, federal-investigator | LIVE | ava-poc
CMS Owners file (regional centers, ownership) | CMS | Quarterly | CSV | MCO | WIRED | UBO graph
PECOS (Provider Enrollment Chain & Ownership) | CMS | Quarterly | internal feed | MCO, federal-investigator | WIRED | UBO graph
MAC jurisdiction map | CMS | Quarterly | internal seed | MCO | LIVE | ava-poc
Etherscan public API (token transfers) | Etherscan | Block-level (~12s) | JSON REST | p2p, wallet-intel | LIVE | p2p-poc (1K USDC transfers)
SEC EDGAR (filings) | SEC | Real-time | JSON / XBRL | capmarkets, asset-intel | WIRED |
TreasuryDirect / FFIEC bank financials | Treasury / FFIEC | Quarterly | CSV | capmarkets | WIRED |
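
The Fails-to-Deliver Register is listed above as pipe-delimited. A minimal parsing sketch follows, assuming the six-column layout the SEC publishes (settlement date as YYYYMMDD, CUSIP, symbol, quantity, description, price, with `.` for a missing price). The sample rows are synthetic, and `FtdRow` and `parse_ftd` are hypothetical names, not part of the JIL codebase.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date, datetime
from typing import Iterator, Optional


@dataclass(frozen=True)
class FtdRow:
    settlement_date: date
    cusip: str
    symbol: str
    quantity: int
    description: str
    price: Optional[float]  # None when the file carries no usable price


def parse_ftd(text: str) -> Iterator[FtdRow]:
    """Yield data rows; the header and any trailer lines are skipped."""
    reader = csv.reader(io.StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE)
    for fields in reader:
        # Data rows have six fields and start with a numeric date.
        if len(fields) != 6 or not fields[0].isdigit():
            continue
        raw_date, cusip, symbol, quantity, description, price = fields
        try:
            parsed_price: Optional[float] = float(price)
        except ValueError:
            parsed_price = None
        yield FtdRow(
            settlement_date=datetime.strptime(raw_date, "%Y%m%d").date(),
            cusip=cusip,
            symbol=symbol,
            quantity=int(quantity),
            description=description.strip(),
            price=parsed_price,
        )


# Synthetic sample in the assumed layout (not real SEC data).
SAMPLE = (
    "SETTLEMENT DATE|CUSIP|SYMBOL|QUANTITY (FAILS)|DESCRIPTION|PRICE\n"
    "20240102|000000000|DEMO|1500|DEMO CORP COM|12.34\n"
    "20240102|111111111|TEST|20|TEST INC|.\n"
    "Trailer record count 2\n"
)
rows = list(parse_ftd(SAMPLE))
print(len(rows), rows[0].symbol, rows[1].price)
```
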
Section 02 · Public sanctions, identity, exclusions

Cross-vertical compliance feeds.

These feeds serve every LOB: identity, sanctions, exclusions, and beneficial-ownership lookups. Most are free; OpenCorporates has a free tier for low volume and a paid tier for entity resolution at scale.

Source | Provider | Refresh | Type | LOB(s) | Status
OFAC SDN List | Treasury OFAC | Daily | Public free | all KYC, p2p, trade-finance | LIVE
UN Consolidated Sanctions | UN Security Council | Ad-hoc | Public free | all KYC | LIVE
HMT (UK) Consolidated List | HM Treasury UK | Daily | Public free | all KYC | LIVE
EU Consolidated Financial Sanctions | EU Commission | Daily | Public free | all KYC | WIRED
OpenSanctions / Yente | OpenSanctions | Daily | Public free | all KYC | LIVE
OIG LEIE (excluded individuals) | HHS OIG | Monthly | Public free | MCO, federal-investigator | WIRED
SAM.gov exclusions | GSA | Daily | Public free | grants, federal-investigator, all vendor | WIRED
GLEIF LEI Registry | GLEIF | Daily | Public free | all institutional | LIVE
FinCEN BOI Reporting (when published) | FinCEN | Real-time | Public free | all KYB | PENDING
FINRA disciplinary database | FINRA | Real-time | Public free | capmarkets | WIRED
CFTC enforcement database | CFTC | Real-time | Public free | capmarkets, trade-finance | WIRED
DOJ enforcement / qui tam relator records | DOJ | Real-time | Public free | federal-investigator, MCO | WIRED
OpenCorporates (entity registry) | OpenCorporates | Real-time API | Free + paid tier | eb5, all KYB | WIRED
RDAP domain age + WHOIS | ICANN / registrars | Real-time | Public free | all BEC | LIVE
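
As an illustration of the RDAP domain-age row, the sketch below derives a domain's age from an already-fetched RDAP response. It relies on the `events` array defined by the RDAP JSON format (RFC 9083), where a `registration` event carries an `eventDate`. The function names are hypothetical, and how JIL scores the resulting age is not specified on this page.

```python
from datetime import datetime, timezone
from typing import Any, Mapping, Optional


def registration_date(rdap: Mapping[str, Any]) -> Optional[datetime]:
    """Return the 'registration' event timestamp from an RDAP response, if any."""
    for event in rdap.get("events", []):
        if event.get("eventAction") == "registration":
            # Older Python versions do not parse a trailing 'Z', so normalise it.
            stamp = event["eventDate"].replace("Z", "+00:00")
            parsed = datetime.fromisoformat(stamp)
            if parsed.tzinfo is None:
                parsed = parsed.replace(tzinfo=timezone.utc)
            return parsed
    return None


def domain_age_days(rdap: Mapping[str, Any],
                    now: Optional[datetime] = None) -> Optional[int]:
    """Whole days since registration, or None when the response has no such event."""
    registered = registration_date(rdap)
    if registered is None:
        return None
    now = now or datetime.now(timezone.utc)
    return (now - registered).days


# Synthetic response fragment; only the fields used here are shown.
sample = {
    "events": [
        {"eventAction": "last changed", "eventDate": "2024-01-20T00:00:00Z"},
        {"eventAction": "registration", "eventDate": "2024-01-01T00:00:00Z"},
    ]
}
print(domain_age_days(sample, now=datetime(2024, 1, 31, tzinfo=timezone.utc)))
```
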
Section 03 · Commercial subscriptions

Paid feeds for engagement-grade depth.

Tier 2 of the JIL economic model brings these in on a per-engagement basis. We do not carry the subscription cost as a fixed overhead; the customer engagement either funds the data path or chooses a public-data-only Tier 1 baseline. Every paid feed below has a public-data fallback or is optional for the verticals that consume it.

Source | Provider | Refresh | Cost band | LOB(s) | Status
Bloomberg Terminal data | Bloomberg | Real-time | | capmarkets, asset-intel | PENDING (engagement-funded)
Refinitiv (LSEG) market reference | LSEG | Real-time | | capmarkets | PENDING (engagement-funded)
Chainalysis KYT / Reactor | Chainalysis | Real-time | | wallet-intel, p2p | PENDING (customer rides their own)
TRM Labs | TRM Labs | Real-time | | wallet-intel, p2p | PENDING (customer rides their own)
ATTOM Property + Address Intelligence | ATTOM Data | Daily | | MCO, pc | WIRED
Etherscan Pro (higher rate limit) | Etherscan | Real-time | | p2p, wallet-intel | WIRED (using free tier today)
Helius RPC + DAS API | Helius | Real-time | | wallet-intel, p2p | WIRED
Plaid (banking data) | Plaid | Real-time | | Money Passport | PENDING
IRS 4506-C IVES | IRS | On-demand per request | | Money Passport | PENDING (IVES participant approval)
Why not subscribe to everything up front? Each commercial feed is a fixed cost that JIL would have to spread across customers regardless of whether their engagement actually exercises that path. Instead we run Tier 1 entirely on public-data feeds, surface findings, and activate the relevant paid feeds at Tier 2 only when the customer engagement specifically calls for them. This keeps the platform's gross-margin profile institutional-grade and aligned with the four-SKU pricing model (no contingency, no per-recovery percentage).
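
The Tier 1 / Tier 2 split can be pictured as a simple selection rule: public feeds are always on for the LOBs they serve, and a paid feed is switched on only at Tier 2 and only if the engagement funds it. The sketch below is an assumption-laden illustration; `Feed`, `CATALOG`, and `active_feeds` are hypothetical names, and the catalog is an abridged subset of the tables on this page.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Feed:
    name: str
    lobs: FrozenSet[str]
    paid: bool


# Abridged, illustrative subset of the catalog.
CATALOG = [
    Feed("OFAC SDN List", frozenset({"p2p", "trade-finance"}), paid=False),
    Feed("Etherscan public API", frozenset({"p2p", "wallet-intel"}), paid=False),
    Feed("SEC EDGAR (filings)", frozenset({"capmarkets", "asset-intel"}), paid=False),
    Feed("Chainalysis KYT / Reactor", frozenset({"wallet-intel", "p2p"}), paid=True),
    Feed("Bloomberg Terminal data", frozenset({"capmarkets", "asset-intel"}), paid=True),
]


def active_feeds(lob: str, tier: int,
                 funded: FrozenSet[str] = frozenset()) -> List[str]:
    """Tier 1 uses public feeds only; Tier 2 adds paid feeds the engagement funds."""
    active = []
    for feed in CATALOG:
        if lob not in feed.lobs:
            continue
        if not feed.paid or (tier >= 2 and feed.name in funded):
            active.append(feed.name)
    return active


print(active_feeds("p2p", tier=1))
print(active_feeds("p2p", tier=2, funded=frozenset({"Chainalysis KYT / Reactor"})))
```
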
Section 04 · Customer-supplied records

Under BAA, GLBA, or comparable basis.

Customer-supplied records never leave the customer's perimeter. Verdict-engine ingestion runs inside the customer's tenant or against a read-only adapter on the customer's side. JIL receives only the signed verdict record and case-file artifacts, not the underlying data.
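
One way to picture "only the signed verdict record leaves the perimeter" is sketched below. The page does not state the signature scheme or the record layout, so everything here is an assumption: HMAC-SHA256 from the standard library stands in for whatever signing JIL actually uses, and the field names are hypothetical. The point is only that the outgoing record carries a digest of the inputs, not the inputs themselves.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, List


def _canonical(obj: Any) -> bytes:
    """Stable JSON encoding so the same content always hashes the same way."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


def build_verdict(records: List[Dict[str, Any]], verdict: str,
                  key: bytes) -> Dict[str, str]:
    """Run inside the customer's tenant; returns the only object that leaves it."""
    body = {
        "verdict": verdict,
        "record_count": str(len(records)),
        "input_sha256": hashlib.sha256(_canonical(records)).hexdigest(),
    }
    body["signature"] = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return body


def verify_verdict(signed: Dict[str, str], key: bytes) -> bool:
    """Recompute the signature over every field except the signature itself."""
    body = {k: v for k, v in signed.items() if k != "signature"}
    expected = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed.get("signature", ""))


# Synthetic record and demo key, for illustration only.
signed = build_verdict([{"id": 1, "amount": 10}], "YES", key=b"demo-key")
print(verify_verdict(signed, key=b"demo-key"))
```
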

Settlement records

capmarkets · Trade records, SWIFT 5xx messages, FIX, ISO 20022 sese. Custodian / broker / fund-admin sources. Real-time stream when paid engagement is active.

Position files

capmarkets · Daily position records from each system that should agree (custodian, broker, fund-admin, CSD). Cross-system reconciliation runs against this set.

Bank wire records

all Pre-Settlement · Outbound wire instructions intercepted before release. Sub-2-second YES / NO / REVIEW gate.

MCO claim records

MCO · Provider claim files, encounter records, prior-authorization decisions. PHI; under BAA. Tier 2 claim integrity work.

H-1B beneficiary documents

h1b · Sponsor-supplied labor condition files, payroll attestations. Optional Tier 2 deepening.

Workers' comp claims

wc · Carrier-supplied claim event records, medical bills, employer records. Tier 2 only.

P&C claim files

pc · Carrier-supplied claim event records, repair estimates, photos, telematics. Tier 2 only.

Trade finance documents

trade-finance · Letters of credit, bills of lading, customs declarations. Bank-supplied under BAA-equivalent for cross-border ops.
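
The bank wire entry in this section describes a sub-2-second YES / NO / REVIEW gate. A minimal sketch of such a time-boxed gate follows, under stated assumptions: the check function, the blocked-name list, and the rule that a timeout or a crashed check falls back to REVIEW are all hypothetical choices for illustration, not the documented JIL behaviour.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Dict, Sequence

Wire = Dict[str, str]
Check = Callable[[Wire], bool]  # returns True when the wire passes the check


def gate(wire: Wire, checks: Sequence[Check], budget_seconds: float = 2.0) -> str:
    """Run all checks in parallel against a shared deadline."""
    if not checks:
        return "REVIEW"  # nothing verified, so do not auto-approve
    pool = ThreadPoolExecutor(max_workers=len(checks))
    deadline = time.monotonic() + budget_seconds
    try:
        futures = [pool.submit(check, wire) for check in checks]
        for future in futures:
            remaining = max(0.0, deadline - time.monotonic())
            try:
                if not future.result(timeout=remaining):
                    return "NO"  # a check explicitly failed
            except FutureTimeout:
                return "REVIEW"  # budget exhausted before a verdict
            except Exception:
                return "REVIEW"  # a check crashed; escalate to a human
        return "YES"
    finally:
        # Do not block the caller on checks that are still running.
        pool.shutdown(wait=False)


BLOCKED = {"ACME SHELL LTD"}  # synthetic stand-in for a sanctions hit list


def beneficiary_not_blocked(wire: Wire) -> bool:
    return wire["beneficiary"].upper() not in BLOCKED


print(gate({"beneficiary": "Example Supplier Inc"}, [beneficiary_not_blocked]))
```

Defaulting to REVIEW on timeout keeps the gate from silently approving a wire just because a lookup was slow.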

Section 05 · Refresh cadence summary

How fresh the verdict is, by source class.

Real-time / block-level

Etherscan, EDGAR, OFAC SDN delta, USAspending API, OpenSanctions, OpenCorporates API, GLEIF, RDAP. Latency from publication to JIL findings: seconds to minutes.

Daily

OFAC SDN full refresh, HMT UK, EU, NPPES delta, SAM.gov exclusions, ATTOM. Standard cron pulls.

Weekly

NPPES bulk, OIG LEIE delta, sanctions consolidation. Tuesday-night cron.

Monthly

SEC fails-to-deliver register (a/b half-files), OIG LEIE full, MAC jurisdiction. Calendar-month rollover.

Quarterly

DOL OFLC LCA disclosure, USCIS Data Hub processing-times, CMS POS file, PECOS, FFIEC bank financials. Calendar-quarter rollover.

Annual

NHTSA FARS, BLS SOII, CMS Inpatient / Outpatient / DMEPOS / Part D / Hospice / SNF, CERT detector library. Calendar-year rollover; lag of 6-18 months from end of year.
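
The cadence classes above lend themselves to a simple staleness check: compare the time since the last successful ingest with the longest gap the cadence allows. The sketch below is illustrative only; the `MAX_AGE` values and the six-hour grace period are assumptions, not JIL's configured thresholds.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Longest expected gap between successful ingests, per cadence class (assumed values).
MAX_AGE = {
    "daily": timedelta(days=1),
    "weekly": timedelta(days=7),
    "monthly": timedelta(days=31),
    "quarterly": timedelta(days=92),
    "annual": timedelta(days=366),
}
GRACE = timedelta(hours=6)  # hypothetical slack for late publication


def is_stale(cadence: str, last_ingest: datetime,
             now: Optional[datetime] = None) -> bool:
    """True when the last ingest is older than the cadence allows.

    Both datetimes must be timezone-aware; an unknown cadence raises KeyError.
    """
    now = now or datetime.now(timezone.utc)
    return now - last_ingest > MAX_AGE[cadence] + GRACE


now = datetime(2025, 1, 10, tzinfo=timezone.utc)
print(is_stale("daily", now - timedelta(days=2), now))   # two days old, daily feed
print(is_stale("weekly", now - timedelta(days=2), now))  # two days old, weekly feed
```
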

Section 06 · Per-LOB source breakdown

Which sources each LOB consumes.

capmarkets LIVE

SEC FTD, EDGAR, FFIEC, FINRA, CFTC, GLEIF + customer settlement records. Optional Tier 2: Bloomberg / Refinitiv.

grants LIVE

USAspending.gov, SAM.gov exclusions, OFAC SDN, GLEIF, OpenCorporates + customer-supplied awardee records.

h1b LIVE

DOL OFLC LCA, USCIS, OFAC, GLEIF, OpenCorporates + sponsor-supplied wage records.

eb5 LIVE

USCIS Data Hub, USCIS regional centers, SEC EDGAR, OFAC, OpenCorporates + investor source-of-funds documentation.

p2p LIVE

Etherscan, OFAC SDN crypto-address attribution, OpenSanctions + customer transaction records. Optional: Chainalysis / TRM.

trade-finance LIVE

UN Comtrade, OFAC, GLEIF + bank-supplied trade documents.

pc PENDING

NHTSA FARS + carrier-supplied claim records. Optional ATTOM for premise.

wc PENDING

BLS SOII, NPPES (medical providers) + carrier-supplied claim records.

MCO · Medicare / Medicaid LIVE

Full CMS stack (Inpatient, Outpatient, DMEPOS, Part D, POS, NPPES, OIG LEIE, MAC, CERT) + customer claim records under BAA.

Section 07 · Replay and audit

Every CREB carries the source manifest.

Each CREB-anchored finding embeds a reproducibility manifest that lists the exact source-file hash, ingest timestamp, code version, and signal threshold used. A regulator, auditor, or counterparty can replay the analysis bit-identically using the same federal source file plus the manifest. The data-source pages above are indexed by the same manifest fields.
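
A replay pre-check against such a manifest might look like the sketch below. The four fields mirror the ones named above (source-file hash, ingest timestamp, code version, signal threshold), but the field names, the SHA-256 choice, and the `replay_problems` helper are assumptions; see Sample CREB for the actual manifest format.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Manifest:
    """Hypothetical shape; field names are illustrative."""
    source_sha256: str
    ingest_timestamp: str
    code_version: str
    signal_threshold: float


def replay_problems(manifest: Manifest, source_bytes: bytes,
                    running_code_version: str) -> List[str]:
    """List the reasons a replay would not be bit-identical; empty means ready."""
    problems = []
    if hashlib.sha256(source_bytes).hexdigest() != manifest.source_sha256:
        problems.append("source file hash does not match manifest")
    if running_code_version != manifest.code_version:
        problems.append("code version differs from manifest")
    return problems


# Synthetic source bytes and manifest values, for illustration only.
source = b"provider_id|amount\n1|100\n"
manifest = Manifest(
    source_sha256=hashlib.sha256(source).hexdigest(),
    ingest_timestamp="2025-01-10T00:00:00+00:00",
    code_version="1.4.2",
    signal_threshold=0.95,
)
print(replay_problems(manifest, source, "1.4.2"))
print(replay_problems(manifest, source + b"tampered", "1.4.3"))
```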

Cross-references. See CMS Data Source Map for the deep CMS-specific federal stack. See Attestation Checks (148) for the per-check data-source dependency. See Sample CREB for the manifest format.