# Compliance software for MGAs: a guide to verified accuracy
How MGAs should evaluate compliance software in 2026. The difference between a sanctions screen, a verified-accuracy compliance layer, and a regulator-grade audit pack — and what to demand from any vendor.
## TL;DR
"Compliance software" sold to MGAs in 2026 spans three completely different categories — sanctions screening, appetite-policy enforcement, and audit-grade decision tracing. Most vendors do one. The few that do all three are the ones a state-DOI inquiry won't blow up.
## The three jobs MGA compliance software actually does
### Job 1 — Sanctions / KYC screening
Match the insured (and broker, and UBO) against OFAC SDN, EU consolidated, UK HMT, UN, and PEP lists. Flag matches above a configurable score threshold. Return a verifiable record of the screen.
Verified accuracy here means:
- The exact list version and date the screen was run against
- The score for each near-match
- A linkable record on the source registry (e.g. the OpenSanctions entity URL)
If your vendor returns "no match" without showing you the list version, the score, and the source — you don't have verified accuracy. You have a green checkmark.
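To make that concrete, here is a minimal sketch of the shape a verifiable screen record could take. The entity and field names (`list_version`, `source_url`, the sample insured) are illustrative assumptions, not any vendor's actual schema:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Match:
    """One scored near-match, linkable back to the source registry."""
    entity_name: str
    score: float       # match score from the screening engine
    source_url: str    # e.g. an OpenSanctions entity URL

@dataclass(frozen=True)
class ScreenResult:
    """One sanctions screen, carrying everything needed to verify it later."""
    subject_name: str        # insured / broker / UBO as screened
    list_version: str        # exact dataset release the screen ran against
    list_date: str           # publication date of that list version
    matches: list[Match]     # an empty list is still a verifiable "no match"
    screened_at: datetime    # when the screen actually ran (UTC)

# A "no match" that can survive an audit: it names the list version
# and timestamp it was checked against, not just a green checkmark.
result = ScreenResult(
    subject_name="Acme Marine Ltd",      # hypothetical insured
    list_version="default-20260115",     # illustrative version label
    list_date="2026-01-15",
    matches=[],
    screened_at=datetime.now(timezone.utc),
)
```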
### Job 2 — Appetite / regulatory rule enforcement
Apply the MGA's underwriting authority and any state-by-state regulatory constraints. Flag when a submission falls outside delegated authority, requires senior referral, or hits a regulatory carve-out (admissibility, surplus lines tax, state DOI filings).
Verified accuracy here means:
- The rule that fired is named explicitly
- The rule's source (the underwriting agreement, state regulation, internal appetite memo)
- The version of the rule at the moment the decision was made
- An auditable record of any underwriter override, with notes
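A minimal sketch of what a rule hit and an override could look like as records. The identifiers and field names below are illustrative, not a real rule engine's API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RuleHit:
    """A rule that fired on a submission, pinned to its source and version."""
    rule_id: str        # e.g. "appetite/coastal-tiv-cap" (illustrative)
    rule_version: str   # the version in force when the decision was made
    source: str         # underwriting agreement, state regulation, appetite memo
    outcome: str        # "decline" | "refer" | "carve-out"

@dataclass(frozen=True)
class Override:
    """An underwriter overriding a rule hit, with an auditable rationale."""
    underwriter: str
    rationale: str      # free-text notes, required rather than optional
    overridden: RuleHit
```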
### Job 3 — Audit-grade decision tracing
For every bind / decline / refer / query, the MGA must be able to answer a regulator's "show me the reasoning" question with an exportable record. This isn't a log file — it's a structured pack: prompt versions, model outputs, citations, agent ratings, underwriter notes, timestamps.
Verified accuracy here means:
- The decision is reproducible from the inputs alone
- Every external data lookup (flood zone, sanctions, firmographics) has a provenance entry
- Every LLM call has a model name, prompt hash, and token count stored immutably
- Every underwriter override is captured
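One way to picture a single LLM trace entry, assuming a simple append-only store; every name here is illustrative, not a known library's interface:

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class LLMTraceEntry:
    """One LLM call in a decision trace, stored immutably."""
    model: str           # exact model identifier used for this call
    prompt_version: str  # which prompt template version was rendered
    prompt_hash: str     # sha256 of the full rendered prompt
    output: str          # the model's response, verbatim
    input_tokens: int
    output_tokens: int
    timestamp: str       # ISO-8601, UTC

def hash_prompt(rendered_prompt: str) -> str:
    # Hashing the exact rendered prompt makes "which prompt produced this
    # output?" answerable later without storing a mutable reference.
    return hashlib.sha256(rendered_prompt.encode("utf-8")).hexdigest()
```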
Why "verified accuracy" is the new bar
State DOIs and Lloyd's coverholder oversight have both raised the bar in 2025–26. The expectation now is:
- Every decision is explainable without the underwriter being in the room
- Every external claim ("the flood zone is X") is traceable to a source
- Every model output is versioned so an old decision can be re-played against the exact prompt version that produced it (a minimal replay check is sketched below)
- Every override is logged with a rationale
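A replay check, in sketch form, could look like the following. `rule_engine.load_version` is a hypothetical interface used only to show the shape of the check, not any real library's API:

```python
def replay_decision(stored: dict, rule_engine) -> bool:
    """Re-run a past decision from its stored inputs and compare outcomes.

    Deterministic steps (rule evaluation, scoring) must reproduce exactly.
    LLM steps are checked against the stored output via the prompt hash
    rather than re-generated, since sampling is not reproducible.
    `rule_engine.load_version` is a hypothetical API for illustration.
    """
    # Load the rule versions in force at decision time, not today's rules.
    rules = rule_engine.load_version(stored["rule_version"])
    replayed = rules.evaluate(stored["inputs"])
    return replayed.outcome == stored["outcome"]
```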
If a vendor demos compliance as "we screen the insured against OFAC" — they're solving 1 of the 3 jobs. Don't sign yet.
## How to evaluate compliance vendors
A 30-minute demo question list:
1. Show me a real screen result. Not a marketing slide. A real result with score, list version, source URL, and timestamp.
2. Walk me through the audit pack for a specific bound risk. Show me the prompt versions, model outputs, citations, override history.
3. What happens when an external API (OpenSanctions, FEMA NFHL) fails? A real platform has a documented failure mode. A toy platform crashes silently.
4. Show me how a rule version is captured. Add a new appetite rule live in the demo. Show me the diff. Show me where it's stored. Show me how an old decision references the old version. (See the sketch below.)
5. Export an audit pack to PDF. If they can't, they can't answer a state-DOI inquiry either.
If a vendor flinches at #4 or #5, they don't have verified accuracy.
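For question #4, the mechanics are simple enough to sketch with Python's standard `difflib`. The rule contents and version labels below are made up for illustration:

```python
import difflib

# Two stored versions of the same appetite rule (illustrative contents).
rule_v1 = "max_tiv: 5_000_000\ncoastal_counties: refer\n"
rule_v2 = "max_tiv: 7_500_000\ncoastal_counties: refer\nflood_zone_v: decline\n"

diff = difflib.unified_diff(
    rule_v1.splitlines(keepends=True),
    rule_v2.splitlines(keepends=True),
    fromfile="appetite/coastal@v1",
    tofile="appetite/coastal@v2",
)
print("".join(diff))

# A decision made under v1 should reference "appetite/coastal@v1" in its
# audit trail, so this diff explains exactly what changed since then.
```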
## The most common failure modes
Failure 1 — LLM-only sanctions check. The vendor asks the LLM "is this entity sanctioned?" The LLM hallucinates "no" and the audit trail is a chat log. This will not survive an inquiry.
Failure 2 — Snapshot-only audit. The vendor stores "the answer" but not the inputs, prompt version, or external-data state at the moment of decision. Regulator asks for re-play; you can't.
Failure 3 — Hardcoded rule library. The vendor ships a fixed appetite library that the MGA can't edit. Six months in, the rules drift from your actual underwriting authority. Audit fails.
Failure 4 — No PII redaction. Insured names and broker emails leak into model traces. Disclosable. Embarrassing.
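Redact-by-default is cheap to illustrate, though real systems need more than regexes (entity-aware NER, at minimum). A toy sketch with made-up names:

```python
import re

# Regexes alone are not real redaction; this only illustrates the
# principle of redacting a trace before it is persisted.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact_trace(text: str, insured_names: list[str]) -> str:
    text = EMAIL.sub("[REDACTED_EMAIL]", text)
    for name in insured_names:
        text = text.replace(name, "[REDACTED_INSURED]")
    return text

trace = "Flood query for Acme Marine Ltd, broker jane@example.com"
print(redact_trace(trace, ["Acme Marine Ltd"]))
# -> Flood query for [REDACTED_INSURED], broker [REDACTED_EMAIL]
```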
## What a verified-accuracy compliance stack looks like
Concretely:
- Sanctions: OpenSanctions consolidated dataset (OFAC SDN, EU, UN, UK HMT, PEPs), SEC EDGAR for public-filer corroboration, Wikidata for adverse-media seed. Scored matches. Linkable provenance.
- Appetite enforcement: Per-MGA editable rule library. Rules versioned, diff-able, rolled back. Override audit trail per submission.
- Decision tracing: Per-step LLM prompt + output + token usage stored. External-data lookups timestamped + provenance-tagged (see the sketch after this list). Underwriter overrides captured with rationale. Exportable to PDF / regulator format.
- PII handling: Insured names, broker emails, financial account numbers redacted in agent traces by default. Optional un-redacted view behind RBAC.
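The provenance-tagged lookups above can be pictured as a thin wrapper around any external fetch. This is a sketch of the idea, not any platform's actual API:

```python
from datetime import datetime, timezone

def with_provenance(source: str, url: str, fetch):
    """Wrap an external lookup so its result always carries provenance.

    `fetch` is any zero-argument callable returning the raw value (a flood
    zone, a match score, a firmographic record). Hypothetical shape only.
    """
    return {
        "value": fetch(),
        "source": source,   # e.g. "FEMA NFHL"
        "url": url,         # the exact endpoint or record queried
        "fetched_at": datetime.now(timezone.utc).isoformat(),
    }

# Example: a flood-zone answer that can be traced later.
record = with_provenance("FEMA NFHL", "https://hazards.fema.gov/...", lambda: "AE")
```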
## How Vortic ships this
- Live OpenSanctions screen on every submission (Integrated Data tab)
- Live SEC EDGAR + Wikidata cross-checks for the insured
- Per-agent prompt versioning + diff
- Per-step audit trace stored in `agent_traces` (immutable)
- Provenance footer on every external data lookup (FEMA NFHL, NOAA, USGS, etc.)
- Override capture with notes
- Decision PDF export on every bind/decline/refer/query
This isn't optional in our stack — it's the default.
## Closing thought
If "compliance software" doesn't give you verified accuracy across all three jobs — sanctions, appetite, audit tracing — it's a marketing label, not a compliance stack. Build your evaluation around the five demo questions above. Walk if a vendor can't answer them.