Chaindrain MVP

Methodology

A transparent, no-ML risk model. This page documents how each tier is computed, what we collect, what we detect, and how that turns into alerts you can act on.

1. How risk tiers are calculated

Every entity is scored once per day and bucketed into one of four tiers. Bands are calibrated against the live distribution across all tracked companies.

Critical · ≥ 0.65 · 59 entities

High-TVL protocols with mutable code, weak audit coverage, or limited bug-bounty programs. The daily agent watches these closely; every dependency degradation produces a fan-out alert.

High · 0.50 – 0.65 · 69 entities

Material exposure but at least one offsetting control (mature audits, sizeable bounty, or immutable core). Worth monitoring; alerts fire when paired with a high blast radius.

Medium · 0.30 – 0.50 · 705 entities

The long tail — lower TVL or strong defensive posture. Alerts only fire when a signal directly references one of these entities or a shared dependency.

Low · < 0.30 · 42 entities

Immutable, well-audited, low-TVL, or well-bountied. Background watch — no proactive alerts.
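
The four bands above can be sketched as a simple threshold lookup. This is a minimal illustration, not the production code: the function name `risk_tier` is hypothetical, and treating each lower bound as inclusive (so exactly 0.50 is High and exactly 0.30 is Medium) is an assumption, since the ranges above do not say which side owns the boundary.

```python
def risk_tier(score: float) -> str:
    """Bucket a 0-1 risk score into critical / high / medium / low."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"risk score must be in [0, 1], got {score}")
    # Assumption: lower bounds are inclusive at every band edge.
    if score >= 0.65:
        return "critical"
    if score >= 0.50:
        return "high"
    if score >= 0.30:
        return "medium"
    return "low"
```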

2. The risk-score formula

Weighted linear composite. Weights are published, not learned — they were chosen as defensible priors and will be backtested once the Incident Ledger lands.

risk_score = 0.4 · tvl + 0.3 · mutability + 0.2 · audit + 0.1 · bounty
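
As a rough sketch, the composite can be written as a weighted sum over four factors that have each already been scaled to 0–1. The weights come from the formula above; the function name, the dictionary layout, and the example inputs are illustrative assumptions.

```python
from typing import Mapping

# Published weights from the formula above.
WEIGHTS = {"tvl": 0.4, "mutability": 0.3, "audit": 0.2, "bounty": 0.1}


def risk_score(factors: Mapping[str, float]) -> float:
    """Weighted linear composite of four factors, each already scaled to 0-1."""
    for name in WEIGHTS:
        value = factors[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} factor must be in [0, 1], got {value}")
    return sum(weight * factors[name] for name, weight in WEIGHTS.items())


# Illustrative inputs only, not a real entity.
example = {"tvl": 0.8, "mutability": 1.0, "audit": 0.6, "bounty": 0.5}
print(round(risk_score(example), 2))  # 0.32 + 0.30 + 0.12 + 0.05 = 0.79
```

Because the weights sum to 1.0 and every factor is bounded by 0 and 1, the composite stays in the 0–1 range that the tier bands expect.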

TVL factor

weight 0.4

log10(tvl_usd) normalized to 0–1

Bigger pool = bigger target. Empirically the dominant variable in historical exploit selection.
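
One way to read "log10(tvl_usd) normalized to 0–1" is min-max scaling of the logarithm. The sketch below takes that reading. The bounds (10^4 and 10^11 USD) are placeholder assumptions chosen only to make the example run; this page does not publish the actual normalisation range.

```python
import math

# Assumed normalisation bounds; not published on this page.
LOG_MIN = 4.0   # log10 of $10k
LOG_MAX = 11.0  # log10 of $100B


def tvl_factor(tvl_usd: float, log_min: float = LOG_MIN, log_max: float = LOG_MAX) -> float:
    """Min-max scale log10(tvl_usd) into [0, 1], clamping values outside the bounds."""
    if tvl_usd <= 0:
        return 0.0  # no TVL (or unknown) contributes nothing
    scaled = (math.log10(tvl_usd) - log_min) / (log_max - log_min)
    return min(1.0, max(0.0, scaled))
```

The log scale means each tenfold increase in TVL adds the same amount to the factor, so a $1B protocol is not scored 1,000 times higher than a $1M one.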

Mutability factor

weight 0.3

1.0 if proxy + EOA admin · 0.7 if proxy + contract admin · 0.3 if proxy + multisig · 0.0 if immutable

Mutable code means a larger attack surface (admin key compromise) and a faster path to a silent, undetected change.
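
The mutability mapping is a plain lookup, sketched below. The scores are the published ones; the string values used for `proxy_pattern` and `admin_type` are assumed encodings, and the choice to raise on an admin type with no published score is a design assumption rather than documented behaviour.

```python
# Scores published in the mutability factor definition above.
ADMIN_SCORES = {"eoa": 1.0, "contract": 0.7, "multisig": 0.3}


def mutability_factor(proxy_pattern: str, admin_type: str = "") -> float:
    """Return 0.0 for immutable code, otherwise the score for the proxy's admin type."""
    if proxy_pattern.lower() == "immutable":
        return 0.0
    try:
        return ADMIN_SCORES[admin_type.lower()]
    except KeyError:
        # Assumption: fail loudly rather than guess a score for unlisted admin types.
        raise ValueError(f"no published score for admin type {admin_type!r}") from None
```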

Audit factor

weight 0.2

Inverse of DefiLlama audits_tier (0–5); 1.0 if tier 0/unknown, 0.0 if tier 5

Strongly correlated with exploit rate in historical data. No tier ≈ no public assurance.
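
The two endpoints of the audit factor are published (1.0 at tier 0 or unknown, 0.0 at tier 5). The sketch below fills in the tiers between them with straight-line interpolation, which is an assumption; the function name is also hypothetical.

```python
from typing import Optional

MAX_TIER = 5  # top of the 0-5 audits_tier scale


def audit_factor(audits_tier: Optional[int]) -> float:
    """Invert the audit tier: 1.0 for tier 0 or unknown, 0.0 for tier 5."""
    if audits_tier is None:
        return 1.0  # unknown is treated the same as tier 0
    if not 0 <= audits_tier <= MAX_TIER:
        raise ValueError(f"audits_tier must be 0-{MAX_TIER}, got {audits_tier}")
    # Assumption: intermediate tiers fall on a straight line between the endpoints.
    return 1.0 - audits_tier / MAX_TIER
```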

Bounty factor

weight 0.1

Inverse log of bug_bounty_max_payout_usd

Protocols with no/small bounties get fewer whitehat reports → more zero-days reach prod.
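
"Inverse log" is not spelled out further on this page, so the following is only one plausible reading: take log10 of the maximum payout, scale it against an assumed cap, and subtract from 1. The $10M cap and the function name are placeholders, not published parameters.

```python
import math

# Assumed payout at which the factor bottoms out at 0.0; not published on this page.
CAP_USD = 10_000_000


def bounty_factor(max_payout_usd: float, cap_usd: float = CAP_USD) -> float:
    """Higher bounty payouts give a lower factor, on a log10 scale."""
    if max_payout_usd <= 1:
        return 1.0  # no bounty (or a token one) gives the maximum factor
    scaled = math.log10(max_payout_usd) / math.log10(cap_usd)
    return max(0.0, 1.0 - scaled)
```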

3. What we collect

Four tables, one row per company, joined on entity_id. Every field has an explicit reason to exist — anything that could not justify the maintenance cost was deferred.

Identity

Who the entity is and where it lives.

name: Canonical brand label, deduplicated across sub-products.
sector: Primary business category; drives default coverage tier.
chain_deployments: Cross-chain footprint; feeds chain-wide contagion alerts.
tvl_usd: Direct input to the risk score; first-order indicator of exploit attractiveness.
defillama_slug: Join key for live TVL polling (24h delta drives the tvl_drop signal).
launch_date: Age proxy; newer protocols carry higher unknown-unknown risk.

Contract Fingerprint

How the code can change and who can change it.

proxy_pattern: Transparent / UUPS / immutable; feeds the mutability factor.
upgrade_authority_type: EOA vs multisig vs DAO; defines the silent-upgrade attack surface.
admin_address: Watched directly via Etherscan for live admin transactions.
audits_tier: DefiLlama 0–5 audit grade; feeds the audit factor.
audit_firms: Reputation signal beyond raw count; reused for similar-exposure clustering.
bug_bounty_max_payout_usd: Quantitative bounty program strength; feeds the bounty factor.

Dependency Fingerprint

What this entity is exposed to — the contagion vectors.

oracle_providers: Largest historical DeFi exploit class. Single-feed staleness fans out instantly.
bridge_dependencies: $2.8B+ stolen via bridges 2022–2024; mapped per top entity.
stablecoin_dependencies: USDC March 2023 / UST May 2022; a depeg propagates through this graph.
dvn_configuration: LayerZero V2 trust assumption made explicit; surfaces 1-to-1 DVN risk.

Tier State (computed)

Outputs of the scoring leg, refreshed daily.

risk_score: 0–1 weighted composite. The number that ranks the dashboard.
risk_tier: Bucketed risk_score (critical / high / medium / low).
coverage_tier: How aggressively we watch (core / monitored / archive / excluded).
blast_radius_usd: Estimated downstream exposure; the dollar-weighted fan-out impact.
state: Live operational status (active / degraded / paused / exploited / wound_down).

4. What we detect

Five live signal pollers run on a regular cadence. Each emits an alert with a severity and a fan-out — the list of every other entity that depends on the same key.

Signal	Severity	Source	Thresholds	Why it matters
Stablecoin depeg	critical	CoinGecko /simple/price (USDC, USDT, DAI, FDUSD, USDS, USDe, USD0)	±0.5% deviation = high · ±2% = critical	Depeg events propagate through every protocol holding the affected reserve.
Oracle deviation	high	Chainlink + Pyth ETH/BTC/LINK feeds vs. CoinGecko reference	1% = medium · 5% = high	Stale or manipulated price feeds caused Mango ($117M), Inverse ($15.6M), bZx ($55M).
Bridge pause	critical	LayerZero V2 paused() · Wormhole guardian heartbeat · Axelar maintainers	LayerZero paused = critical · Wormhole guardians <13 = critical · Axelar maintainers <3 = critical	Bridges are the single largest historical theft vector ($2.8B+ since 2022).
Admin transaction	high	Etherscan txlist on the top 100 admin addresses by risk_score	EOA / multisig admin tx in last 5 min = high · contract admin = medium	Surfaces silent code-swap upgrades and ownership transfers in near-real time.
TVL drop	critical	DefiLlama /protocols change_1d	−20% 24h = high · −40% = critical	First public sign of an exploit, run, or governance failure for any protocol with a slug.
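
To make the threshold column concrete, here is a hedged sketch of two of the classifiers, stablecoin depeg and TVL drop. The function names are hypothetical. Treating thresholds as inclusive, and treating `change_1d` as a percentage (so −25.0 means a 25% drop), are both assumptions.

```python
from typing import Optional


def depeg_severity(price_usd: float, peg: float = 1.0) -> Optional[str]:
    """Classify a stablecoin price by its relative deviation from the peg."""
    deviation = abs(price_usd - peg) / peg
    if deviation >= 0.02:   # ±2% or more
        return "critical"
    if deviation >= 0.005:  # ±0.5% or more
        return "high"
    return None             # within tolerance: no alert


def tvl_drop_severity(change_1d_pct: float) -> Optional[str]:
    """Classify a 24h TVL change expressed in percent (negative means a drop)."""
    if change_1d_pct <= -40.0:
        return "critical"
    if change_1d_pct <= -20.0:
        return "high"
    return None
```

Returning `None` below the lowest threshold keeps quiet windows silent, which matches the no-noise behaviour of the daily digest.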

5. How this helps you

Prioritized watchlist

775 deduplicated companies ranked by a transparent risk score so the daily agent watches 50 closely instead of 875 evenly.

Instant fan-out

When a dependency degrades — oracle, bridge, stablecoin, admin, or TVL — Chaindrain shows every entity exposed to that exact dependency, ordered by blast radius in USD.
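
The fan-out amounts to a filter and a sort, sketched below under simplifying assumptions: the `Entity` class and its fields are an illustrative stand-in for the joined tables, and every dependency is flattened into one set of string keys.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Entity:
    name: str
    dependencies: FrozenSet[str]  # e.g. oracle, bridge, and stablecoin keys
    blast_radius_usd: float


def fan_out(entities: Iterable[Entity], dependency_key: str) -> List[Entity]:
    """Return every entity exposed to the dependency, largest blast radius first."""
    exposed = [e for e in entities if dependency_key in e.dependencies]
    return sorted(exposed, key=lambda e: e.blast_radius_usd, reverse=True)
```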

Daily digest at 09:00 UTC

Every alert from the previous 24 hours, bucketed by severity, with the top affected entities inline. Empty windows are silent — no noise.

6. Exposure Graph & Similarity Engine

The Exposure Graph tab joins three layers — Layer 1 enrichment (identity / contract / dependency / governance / reputation), Layer 2 incident ledger, and Layer 3 similarity — into one per-entity view with three explainable panels: Threat History, Peer Incidents (Method B), and Dependency Twins (Method A + C ensemble).

Method A · Weighted Jaccard

weight 0.30

Compares 10 attribute sets — audit firms, oracle providers, bridge dependencies, stablecoin dependencies, chain deployments, LST/LRT deps, subsector tags, DVN config, KMS provider, frontend host. Best at “same shape of stack.”
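
The exact weighting scheme behind Method A is not given on this page. The sketch below uses one plausible reading: a weighted mean of per-attribute Jaccard indices. The per-attribute weights are caller-supplied placeholders, and skipping attributes that are empty on both sides is an assumption.

```python
from typing import Iterable, Mapping


def jaccard(a: set, b: set) -> float:
    """Plain Jaccard index: |a ∩ b| / |a ∪ b|, defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weighted_jaccard(
    src: Mapping[str, Iterable[str]],
    dst: Mapping[str, Iterable[str]],
    weights: Mapping[str, float],
) -> float:
    """Weighted mean of per-attribute Jaccard indices over the attributes in `weights`."""
    total = 0.0
    weight_sum = 0.0
    for attr, weight in weights.items():
        a, b = set(src.get(attr, ())), set(dst.get(attr, ()))
        if not a and not b:
            continue  # assumption: attributes missing on both sides carry no signal
        total += weight * jaccard(a, b)
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0
```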

Method B · Incident overlap

weight 0.40, normalised as min(1, count / 5)

Counts shared vulnerability-class incidents derived from the 24 root-cause predicates. Best at “has been bitten by the same kind of attack.” This is the only method that fires on real history rather than configuration shape.
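
A minimal sketch of the Method B normalisation, assuming the count is the number of root-cause classes that appear in both entities' incident histories. The function name and the set-based input are illustrative.

```python
from typing import Iterable


def incident_overlap(src_classes: Iterable[str], dst_classes: Iterable[str], cap: int = 5) -> float:
    """Normalise the number of shared root-cause classes to [0, 1] via min(1, count / cap)."""
    shared = len(set(src_classes) & set(dst_classes))
    return min(1.0, shared / cap)
```

The cap means a pair with five or more shared classes already scores the maximum, so a long incident history cannot dominate the ensemble.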

Method C · Deterministic embedding cosine

weight 0.30

64-dimensional pseudo-embedding built from concatenated SHA-256 hashes of the same attribute bag. Stands in for a real embedding model until Phase 3c swaps it for OpenAI text-embedding-3-small. Best at catching semantic neighbours that share no exact tokens (e.g. RWA-credit ↔ tokenised treasury).
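
The construction of the 64 dimensions is not detailed here, so the following is an assumed scheme rather than the engine's own: each token is hashed twice with SHA-256 (two salted digests concatenated into 64 bytes), each byte is mapped to roughly [-1, 1], and the token vectors are summed. All function names are hypothetical.

```python
import hashlib
import math
from typing import Iterable, List

DIM = 64  # two 32-byte SHA-256 digests concatenated


def token_vector(token: str) -> List[float]:
    """Deterministically map one token to 64 floats in [-1, 1]."""
    digest = b"".join(
        hashlib.sha256(f"{salt}:{token}".encode("utf-8")).digest() for salt in range(2)
    )
    return [(byte - 127.5) / 127.5 for byte in digest]


def embed(tokens: Iterable[str]) -> List[float]:
    """Sum the token vectors of a de-duplicated attribute bag."""
    vec = [0.0] * DIM
    for token in sorted(set(tokens)):  # sorted so the result ignores input order
        for i, x in enumerate(token_vector(token)):
            vec[i] += x
    return vec


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

Under this scheme two bags score high mainly when they share exact tokens, which is consistent with the vectors carrying no learned semantics.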

Ensemble & worked example

For every source entity the engine scores 771 candidate targets, keeps the top 25 by ensemble = 0.30 · A + 0.40 · min(1, B/5) + 0.30 · C, and writes them to chaindrain.similarity_pair. Today the universe is 772 entities × 25 twins = 19,300 rows; compute finishes in < 4s on a single Node process.
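
The ensemble and top-25 selection can be sketched as follows. The weights and the min(1, B/5) cap come from the formula above; the function names and the dictionary of candidate scores are illustrative assumptions, and the write to chaindrain.similarity_pair is left out.

```python
import heapq
from typing import Dict, List, Tuple


def ensemble(a: float, b_count: int, c: float) -> float:
    """0.30 * A + 0.40 * min(1, B / 5) + 0.30 * C, as given above."""
    return 0.30 * a + 0.40 * min(1.0, b_count / 5) + 0.30 * c


def top_twins(scores: Dict[str, float], k: int = 25) -> List[Tuple[str, float]]:
    """Keep the k highest-scoring candidate targets, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Illustrative pair: A = 0.5, three shared incident classes, C = 0.8.
print(round(ensemble(0.5, 3, 0.8), 2))  # 0.15 + 0.24 + 0.24 = 0.63
```

`heapq.nlargest` avoids sorting all 771 candidates when only 25 are kept, although at this scale a full sort would also be fast.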

RealT (RWA / tokenised real-estate) currently lists BlackRock BUIDL, Kelp DAO, Backed Finance, and Lift Dollar (USDL) in its top 5, all peer real-world-asset issuers with shared credit / real_estate subsector tags, a Chainlink oracle, and stablecoin-backed redemption rails. Method B contributes 2–3 shared incident root causes per pair (private_key_leak, counterparty_default, rounding_precision).

What is synthetic today

Incident ledger — 356 deterministically generated incidents conditioned on the 24 root-cause predicates. Phase 2a replaces this with DefiLlama Hacks / Rekt / SlowMist ingestion.
Layer 1 enrichment at confidence DEMO or INFERRED — every field that shows a Demo / Inferred pill is synthetic. Phase 1b and Phase 2b backfill real custodian / KMS provider / RPC / frontend host / governance / GitHub data.
Method C embeddings — the 64-dim SHA-256 vectors are deterministic but carry no learned semantics. Phase 3c swaps them for OpenAI text-embedding-3-small.

See ~/Downloads/chaindrain_exposure_graph_scope.md in the repository for the full Phase 6 scope and the linked roadmap phases 1b / 2a / 2b / 3a / 3b / 3c.