Methodology · v0.5 · May 2026 · CISM-led

How each exposure index is actually computed.

Five focused indices — CES, IMI, RPS, BLV, BIR — each with a published formula, an enumerated source whitelist, weighted inputs, confidence intervals, and a change log, rolled into one headline composite: the CXI. This page is the contractual definition. The internal collector implementation lives behind it; the math does not.

01·Principles

What the methodology commits to before any individual index.

Quantification before collection

We instrument fewer sources than generalist threat-intel vendors. Every collected signal must feed a published index. Volume is not the product.

Comparable across clients

Indices are normalized so peer benchmarks are meaningful at N ≥ 5. Two firms with the same CES represent comparable credential exposure, controlled for workforce size and sector.

Confidence bands, always

Every reported value ships with an 80% confidence interval. We name the model, the inputs, and the assumptions behind each band.

Public + licensed sources only

Collection sources are explicitly enumerated. No invite-only forums, no authentication bypass, no interaction with sellers. The source whitelist is part of the contract.

Identifiers hashed at ingest

Emails, usernames, and any customer or employee identifier are SHA-256 hashed with a rotating salt before they hit our storage. Dereferencing requires verified domain ownership and a documented lawful basis.

Version-anchored, audit-replayable

Every index output carries the methodology version (currently v0.5). Auditors can independently recompute any historical value from preserved raw signals.

02·The composite, then the five indices

One headline number, then each index it rolls up.

CXI

Cyber Exposure Index

Headline composite

A single 0–10 board-level number: a client-weighted roll-up of the five indices below. The CXI introduces no new collection — it is strictly a function of CES, IMI, RPS, BLV, and BIR. Internal collection sub-signals are absorbed into their parent index before the roll-up; none are reported as standalone indices.

Formula

CXI = Σ_i  w_i · normalize(index_i)     i ∈ {CES, IMI, RPS, BLV, BIR}

Each index is normalized to a common 0–10 scale, weighted, and summed. Weights sum to 1.0.

Default weights

CES 0.30 · RPS 0.30 · IMI 0.19 · BIR 0.11 · BLV 0.10

Per-client weights are set at onboarding to match the buyer's risk priorities and audited quarterly. The applied vector ships with every report. Re-weighting is a client-level configuration, not a methodology version change.
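As a minimal sketch of the roll-up, assuming each index has already been normalized to the 0–10 scale (the normalization step itself is not specified here), the composite could be computed as below. The function name `cxi` and the example readings are hypothetical; the default weights are the ones published above.

```python
# Default weights as published on this page; they sum to 1.0.
DEFAULT_WEIGHTS = {"CES": 0.30, "RPS": 0.30, "IMI": 0.19, "BIR": 0.11, "BLV": 0.10}


def cxi(normalized: dict[str, float], weights: dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of indices that are already normalized to 0-10."""
    if set(normalized) != set(weights):
        raise ValueError("normalized indices and weights must cover the same keys")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    for name, value in normalized.items():
        if not 0.0 <= value <= 10.0:
            raise ValueError(f"{name} is outside the 0-10 scale: {value}")
    return sum(weights[name] * normalized[name] for name in weights)


# Hypothetical normalized readings, for illustration only.
example = {"CES": 6.0, "RPS": 7.2, "IMI": 4.0, "BIR": 3.0, "BLV": 5.0}
print(round(cxi(example), 2))
```

A per-client weight vector would be passed as the second argument; the sum-to-1.0 check keeps the result on the same 0–10 scale as the inputs.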

Caveats

·A high reading on one index is surfaced as a standalone alert, never averaged away by a calm composite.
·Peer percentile on the CXI requires a cohort of N ≥ 5; below that we report the absolute value and trend only.
·The composite carries the widest confidence band of any contributing index — it never reads tighter than its inputs.

CES

Credential Exposure Score

Volume and severity of credentials tied to the client's domains observed in licensed stealer-log feeds and public breach combolists during a rolling window.

Decision it enables

Force proactive password resets by cohort. Reduce account-takeover incidents 40–70% in the first quarter.

Reporting

Volume per 30-day window. Severity split: high (active session/cookies/saved-card), medium (recent email+password), low (older email-only). 30/60/90-day trends.

Formula

CES_window = N_new_creds + β · N_high_severity_creds

New credentials observed in window, plus a β-weighted bump for high-severity records (active sessions, captured cookies, saved-card flags).
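The formula can be sketched directly. This assumes high-severity records are a subset of the new credentials counted in the window, so they receive the extra β bump on top of being counted once; the function name and the example counts are hypothetical, and β = 2.4 is the default stated in the caveats.

```python
def ces_window(n_new_creds: int, n_high_severity_creds: int, beta: float = 2.4) -> float:
    """CES for one window: new credentials plus a beta-weighted high-severity bump."""
    if n_new_creds < 0 or n_high_severity_creds < 0:
        raise ValueError("counts must be non-negative")
    if n_high_severity_creds > n_new_creds:
        # Assumption: high-severity records are a subset of the new records.
        raise ValueError("high-severity count cannot exceed total new credentials")
    return n_new_creds + beta * n_high_severity_creds


# Hypothetical 30-day window: 120 new credentials, 15 of them high severity.
print(ces_window(120, 15))
```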

Inputs and source weights

Licensed stealer-log providers (SpyCloud, Constella, Flare) · high · RedLine, Lumma, Vidar, Stealc, Raccoon families
Public combolists (deduplicated, hashed) · medium · Lower confidence than stealer logs
Public breach dumps · low · Recency-weighted only

Aggregation

Reported as cohort-level (department, ASN, age band) for actionable reset decisions. Per-identifier dereferencing requires verified domain ownership.

Caveats

·β default = 2.4. Tunable per client based on ATO baseline and authentication posture.
·Stealer-log volume tends to over-represent newer breaches; we apply temporal smoothing.
·Cross-vendor deduplication is non-trivial — we report consolidated counts after a 3-vendor agreement check.

IMI

IAB Mention Index

Volume and severity of initial-access-broker listings referencing the client across enumerated public-source channels and licensed dark-web feeds.

Decision it enables

Notify and harden the perimeter before brokered access is sold to a ransomware affiliate or downstream operator.

Reporting

Listings count, severity distribution, seller-reputation banding, channel-level breakdown. P0 severity drives immediate alert.

Formula

IMI_window = Σ_listing severity_l · reputation_l · recency_l

Sum across listings of severity (access type, footprint described), seller reputation, and recency.
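A rough sketch of the sum follows. The scales are not specified on this page, so the 0–1 ranges for severity and reputation, the exponential recency decay, and the 14-day half-life are all assumptions made for illustration, as are the names `Listing`, `recency_factor`, and `imi_window`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    severity: float    # assumed 0-1 scale (access type, footprint described)
    reputation: float  # assumed 0-1 scale (seller track record, vouches)
    age_days: float    # days since the listing was observed


def recency_factor(age_days: float, half_life_days: float = 14.0) -> float:
    """Hypothetical decay: a listing loses half its weight every half-life."""
    return 0.5 ** (age_days / half_life_days)


def imi_window(listings: list[Listing], half_life_days: float = 14.0) -> float:
    """Sum of severity * reputation * recency across all listings in the window."""
    return sum(
        l.severity * l.reputation * recency_factor(l.age_days, half_life_days)
        for l in listings
    )


# One listing from a high-reputation seller versus three anonymous low-vouch posts.
trusted = [Listing(severity=0.9, reputation=0.9, age_days=0)]
anonymous = [Listing(severity=0.5, reputation=0.1, age_days=0)] * 3
print(imi_window(trusted) > imi_window(anonymous))
```

Because the terms multiply, a low reputation value suppresses a listing's contribution regardless of how severe its claim is, which is one way to get the behaviour described under Aggregation.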

Inputs and source weights

Licensed dark-web data partners · high · DarkOwl, Flare, Constella: IAB-tagged listings
Public stealer-log markets (archive mirrors) · high · Observed only, no purchases, no DMs
Public forum mirrors (archive-indexed) · medium · Vendor track record, vouches
Public Telegram channels (resale) · medium · Channel reach, message recency

Aggregation

A high-reputation seller's single listing outweighs many anonymous low-vouch posts. Visibility factor scales by seller reach.

Caveats

·We observe only. No interaction with sellers, no test purchases, no fake-buyer personas.
·Channel coverage is enumerated in the contract; we declare per-channel observability in the methodology change log.
·Cross-listing identity attribution is probabilistic — we report confidence bands on attribution.

RPS

Ransomware Proximity Score

Exposure to active ransomware crews — direct leak-site mentions, supplier-overlap with known victims, and cohort-relative incident clustering.

Decision it enables

Quantify ransomware exposure for cyber-insurance pricing, vendor reviews, and board-level risk reporting.

Reporting

0–10 score with 80% CI. Cohort percentile (P50/P90/P99) within sector. Trend over rolling 90 days. Top three contributing signals listed.

Formula

RPS = w₁ · direct_mention + w₂ · supplier_overlap + w₃ · cohort_clustering

Weighted combination of direct leak-site mention severity, supplier-graph overlap with known victims, and cohort-relative clustering of recent incidents.
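The weights w₁, w₂, w₃ are not published on this page, so the sketch below uses placeholder values. It also assumes that each component is already scored on 0–10 and that the weights sum to 1, which keeps the result on the reported 0–10 scale; none of this is confirmed by the text above.

```python
# Placeholder weights for illustration only; the real w1..w3 are not stated here.
HYPOTHETICAL_WEIGHTS = (0.5, 0.3, 0.2)


def rps(direct_mention: float, supplier_overlap: float, cohort_clustering: float,
        weights: tuple[float, float, float] = HYPOTHETICAL_WEIGHTS) -> float:
    """Weighted combination of three component scores, each assumed to be 0-10."""
    components = (direct_mention, supplier_overlap, cohort_clustering)
    if any(not 0.0 <= c <= 10.0 for c in components):
        raise ValueError("component scores are assumed to lie on a 0-10 scale")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1.0")
    return sum(w * c for w, c in zip(weights, components))


# Hypothetical component scores.
print(round(rps(8.0, 6.0, 7.0), 1))
```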

Inputs and source weights

Public ransomware leak sites (Tor mirrors via observation) · high · Mention severity, post recency, data-volume claimed
Licensed dark-web data partners · high · Crew attribution, supplier-graph context
Public incident disclosures (8-K, news, regulator filings) · medium · Cohort-clustering inputs
Public certstream / DNS history · low · Infrastructure-side proximity signals

Aggregation

Supplier-graph overlap is computed from public corporate registrations + sector mappings; client confirms the supplier list as part of onboarding.

Caveats

·Supplier-graph completeness depends on client-side confirmation. We publish coverage statistics per industry.
·Direct-mention severity is conservatively scored: 'data sample' is weighted lower than 'full exfil claimed.'
·We do not interact with ransomware operators. No negotiation, no validation purchases, no contact.

BLV

Brand Leak Velocity

Rate at which proprietary brand assets, internal documents, and identifiable data appear across public leak channels and paste sites.

Decision it enables

Prioritize takedown queues and legal escalation when leakage accelerates beyond the established baseline.

Reporting

Artifacts per day with 7/30/90-day trailing baselines. 3σ spikes trigger P1 alert. Severity split by artifact type.

Formula

BLV = (Δ_new_artifacts / Δt) · severity_weight · audience_weight

First derivative of new artifact appearances, scaled by severity (content type) and the audience reach of the channel.
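A minimal sketch of the velocity term and of the 3σ spike check mentioned under Reporting. The function names, the weight values, and the use of the sample standard deviation over a trailing list of daily counts are illustrative assumptions, not the production definition.

```python
from statistics import mean, stdev


def blv(new_artifacts: int, window_days: float,
        severity_weight: float, audience_weight: float) -> float:
    """Artifacts per day over the window, scaled by severity and audience reach."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    return (new_artifacts / window_days) * severity_weight * audience_weight


def is_spike(today: float, baseline: list[float], sigmas: float = 3.0) -> bool:
    """True when today's count exceeds the baseline mean by more than `sigmas` std devs."""
    if len(baseline) < 2:
        return False  # not enough history to estimate a spread
    return today > mean(baseline) + sigmas * stdev(baseline)


# Hypothetical 7-day trailing baseline of daily artifact counts.
baseline = [2, 3, 2, 4, 3, 2, 3]
print(blv(14, 7, severity_weight=1.5, audience_weight=2.0))
print(is_spike(9, baseline), is_spike(4, baseline))
```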

Inputs and source weights

Public paste sites (Pastebin-class, indexed mirrors) · high · Document fingerprints, brand strings
Public Telegram channels and forums (archive observation) · high · Re-post velocity, subscriber-weighted reach
Public ransomware leak-site posts · high · Triangulated against RPS direct-mention signal
Licensed dark-web data partners · medium · Document-mention attribution

Aggregation

Artifact attribution uses content fingerprinting plus brand-string detection. Cross-channel duplicates are deduplicated before velocity is computed.

Caveats

·Fingerprinting requires a sample set of internal documents at onboarding (or, alternatively, signed brand-asset strings).
·Velocity baseline stabilizes after 14 days of observation; first reported value carries an explicit shrinkage flag.
·We do not access leaked content beyond what is needed for attribution; full content remains on the publishing channel.

BIR

Brand Impersonation Reach

Audience reach of typosquat domains, mirror sites, and impostor social profiles imitating the client — weighted by estimated traffic.

Decision it enables

Direct defensive registrar spend, platform reporting, and takedown allocation; inform SEO/PPC spend against impersonators.

Reporting

Active domain count + social impostor count + estimated reach (monthly visitors / followers). Per-domain risk classification.

Formula

BIR_window = Σ_d log₁₀(estimated_traffic_d + 1)

Log-scaled sum of estimated traffic across detected impersonator surfaces.
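The sum can be sketched as follows. The `bir_range` helper is an assumption about how a relative traffic uncertainty (±35% is the typical figure given in the caveats) might be propagated into a reported range; the page does not say how the range is actually derived.

```python
import math


def bir_window(estimated_traffic: list[float]) -> float:
    """Sum of log10(traffic + 1) over detected impersonator surfaces."""
    if any(t < 0 for t in estimated_traffic):
        raise ValueError("traffic estimates must be non-negative")
    return sum(math.log10(t + 1) for t in estimated_traffic)


def bir_range(estimated_traffic: list[float],
              rel_uncertainty: float = 0.35) -> tuple[float, float]:
    """Hypothetical range: recompute the index with every estimate scaled down and up."""
    low = bir_window([t * (1 - rel_uncertainty) for t in estimated_traffic])
    high = bir_window([t * (1 + rel_uncertainty) for t in estimated_traffic])
    return low, high


# Hypothetical monthly visitor estimates for four impersonator domains.
traffic = [0, 9, 99, 999]
print(bir_window(traffic))
print(bir_range(traffic))
```

The `+ 1` inside the logarithm means a surface with zero estimated traffic contributes zero rather than an undefined value, and the log scale keeps one very large site from dominating the sum.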

Inputs and source weights

Certstream / WHOIS feeds (domain registration) · high · Live monitoring of typosquat candidates
Traffic estimation (DNS resolution + observed referrer signals) · high · Audience reach proxy
Social platform public APIs (impostor account detection) · medium · Follower count, recency, profile content
ASN clustering (mirror-operator correlation) · medium · Detects single-operator mirror networks

Aggregation

Operator-cluster detection: typosquats sharing a hosting ASN within a window are correlated and reported as a single operator surface.
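One way to sketch this correlation: group detections by hosting ASN, then chain together domains whose first-seen dates fall within a window of each other. The 30-day window, the tuple layout, and the function name are assumptions for illustration; the ASNs and domains below are reserved documentation values.

```python
from collections import defaultdict
from datetime import date, timedelta

Observation = tuple[str, int, date]  # (domain, hosting ASN, first-seen date)


def cluster_by_asn(observations: list[Observation],
                   window: timedelta = timedelta(days=30)) -> list[tuple[int, list[str]]]:
    """Group typosquats that share an ASN and appear within `window` of the previous one."""
    by_asn: dict[int, list[tuple[date, str]]] = defaultdict(list)
    for domain, asn, first_seen in observations:
        by_asn[asn].append((first_seen, domain))

    clusters: list[tuple[int, list[str]]] = []
    for asn, items in by_asn.items():
        items.sort()  # chronological order within each ASN
        current = [items[0][1]]
        last_seen = items[0][0]
        for seen, domain in items[1:]:
            if seen - last_seen <= window:
                current.append(domain)
            else:
                # Gap too large: close this operator surface and start a new one.
                clusters.append((asn, current))
                current = [domain]
            last_seen = seen
        clusters.append((asn, current))
    return clusters


obs = [
    ("a.test", 64500, date(2026, 1, 1)),
    ("b.test", 64500, date(2026, 1, 10)),
    ("c.test", 64500, date(2026, 4, 1)),
    ("d.test", 64501, date(2026, 1, 5)),
]
print(cluster_by_asn(obs))
```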

Caveats

·Traffic estimates carry meaningful uncertainty (±35% typical). We always report the range, not a point.
·Social platforms vary in API access; some have lower observability — we declare per-platform coverage in the methodology change log.
·Defensive domain registration recommendations are advisory; the legal go-decision rests with the client.

03·Source whitelist

Enumerated. Contractual.

The sources we collect from are part of the contract. Adding or removing a source requires a methodology version bump and a notice to existing clients. Source weights are reviewed quarterly.

Licensed dark-web data partners · CES+IMI+RPS · high · DarkOwl, SpyCloud, Constella, Flare under contract
Public stealer-log markets · CES+IMI · medium-high · Observed via archive mirrors; no purchases
Public ransomware leak sites · RPS+BLV · high · Tor mirror observation; victim postings, data-claim metadata
Public paste sites + forum mirrors · BLV+IMI · medium-high · Pastebin-class, archive.org-indexed boards
Public Telegram channels · IMI+BLV · medium · MTProto public-channel API; no group infiltration
Certstream + WHOIS · BIR · high · Domain-registration monitoring, typosquat detection
Public incident disclosures · RPS · medium · SEC 8-K, regulator filings, news; cohort-clustering input
DNS / passive-DNS / ASN feeds · BIR · medium · Infrastructure-side correlation, mirror-operator clustering

04·Identifier handling

Hashed at ingest. Dereferencing requires verified ownership.

The minimum-necessary principle is enforced at the collector boundary. Plaintext identifiers are normalized, hashed with SHA-256 + a salt rotated every 90 days, and discarded. Only the hash + metadata (source, observation time, severity flags) is persisted.
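The ingest step could look roughly like the sketch below. The normalization rule (trim and lowercase), the record field names, and the salt value are assumptions; only the SHA-256(identifier + current_salt) construction and the persisted metadata categories come from this page.

```python
import hashlib
from datetime import datetime, timezone


def normalize(identifier: str) -> str:
    """Assumed normalization: trim whitespace and lowercase."""
    return identifier.strip().lower()


def hash_identifier(identifier: str, current_salt: bytes) -> str:
    """SHA-256 over the normalized identifier followed by the current salt."""
    return hashlib.sha256(normalize(identifier).encode("utf-8") + current_salt).hexdigest()


def to_record(identifier: str, current_salt: bytes,
              source_family: str, severity_flags: set[str]) -> dict:
    """Build the persisted record; the plaintext identifier is not included."""
    return {
        "id_hash": hash_identifier(identifier, current_salt),
        "source_family": source_family,
        "observed_at": datetime.now(timezone.utc).isoformat(),
        "severity_flags": sorted(severity_flags),
    }


# Hypothetical salt and identifier, for illustration only.
record = to_record(" Alice@Example.com ", b"salt-2026Q2", "stealer_log", {"active_session"})
print(record["id_hash"])
```

Since the salt rotates, a hash computed under one salt will not match the same identifier hashed under the next; how matching across rotations is handled is not described here.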

What we store

+SHA-256(identifier + current_salt)
+Source family + observation timestamp
+Severity flags (active session, captured cookies, etc.)
+Hash-of-hash for cross-client peer benchmarks

What we never store

×Plaintext emails, usernames, or passwords
×Payment-card fragments (dropped at parser)
×Personally-identifying content from leaked documents
×Any signal touching CSAM (see red lines)

Verified-ownership dereferencing (for instance, a CISO needing to notify affected employees) is bound to a documented lawful basis (GDPR Art. 6(1)(f) and Art. 34) and is logged for audit. The dereferencing endpoint is rate-limited and tenant-scoped.

05·Peer-cohort anonymization

N ≥ 5. No exceptions.

Peer benchmarks (the P<value> percentile shown in client dashboards) are computed across a bucket of firms with comparable characteristics: sector, workforce-size band, and revenue band. A benchmark is suppressed if the bucket size falls below five firms.

·Bucket definitions are deterministic and published with the methodology version.
·Per-client inputs to peer aggregates are hash-blinded; we cannot retrieve which raw value came from which client when computing the bucket statistic.
·Outputs to client A are statistically constructed so client A's own contribution cannot be re-identified from the published percentile.
·Cross-client raw data is never accessible to any individual client, employee, or analyst.
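The suppression rule can be sketched as a guard in front of the percentile calculation. The percentile definition used here (share of bucket values at or below the client's value) and the function name are assumptions; the hash-blinding and re-identification protections described above are not modelled.

```python
from typing import Optional, Sequence

MIN_BUCKET_SIZE = 5  # the N >= 5 threshold stated on this page


def peer_percentile(client_value: float, bucket_values: Sequence[float]) -> Optional[float]:
    """Percentile rank within the bucket, or None when the bucket is too small."""
    if len(bucket_values) < MIN_BUCKET_SIZE:
        return None  # benchmark suppressed; report absolute value and trend only
    at_or_below = sum(1 for v in bucket_values if v <= client_value)
    return 100.0 * at_or_below / len(bucket_values)


# Hypothetical bucket values.
print(peer_percentile(5.0, [1.0, 2.0, 3.0, 4.0]))
print(peer_percentile(5.0, [1.0, 2.0, 5.0, 7.0, 9.0]))
```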

06·Confidence intervals

80% bands. Bayesian shrinkage on small-N.

Every index output ships with an 80% confidence interval. We chose 80% (not the more familiar 95%) deliberately: at 95%, intervals on small observation windows become uninformatively wide bands that disguise real signal. 80% balances calibration with usability.

For clients with thin observation history (< 14 days) or narrow attack surfaces, we apply Bayesian shrinkage toward the peer-cohort prior. The prior is the cohort median with conservative weighting; the shrinkage weight is published alongside each affected output. As your observed data accumulates, the shrinkage decays automatically.
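A simple way to picture the shrinkage is as a weighted blend of the observed value and the cohort median, where the weight on the prior falls as observation days accumulate. The weight formula k / (k + n) and the prior strength of 14 days are assumptions chosen for illustration; the page does not state the actual weighting.

```python
def shrinkage_weight(days_observed: float, prior_strength_days: float = 14.0) -> float:
    """Hypothetical weight on the cohort prior: 1.0 with no data, decaying toward 0."""
    if days_observed < 0 or prior_strength_days <= 0:
        raise ValueError("days_observed must be >= 0 and prior_strength_days > 0")
    return prior_strength_days / (prior_strength_days + days_observed)


def shrink(observed: float, cohort_median: float, days_observed: float,
           prior_strength_days: float = 14.0) -> tuple[float, float]:
    """Return (shrunk value, shrinkage weight) so the weight can be reported with the output."""
    w = shrinkage_weight(days_observed, prior_strength_days)
    return w * cohort_median + (1.0 - w) * observed, w


# Hypothetical client: observed 8.0 against a cohort median of 4.0.
for days in (0, 14, 90):
    print(days, shrink(8.0, 4.0, days))
```

With this form the reported value starts at the cohort median and moves toward the client's own observation as history builds up, which matches the decay described above.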

Reported as

RPS = 7.2 [6.7 – 7.6]   ← 80% CI
       ↑    ↑     ↑
   central lower upper
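To make the 80% versus 95% trade-off concrete, the sketch below builds a symmetric interval under a normal approximation and formats it in the style shown above. The normal model and the standard-error input are assumptions for illustration only; the model behind each published band is named per index, and real bands need not be symmetric.

```python
from statistics import NormalDist


def z_for(level: float) -> float:
    """Two-sided critical value for a central interval under a standard normal."""
    return NormalDist().inv_cdf(0.5 + level / 2.0)


def interval(central: float, std_err: float, level: float = 0.80) -> tuple[float, float]:
    """Symmetric interval around the central estimate (normal approximation)."""
    half_width = z_for(level) * std_err
    return central - half_width, central + half_width


def format_reading(name: str, central: float, std_err: float, level: float = 0.80) -> str:
    lower, upper = interval(central, std_err, level)
    return f"{name} = {central:.1f} [{lower:.1f} – {upper:.1f}]"


# Hypothetical reading with a standard error of 0.4.
print(format_reading("RPS", 7.2, 0.4))
print(round(z_for(0.95) / z_for(0.80), 2))  # how much wider a 95% band would be
```

Under this approximation a 95% band is roughly one and a half times as wide as the 80% band for the same data.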

07·Versioning & change log

Every output carries the version it was computed with.

Methodology versions are immutable once shipped. Historical index values are recomputed only on explicit request, and the recomputed values carry both the original version and the recompute version. Clients are notified at least 14 days before any minor version bump and 30 days before any major bump.

v0.6 · 2026-05

Shipped the free CES teaser: a verified-owner, single-index snapshot of one's own domain. Benchmarks against a documented modeled cohort prior (sector × workforce-size band) until a bucket reaches the N ≥ 5 threshold, at which point an observed peer percentile supersedes the prior. Modeled vs. observed basis is labeled on every output — a prior is never presented as an observed percentile.

v0.5 · 2026-05

Published the CXI headline composite — a client-weighted roll-up of the five indices into a single 0–10 board number. Default weights documented; per-client weights are set at onboarding and audited quarterly. Internal collection sub-signals are absorbed into their parent index upstream and are never reported as standalone indices.

v0.4 · 2026-05

Added supplier-graph overlap signal to RPS. Refined β weighting for CES high-severity records. Cross-listing identity attribution now reported with confidence bands.

v0.3 · 2026-03

Introduced peer-cohort benchmarks (N ≥ 5). Added 80% CI reporting standard across all five indices. Cross-vendor deduplication agreement check formalized.

v0.2 · 2026-01

Brand-leak fingerprinting added (document hashing + brand-string detection). Hash-rotation salt schedule formalized (90-day rotation).

v0.1 · 2025-11

Initial release with five indices: CES, IMI, RPS, BLV, BIR. Methodology baseline.

08·Red lines

What this methodology will never compute.

The collection boundary is part of the math. Outputs are only as trustworthy as the inputs that produced them — these rules bound the inputs.

×No authentication bypass. No invite-only forums. No vouched-access communities.
×No buying stolen accounts or leaked data 'to validate.' We observe; we do not interact with sellers.
×No plaintext PII to clients. Hashes + metadata only.
×No cross-client data leakage. Peer benchmarks require N ≥ 5 in a bucket.
×No doxxing. No offensive OSINT. No HUMINT. No undercover personas.
×Zero tolerance for CSAM. Automated detection → immediate NCMEC report → zero retention.

Questions on the math?

We respond to methodology questions in writing.

CISO, head of risk, or anyone on the legal side: send the specific question and we'll respond within 48 hours. First-cohort tier conversations get a longer-form methodology brief on request.

Request a methodology session

Independent audit

Built to be re-computed by a third party.

Every index output is anchored to (a) the methodology version, (b) the inputs at the time of computation, and (c) a SHA-256 hash of the evidence bundle. An auditor with read access can independently replay any historical value. Methodology audits are part of the first-cohort onboarding.
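An evidence-bundle hash of this kind is usually computed over a canonical serialization, so the same inputs always yield the same digest. The sketch below assumes a JSON bundle with sorted keys; the bundle layout and function names are hypothetical, not the actual evidence format.

```python
import hashlib
import json


def evidence_hash(methodology_version: str, inputs: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the version and inputs."""
    bundle = {"methodology_version": methodology_version, "inputs": inputs}
    # sort_keys and fixed separators make the encoding independent of dict ordering.
    canonical = json.dumps(bundle, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def replay_matches(recorded_hash: str, methodology_version: str, inputs: dict) -> bool:
    """An auditor recomputes the hash from preserved inputs and compares."""
    return evidence_hash(methodology_version, inputs) == recorded_hash


# Hypothetical preserved inputs for one CES output.
recorded = evidence_hash("v0.5", {"n_new_creds": 120, "n_high_severity_creds": 15})
print(replay_matches(recorded, "v0.5", {"n_high_severity_creds": 15, "n_new_creds": 120}))
```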