Methodology · v0.4 · May 2026 · CISM-led

How each exposure index is actually computed.

Five focused indices — CES, IMI, RPS, BLV, BIR — each with a published formula, an enumerated source whitelist, weighted inputs, confidence intervals, and a change log. This page is the contractual definition. The internal collector implementation lives behind it; the math does not.

01·Principles

What the methodology commits to before any individual index.

Quantification before collection

We instrument fewer sources than generalist threat-intel vendors. Every collected signal must feed a published index. Volume is not the product.

Comparable across clients

Indices are normalized so peer benchmarks are meaningful at N ≥ 5. Two firms with the same CES represent comparable credential exposure, controlled for workforce size and sector.

Confidence bands, always

Every reported value ships with an 80% confidence interval. We name the model, the inputs, and the assumptions behind each band.

Public + licensed sources only

Collection sources are explicitly enumerated. No invite-only forums, no authentication bypass, no interaction with sellers. The source whitelist is part of the contract.

Identifiers hashed at ingest

Emails, usernames, and any customer or employee identifier are SHA-256 hashed with a rotating salt before they hit our storage. Dereferencing requires verified domain ownership and a documented lawful basis.

Version-anchored, audit-replayable

Every index output carries the methodology version (currently v0.4). Auditors can independently recompute any historical value from preserved raw signals.

02·The five indices

Each index: formula, inputs, reporting, caveats.

01
CES

Credential Exposure Score

Volume and severity of credentials tied to the client's domains observed in licensed stealer-log feeds and public breach combolists during a rolling window.

Decision it enables

Force proactive password resets by cohort. Reduce account-takeover incidents 40–70% in the first quarter.

Reporting

Volume per 30-day window. Severity split: high (active session/cookies/saved-card), medium (recent email+password), low (older email-only). 30/60/90-day trends.

Formula
CES_window = N_new_creds + β · N_high_severity_creds

New credentials observed in window, plus a β-weighted bump for high-severity records (active sessions, captured cookies, saved-card flags).
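The formula above can be sketched in a few lines. This is a minimal illustration, not the production collector: the `CredentialRecord` shape and the sample records are hypothetical, and the β default of 2.4 is the one stated in the caveats for this index.

```python
from dataclasses import dataclass

BETA_DEFAULT = 2.4  # default beta stated in the CES caveats; tunable per client


@dataclass(frozen=True)
class CredentialRecord:
    """One deduplicated credential observation (hypothetical record shape)."""
    identifier_hash: str  # salted hash of the identifier, never plaintext
    severity: str         # "high", "medium", or "low"


def ces_window(records: list[CredentialRecord], beta: float = BETA_DEFAULT) -> float:
    """CES_window = N_new_creds + beta * N_high_severity_creds."""
    n_new = len(records)
    n_high = sum(1 for r in records if r.severity == "high")
    return n_new + beta * n_high


sample = [
    CredentialRecord("a1", "high"),
    CredentialRecord("b2", "medium"),
    CredentialRecord("c3", "low"),
    CredentialRecord("d4", "high"),
]
ces_value = ces_window(sample)  # 4 new records plus 2.4 * 2 high-severity records
```

With four new records, two of them high severity, the score is 4 + 2.4 · 2, or about 8.8.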

Inputs and source weights
Source · Weight · Notes
Licensed stealer-log providers (SpyCloud, Constella, Flare) · high · RedLine, Lumma, Vidar, Stealc, Raccoon families
Public combolists (deduplicated, hashed) · medium · Lower confidence than stealer logs
Public breach dumps · low · Recency-weighted only
Aggregation

Reported as cohort-level (department, ASN, age band) for actionable reset decisions. Per-identifier dereferencing requires verified domain ownership.

Caveats
  • β default = 2.4. Tunable per client based on ATO baseline and authentication posture.
  • Stealer-log volume tends to over-represent newer breaches; we apply temporal smoothing.
  • Cross-vendor deduplication is non-trivial; we report consolidated counts after a 3-vendor agreement check.
02
IMI

IAB Mention Index

Volume and severity of initial-access-broker listings referencing the client across enumerated public-source channels and licensed dark-web feeds.

Decision it enables

Notify and harden the perimeter before brokered access is sold to a ransomware affiliate or downstream operator.

Reporting

Listings count, severity distribution, seller-reputation banding, channel-level breakdown. P0 severity drives immediate alert.

Formula
IMI_window = Σ_l severity_l · reputation_l · recency_l

Sum across listings of severity (access type, footprint described), seller reputation, and recency.
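A minimal sketch of this sum follows. The page does not publish the scales or the recency function, so everything numeric here is an assumption made for illustration: severity and reputation on a 0 to 1 scale, and recency as an exponential decay with a hypothetical 14-day half-life.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    """One observed broker listing (hypothetical record shape)."""
    severity: float    # assumed 0..1: access type and footprint described
    reputation: float  # assumed 0..1: seller track record and vouches
    age_days: float    # days since the listing was first observed


def recency_weight(age_days: float, half_life_days: float = 14.0) -> float:
    """Assumed decay: the weight halves every half_life_days."""
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


def imi_window(listings: list[Listing], half_life_days: float = 14.0) -> float:
    """IMI_window = sum over listings of severity * reputation * recency."""
    return sum(
        item.severity * item.reputation * recency_weight(item.age_days, half_life_days)
        for item in listings
    )


vouched = [Listing(severity=0.9, reputation=0.8, age_days=0.0)]
anonymous = [Listing(severity=0.4, reputation=0.2, age_days=14.0)] * 5
imi_vouched = imi_window(vouched)      # about 0.72
imi_anonymous = imi_window(anonymous)  # about 5 * 0.04 = 0.20
```

Under these assumed scales, one fresh listing from a well-vouched seller scores higher than five two-week-old anonymous posts, which is the behavior the aggregation note for this index describes.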

Inputs and source weights
Source · Weight · Notes
Licensed dark-web data partners · high · DarkOwl, Flare, Constella: IAB-tagged listings
Public stealer-log markets (archive mirrors) · high · Observed only; no purchases, no DMs
Public forum mirrors (archive-indexed) · medium · Vendor track record, vouches
Public Telegram channels (resale) · medium · Channel reach, message recency
Aggregation

A high-reputation seller's single listing outweighs many anonymous low-vouch posts. Visibility factor scales by seller reach.

Caveats
  • We observe only. No interaction with sellers, no test purchases, no fake-buyer personas.
  • Channel coverage is enumerated in the contract; we declare per-channel observability in the methodology change log.
  • Cross-listing identity attribution is probabilistic; we report confidence bands on attribution.
03
RPS

Ransomware Proximity Score

Exposure to active ransomware crews — direct leak-site mentions, supplier-overlap with known victims, and cohort-relative incident clustering.

Decision it enables

Quantify ransomware exposure for cyber-insurance pricing, vendor reviews, and board-level risk reporting.

Reporting

0–10 score with 80% CI. Cohort percentile (P50/P90/P99) within sector. Trend over rolling 90 days. Top three contributing signals listed.

Formula
RPS = w₁ · direct_mention + w₂ · supplier_overlap + w₃ · cohort_clustering

Weighted combination of direct leak-site mention severity, supplier-graph overlap with known victims, and cohort-relative clustering of recent incidents.
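As a sketch of the weighted combination, assume each component is already scored on a 0 to 10 scale and the weights sum to 1, so the result lands on the reported 0 to 10 range. The weights below (0.5, 0.3, 0.2) are placeholders chosen for the example, not published values.

```python
from typing import Optional

# Hypothetical weights for illustration only; w1 + w2 + w3 = 1.
EXAMPLE_WEIGHTS = {
    "direct_mention": 0.5,
    "supplier_overlap": 0.3,
    "cohort_clustering": 0.2,
}


def rps(signals: dict[str, float],
        weights: Optional[dict[str, float]] = None) -> tuple[float, list[str]]:
    """Return (score, signal names ranked by contribution).

    RPS = w1 * direct_mention + w2 * supplier_overlap + w3 * cohort_clustering,
    with each signal assumed to be pre-scaled to 0..10.
    """
    w = EXAMPLE_WEIGHTS if weights is None else weights
    contributions = {name: w[name] * signals[name] for name in w}
    score = min(10.0, max(0.0, sum(contributions.values())))  # clamp to 0..10
    ranked = sorted(contributions, key=lambda name: contributions[name], reverse=True)
    return score, ranked


score, top_signals = rps(
    {"direct_mention": 8.0, "supplier_overlap": 6.0, "cohort_clustering": 5.0}
)
```

Here the score is 0.5 · 8 + 0.3 · 6 + 0.2 · 5, about 6.8, and the ranked list gives the contributing signals in the order they would appear in the report.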

Inputs and source weights
Source · Weight · Notes
Public ransomware leak sites (Tor mirrors via observation) · high · Mention severity, post recency, data-volume claimed
Licensed dark-web data partners · high · Crew attribution, supplier-graph context
Public incident disclosures (8-K, news, regulator filings) · medium · Cohort-clustering inputs
Public certstream / DNS history · low · Infrastructure-side proximity signals
Aggregation

Supplier-graph overlap is computed from public corporate registrations + sector mappings; client confirms the supplier list as part of onboarding.

Caveats
  • Supplier-graph completeness depends on client-side confirmation. We publish coverage statistics per industry.
  • Direct-mention severity is conservatively scored: 'data sample' is weighted lower than 'full exfil claimed.'
  • We do not interact with ransomware operators. No negotiation, no validation purchases, no contact.
04
BLV

Brand Leak Velocity

Rate at which proprietary brand assets, internal documents, and identifiable data appear across public leak channels and paste sites.

Decision it enables

Prioritize takedown queues and legal escalation when leakage accelerates beyond the established baseline.

Reporting

Artifacts per day with 7/30/90-day trailing baselines. 3σ spikes trigger P1 alert. Severity split by artifact type.

Formula
BLV = (Δ_new_artifacts / Δt) · severity_weight · audience_weight

First derivative of new artifact appearances, scaled by severity (content type) and the audience reach of the channel.
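The velocity formula and the 3σ spike rule from the reporting line can be sketched together. The weight values and the baseline series are invented for the example, and using the sample standard deviation of the trailing baseline is an assumption about how σ is estimated.

```python
from statistics import mean, stdev


def blv(new_artifacts: int, window_days: float,
        severity_weight: float, audience_weight: float) -> float:
    """BLV = (new artifacts / elapsed days) * severity_weight * audience_weight."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    return (new_artifacts / window_days) * severity_weight * audience_weight


def is_spike(current: float, baseline: list[float], sigmas: float = 3.0) -> bool:
    """True when current exceeds the baseline mean by more than sigmas deviations."""
    if len(baseline) < 2:
        return False  # not enough history to estimate a spread
    threshold = mean(baseline) + sigmas * stdev(baseline)
    return current > threshold


# Hypothetical trailing 7-day baseline of daily BLV values.
baseline = [2.0, 3.0, 2.0, 3.0, 2.0, 3.0, 2.0]
velocity = blv(new_artifacts=12, window_days=3, severity_weight=1.5, audience_weight=0.5)
alert = is_spike(velocity, baseline)
```

Twelve artifacts over three days with these weights give a velocity of 3.0, which stays below the 3σ threshold of this baseline (roughly 4.03), so no alert fires.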

Inputs and source weights
Source · Weight · Notes
Public paste sites (Pastebin-class, indexed mirrors) · high · Document fingerprints, brand strings
Public Telegram channels and forums (archive observation) · high · Re-post velocity, subscriber-weighted reach
Public ransomware leak-site posts · high · Triangulated against RPS direct-mention signal
Licensed dark-web data partners · medium · Document-mention attribution
Aggregation

Artifact attribution uses content fingerprinting plus brand-string detection. Cross-channel duplicates are deduplicated before velocity is computed.

Caveats
  • Fingerprinting requires a sample set of internal documents at onboarding (or, alternatively, signed brand-asset strings).
  • Velocity baseline stabilizes after 14 days of observation; first reported value carries an explicit shrinkage flag.
  • We do not access leaked content beyond what is needed for attribution; full content remains on the publishing channel.
05
BIR

Brand Impersonation Reach

Audience reach of typosquat domains, mirror sites, and impostor social profiles imitating the client — weighted by estimated traffic.

Decision it enables

Direct defensive registrar spend, platform reporting, and takedown allocation; inform SEO/PPC spend against impersonators.

Reporting

Active domain count + social impostor count + estimated reach (monthly visitors / followers). Per-domain risk classification.

Formula
BIR_window = Σ_d log₁₀(estimated_traffic_d + 1)

Log-scaled sum of estimated traffic across detected impersonator surfaces.
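A short sketch of the log-scaled sum, with a range derived from the typical ±35% traffic uncertainty quoted in the caveats for this index. Shifting every estimate by the same factor is a crude bound assumed here for illustration; the traffic figures are invented.

```python
import math

TRAFFIC_UNCERTAINTY = 0.35  # typical relative uncertainty quoted in the BIR caveats


def bir_window(estimated_traffic: list[float]) -> float:
    """BIR_window = sum over surfaces d of log10(estimated_traffic_d + 1)."""
    return sum(math.log10(t + 1) for t in estimated_traffic)


def bir_range(estimated_traffic: list[float],
              uncertainty: float = TRAFFIC_UNCERTAINTY) -> tuple[float, float]:
    """Crude bounds: recompute BIR with every estimate shifted down, then up."""
    low = bir_window([t * (1 - uncertainty) for t in estimated_traffic])
    high = bir_window([t * (1 + uncertainty) for t in estimated_traffic])
    return low, high


traffic = [0, 9, 99, 999]          # hypothetical monthly visitors per surface
bir_point = bir_window(traffic)    # log10(1) + log10(10) + log10(100) + log10(1000)
bir_low, bir_high = bir_range(traffic)
```

The log scale keeps a single high-traffic impersonator from drowning out the count of smaller ones: the four surfaces above contribute 0, 1, 2, and 3 respectively, for a total of about 6.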

Inputs and source weights
Source · Weight · Notes
Certstream / WHOIS feeds (domain registration) · high · Live monitoring of typosquat candidates
Traffic estimation (DNS resolution + observed referrer signals) · high · Audience reach proxy
Social platform public APIs (impostor account detection) · medium · Follower count, recency, profile content
ASN clustering (mirror-operator correlation) · medium · Detects single-operator mirror networks
Aggregation

Operator-cluster detection: typosquats sharing a hosting ASN within a window are correlated and reported as a single operator surface.
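One way to read this rule is as a grouping by hosting ASN, split wherever the gap between consecutive sightings exceeds the window. The sketch below follows that reading; the 7-day window, the tuple layout, and the domain names are assumptions made for the example (the ASNs come from the range reserved for documentation).

```python
from collections import defaultdict
from datetime import datetime, timedelta

# (domain, hosting ASN, first-seen time): hypothetical observation shape.
Observation = tuple[str, int, datetime]


def cluster_by_operator(observations: list[Observation],
                        window: timedelta = timedelta(days=7)) -> list[list[str]]:
    """Group domains that share an ASN and appear within `window` of each other."""
    by_asn: dict[int, list[tuple[datetime, str]]] = defaultdict(list)
    for domain, asn, seen in observations:
        by_asn[asn].append((seen, domain))

    clusters: list[list[str]] = []
    for asn in sorted(by_asn):
        current: list[str] = []
        last_seen = None
        for seen, domain in sorted(by_asn[asn]):
            # A gap longer than the window starts a new operator surface.
            if last_seen is not None and seen - last_seen > window:
                clusters.append(current)
                current = []
            current.append(domain)
            last_seen = seen
        clusters.append(current)
    return clusters


observations = [
    ("acme-login.example", 64500, datetime(2026, 1, 1)),
    ("acrne.example", 64500, datetime(2026, 1, 3)),
    ("acme-support.example", 64500, datetime(2026, 2, 20)),
    ("acmee.example", 64501, datetime(2026, 1, 2)),
]
surfaces = cluster_by_operator(observations)
```

The first two domains share an ASN and appear two days apart, so they count as one operator surface; the February sighting on the same ASN falls outside the window and is reported separately.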

Caveats
  • Traffic estimates carry meaningful uncertainty (±35% typical). We always report the range, not a point.
  • Social platforms vary in API access; some have lower observability, and we declare per-platform coverage in the methodology change log.
  • Defensive domain registration recommendations are advisory; the legal go-decision rests with the client.
03·Source whitelist

Enumerated. Contractual.

The sources we collect from are part of the contract. Adding a source requires a methodology version bump and a notice to existing clients. Removing a source likewise. Source weights are reviewed quarterly.

Source · Indices · Weight · Notes
Licensed dark-web data partners · CES+IMI+RPS · high · DarkOwl, SpyCloud, Constella, Flare under contract
Public stealer-log markets · CES+IMI · medium-high · Observed via archive mirrors; no purchases
Public ransomware leak sites · RPS+BLV · high · Tor mirror observation; victim postings, data-claim metadata
Public paste sites + forum mirrors · BLV+IMI · medium-high · Pastebin-class, archive.org-indexed boards
Public Telegram channels · IMI+BLV · medium · MTProto public-channel API; no group infiltration
Certstream + WHOIS · BIR · high · Domain-registration monitoring, typosquat detection
Public incident disclosures · RPS · medium · SEC 8-K, regulator filings, news; cohort-clustering input
DNS / passive-DNS / ASN feeds · BIR · medium · Infrastructure-side correlation, mirror-operator clustering
04·Identifier handling

Hashed at ingest. Dereferencing requires verified ownership.

The minimum-necessary principle is enforced at the collector boundary. Plaintext identifiers are normalized and hashed with SHA-256 using a salt rotated every 90 days; the plaintext is then discarded. Only the hash and its metadata (source, observation time, severity flags) are persisted.
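The hash-at-ingest step can be sketched with Python's standard `hashlib`. The normalization rule (trim and lowercase), the salt value, and the function names are assumptions for illustration; the concatenation order follows the SHA-256(identifier + current_salt) form this page states. A keyed construction such as HMAC would be a common alternative, but it is not what the page describes.

```python
import hashlib


def normalize(identifier: str) -> str:
    """Assumed normalization: trim whitespace and lowercase."""
    return identifier.strip().lower()


def hash_identifier(identifier: str, current_salt: bytes) -> str:
    """SHA-256(identifier + current_salt), returned as a hex digest."""
    data = normalize(identifier).encode("utf-8") + current_salt
    return hashlib.sha256(data).hexdigest()


def peer_benchmark_key(identifier_hash: str) -> str:
    """Hash-of-hash for cross-client benchmarks (hypothetical helper)."""
    return hashlib.sha256(bytes.fromhex(identifier_hash)).hexdigest()


salt = b"example-salt-2026-q2"  # placeholder; real salts rotate every 90 days
h1 = hash_identifier("  Alice@Example.com ", salt)
h2 = hash_identifier("alice@example.com", salt)
benchmark_key = peer_benchmark_key(h1)
```

Because normalization runs before hashing, the two spellings of the same address produce the same digest, while a different salt produces an unrelated one.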

What we store

  • SHA-256(identifier + current_salt)
  • Source family + observation timestamp
  • Severity flags (active session, captured cookies, etc.)
  • Hash-of-hash for cross-client peer benchmarks

What we never store

  • Plaintext emails, usernames, or passwords
  • Payment-card fragments (dropped at parser)
  • Personally-identifying content from leaked documents
  • Any signal touching CSAM (see red lines)

Verified-ownership dereferencing (for instance, a CISO needing to notify affected employees) is bound to a documented lawful basis (GDPR Art. 6(1)(f) and Art. 34) and is logged for audit. The dereferencing endpoint is rate-limited and tenant-scoped.

05·Peer-cohort anonymization

N ≥ 5. No exceptions.

Peer benchmarks (the percentile, such as P50, P90, or P99, shown in client dashboards) are computed across a bucket of firms with comparable characteristics: sector, workforce-size band, and revenue band. A benchmark is suppressed if the bucket size falls below five firms.
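The suppression rule is simple enough to show directly. In this sketch the bucket is assumed to include the client's own value, and the percentile is computed as the share of bucket values at or below the client; both choices are assumptions for the example.

```python
from typing import Optional

MIN_BUCKET_SIZE = 5  # the N >= 5 rule stated in this section


def peer_percentile(client_value: float,
                    bucket_values: list[float]) -> Optional[float]:
    """Percentile of the client within its bucket, or None if suppressed."""
    if len(bucket_values) < MIN_BUCKET_SIZE:
        return None  # suppressed: too few firms to anonymize
    at_or_below = sum(1 for v in bucket_values if v <= client_value)
    return 100.0 * at_or_below / len(bucket_values)


bucket = [3.1, 4.0, 5.5, 6.2, 7.8]              # hypothetical index values
shown = peer_percentile(6.2, bucket)            # 4 of 5 values are <= 6.2
suppressed = peer_percentile(6.2, bucket[:4])   # only four firms in the bucket
```

With five firms the dashboard would show P80 for this client; with four firms the benchmark is withheld entirely rather than shown with a warning.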

06·Confidence intervals

80% bands. Bayesian shrinkage on small-N.

Every index output ships with an 80% confidence interval. We chose 80% (not the more familiar 95%) deliberately: at 95%, intervals on small observation windows become uninformatively wide and disguise real signal. 80% balances calibration with usability.

For clients with thin observation history (< 14 days) or narrow attack surfaces, we apply Bayesian shrinkage toward the peer-cohort prior. The prior is the cohort median with conservative weighting; the shrinkage weight is published alongside each affected output. As your observed data accumulates, the shrinkage decays automatically.
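One common form of shrinkage that matches this description is a weighted average of the observed value and the cohort prior, with the weight on the prior decaying as observation days accumulate. The sketch below uses that form; the functional shape n / (n + k) and the constant k = 14 are assumptions for illustration, not the published model.

```python
def shrink(observed: float, prior: float, n_days: float,
           k: float = 14.0) -> tuple[float, float]:
    """Return (shrunk estimate, shrinkage weight on the prior).

    data_weight = n_days / (n_days + k), so the prior dominates when history
    is thin and fades as n_days grows. k is a hypothetical constant.
    """
    if n_days < 0:
        raise ValueError("n_days must be non-negative")
    data_weight = n_days / (n_days + k)
    shrinkage_weight = 1.0 - data_weight
    estimate = data_weight * observed + shrinkage_weight * prior
    return estimate, shrinkage_weight


# Seven days of history, observed 8.0, cohort median 5.0.
estimate, weight = shrink(observed=8.0, prior=5.0, n_days=7)
_, weight_after_year = shrink(observed=8.0, prior=5.0, n_days=365)
```

With seven days of data the estimate sits at about 6.0, two thirds of the way toward the cohort median; after a year the weight on the prior has dropped below 4%.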

Reported as
RPS = 7.2 [6.7 – 7.6]   ← 80% CI
       ↑    ↑     ↑
   central lower upper
07·Versioning & change log

Every output carries the version it was computed with.

Methodology versions are immutable once shipped. Historical index values are recomputed only on explicit request, and the recomputed values carry both the original version and the recompute version. Clients are notified at least 14 days before any minor version bump and 30 days before any major bump.

v0.4 · 2026-05 · Added supplier-graph overlap signal to RPS. Refined β weighting for CES high-severity records. Cross-listing identity attribution now reported with confidence bands.
v0.3 · 2026-03 · Introduced peer-cohort benchmarks (N ≥ 5). Added 80% CI reporting standard across all five indices. Cross-vendor deduplication agreement check formalized.
v0.2 · 2026-01 · Brand-leak fingerprinting added (document hashing + brand-string detection). Hash-rotation salt schedule formalized (90-day rotation).
v0.1 · 2025-11 · Initial release with five indices: CES, IMI, RPS, BLV, BIR. Methodology baseline.
08·Red lines

What this methodology will never compute.

The collection boundary is part of the math. Outputs are only as trustworthy as the inputs that produced them — these rules bound the inputs.

Questions on the math?

We respond to methodology questions in writing.

CISO, head of risk, or anyone on the legal side: send the specific question and we'll respond within 48 hours. First-cohort tier conversations get a longer-form methodology brief on request.

Request a methodology session
Independent audit

Built to be re-computed by a third party.

Every index output is anchored to (a) the methodology version, (b) the inputs at the time of computation, and (c) a SHA-256 hash of the evidence bundle. An auditor with read access can independently replay any historical value. Methodology audits are part of the first-cohort onboarding.
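A replay check along these lines can be sketched with the standard library. The record layout, the canonical-JSON serialization, and the function names are assumptions made for the example; the CES recompute function mirrors the published CES formula.

```python
import hashlib
import json
from typing import Callable


def evidence_hash(inputs: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the inputs (assumed format)."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def replay(record: dict, recompute: Callable[[dict], float],
           tol: float = 1e-9) -> bool:
    """True when the evidence hash matches and the value recomputes identically."""
    if evidence_hash(record["inputs"]) != record["evidence_sha256"]:
        return False  # inputs were altered after the value was published
    return abs(recompute(record["inputs"]) - record["value"]) <= tol


def ces_from_inputs(inputs: dict) -> float:
    """CES_window = N_new_creds + beta * N_high_severity_creds."""
    return inputs["n_new_creds"] + inputs["beta"] * inputs["n_high_severity_creds"]


inputs = {"n_new_creds": 4, "n_high_severity_creds": 2, "beta": 2.4}
record = {
    "methodology_version": "v0.4",
    "inputs": inputs,
    "value": ces_from_inputs(inputs),
    "evidence_sha256": evidence_hash(inputs),
}
verified = replay(record, ces_from_inputs)
tampered = dict(record, inputs={**inputs, "n_new_creds": 5})
```

Sorting keys before hashing keeps the digest stable regardless of the order in which the inputs were stored, so an auditor who reserializes the bundle still arrives at the same hash.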