Marks must be boring, auditable, and reproducible
The full recipe (scope, inputs, calculation, intervals, governance, change log, and limitations) is versioned and replayable. Current version: v0.2.0.
Scope
Spearhead publishes reference marks and conformal confidence intervals for late-stage private companies. A mark is an indicative valuation estimate, not a transactable price, an offer, or advice. The current surface is an illustrative backtest derived exclusively from publicly reported valuation events; no live marks are published.
Spearhead is intentionally infrastructure-agnostic: it is a valuation evidence product that does not provide execution, custody services, or recommendation-led outputs.
Eligibility & coverage
A company enters the universe when it has at least one publicly reported, source-attributable valuation event. Companies leave the marked universe at their IPO: the mark series freezes the day before the debut and the name moves to the validation set.
Absence of events never removes a company; it freezes the mark and widens the interval through the staleness rule.
Input hierarchy
Inputs are publicly reported valuation events, each typed and quality-weighted. IPO prints are held out as validation outcomes and are never mark inputs. Terminated acquisition offers appear as chart markers only.
| Rank | Source class | Base quality weight |
|---|---|---|
| 1 | funding_round | 0.90 |
| 2 | tender_offer | 0.80 |
| 3 | secondary_market | 0.65 |
| 4 | venue_mark | 0.55 |
Calculation recipe
The point mark at any date is a quality- and recency-weighted blend of all mark-input events on or before that date, with weight = quality x 0.5^(age / half-life), where age is the number of days since the event. Because every weight decays at the same half-life, the passage of time multiplies all weights by the same factor, which cancels when the blend is normalized. The mark is therefore an exact step function: it moves only when a new public event lands. No drift is invented between observations and no future information is used.
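The weighting rule above can be sketched in a few lines. This is a minimal illustration, not the production pipeline: the names `Event`, `QUALITY`, and `point_mark` are hypothetical, the blend is assumed to be an arithmetic weighted mean, and the 90-day default is the half-life quoted on this page.

```python
from dataclasses import dataclass
from datetime import date

# Base quality weights copied from the input hierarchy table.
QUALITY = {
    "funding_round": 0.90,
    "tender_offer": 0.80,
    "secondary_market": 0.65,
    "venue_mark": 0.55,
}


@dataclass(frozen=True)
class Event:
    day: date            # date the valuation event was publicly reported
    source_class: str    # one of the QUALITY keys
    valuation: float     # reported valuation, in any consistent unit


def point_mark(events, as_of, half_life_days=90.0):
    """Weighted mean of events on or before `as_of`; None if there are none.

    Assumption: the blend is an arithmetic weighted mean.
    """
    num = 0.0
    den = 0.0
    for e in events:
        if e.day > as_of:
            continue  # no lookahead: future events are ignored
        age = (as_of - e.day).days
        w = QUALITY[e.source_class] * 0.5 ** (age / half_life_days)
        num += w * e.valuation
        den += w
    return num / den if den > 0 else None


events = [
    Event(date(2025, 1, 1), "funding_round", 10.0),
    Event(date(2025, 4, 1), "secondary_market", 12.0),
]
mark_on_event_day = point_mark(events, date(2025, 4, 1))
mark_three_months_later = point_mark(events, date(2025, 7, 1))
```

In this sketch the two marks agree up to floating-point rounding: ageing both events by the same number of days scales both weights by the same factor, so the normalized blend does not move until a new event arrives.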
The recency half-life is the model's only fitted constant. It is selected from a fixed candidate set by pooled walk-forward error and disclosed, with the full candidate table, on the scorecard.
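One way to picture the selection step is the sketch below. It is an illustration under stated assumptions rather than the published procedure: the candidate set `(30, 60, 90, 180, 365)`, the mean absolute log error metric, and the function names are all hypothetical, and events are simplified to `(day, quality, value)` tuples with integer day offsets.

```python
import math


def blend(history, t, half_life):
    """Weighted mean of (day, quality, value) tuples strictly before day t."""
    num = 0.0
    den = 0.0
    for day, quality, value in history:
        if day < t:
            w = quality * 0.5 ** ((t - day) / half_life)
            num += w * value
            den += w
    return num / den if den > 0 else None


def walk_forward_error(series_by_company, half_life):
    """Pooled mean absolute log error, predicting each event from earlier ones."""
    errors = []
    for events in series_by_company.values():
        for day, _, value in events:
            pred = blend(events, day, half_life)
            if pred is not None:  # the first event of a company has no history
                errors.append(abs(math.log(value / pred)))
    return sum(errors) / len(errors) if errors else float("inf")


def select_half_life(series_by_company, candidates=(30, 60, 90, 180, 365)):
    """Return the candidate with the lowest pooled error and the full table."""
    table = {h: walk_forward_error(series_by_company, h) for h in candidates}
    best = min(table, key=table.get)
    return best, table
```

Returning the whole `table` alongside the winner mirrors the disclosure described above: the chosen constant can be checked against every candidate that was tried.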
In short: the mark is a quality- and recency-weighted blend of public observations with a 90-day half-life, and an exact step function between events (no invented drift, no lookahead). Staleness never moves the point; it widens the interval and lowers the confidence score.
Conformal interval method
Intervals come from split conformal prediction on walk-forward residuals: for each historical event we predict its value using strictly earlier events and record the staleness-normalized absolute log residual. The published band is mark x exp(+/- q x w), where q is the finite-sample conformal quantile at the stated coverage and w widens with staleness (w = 1 + 0.25 x staleness_days / 365).
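The band formula and the residual it is calibrated on can be sketched as follows. The function names are hypothetical, and dividing the absolute log residual by w is an assumption about what "staleness-normalized" means, chosen so that it inverts the widening applied in the band.

```python
import math


def staleness_width(staleness_days):
    """w = 1 + 0.25 x staleness_days / 365, as stated in the text."""
    return 1.0 + 0.25 * staleness_days / 365.0


def band(mark, q, staleness_days):
    """Multiplicative band: mark x exp(-q x w) to mark x exp(+q x w)."""
    w = staleness_width(staleness_days)
    return mark * math.exp(-q * w), mark * math.exp(q * w)


def normalized_residual(actual, predicted, staleness_days):
    """Absolute log residual of one walk-forward prediction, divided by w.

    Assumption: normalization is division by the same w used in the band.
    """
    return abs(math.log(actual / predicted)) / staleness_width(staleness_days)


fresh = band(10.0, 0.2, staleness_days=0)     # w = 1.0
stale = band(10.0, 0.2, staleness_days=365)   # w = 1.25, so a wider band
```

Because the band is symmetric in log space, the product of its lower and upper bounds equals the square of the mark, and the point estimate itself never moves as the band widens.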
Under exchangeability assumptions the band covers with at least the stated probability. Pooling residuals across companies weakens that guarantee, so empirical coverage is published quarterly instead of leaning on the theorem. When the sample is too small to support a level, meaning the required rank k = ceil((n+1) x coverage) exceeds the number of residuals n, the payload flags insufficient_sample rather than silently claiming it.
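A minimal sketch of the finite-sample quantile and the insufficient-sample check, assuming q is taken as the k-th smallest residual with k = ceil((n+1) x coverage). The function name and the shape of the returned dictionary are hypothetical; only the `insufficient_sample` flag name comes from the text.

```python
import math


def conformal_quantile(residuals, coverage):
    """Return the k-th smallest residual, or flag the sample as too small."""
    n = len(residuals)
    # Round first so float noise in (n + 1) * coverage cannot push an
    # exact integer up to the next rank.
    k = math.ceil(round((n + 1) * coverage, 9))
    if n == 0 or k > n:
        return {"q": None, "insufficient_sample": True, "n": n, "k": k}
    q = sorted(residuals)[k - 1]  # k is a 1-based rank
    return {"q": q, "insufficient_sample": False, "n": n, "k": k}


residuals = [0.01 * i for i in range(1, 11)]   # ten illustrative residuals
at_80 = conformal_quantile(residuals, 0.80)    # k = ceil(8.8) = 9
at_95 = conformal_quantile(residuals, 0.95)    # k = ceil(10.45) = 11 > 10
```

With ten residuals the 80% level is supported and the 95% level is flagged. Under this rank rule, an 80% band needs at least 4 residuals and a 95% band needs at least 19.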
In short: split conformal prediction on staleness-normalized absolute log residuals from walk-forward replay of the private universe. The finite-sample quantile q is the k-th smallest residual, with k = ceil((n+1) * coverage); bands are mark * exp(+/- q * width); insufficient sample sizes are flagged, never hidden.
Confidence is the stated probability that the next public observation falls inside the 80% band. It is scored quarterly with Brier scores and reliability diagrams against leave-one-out outcomes.
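For readers unfamiliar with the scoring, the sketch below shows a Brier score and the binned table behind a reliability diagram. The function names, the five-bin default, and the sample numbers are illustrative assumptions, not published results.

```python
def brier_score(forecasts, outcomes):
    """Mean squared gap between stated probabilities and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and aligned")
    total = sum((p - o) ** 2 for p, o in zip(forecasts, outcomes))
    return total / len(forecasts)


def reliability_table(forecasts, outcomes, bins=5):
    """Rows of (mean stated probability, observed hit rate, count) per bin."""
    buckets = [[] for _ in range(bins)]
    for p, o in zip(forecasts, outcomes):
        idx = min(int(p * bins), bins - 1)  # p == 1.0 goes in the top bin
        buckets[idx].append((p, o))
    rows = []
    for bucket in buckets:
        if bucket:
            mean_p = sum(p for p, _ in bucket) / len(bucket)
            hit_rate = sum(o for _, o in bucket) / len(bucket)
            rows.append((mean_p, hit_rate, len(bucket)))
    return rows


# Made-up example: stated confidence versus whether the next public
# observation landed inside the 80% band (1) or outside it (0).
stated = [0.8, 0.8, 0.6, 0.9]
inside = [1, 0, 1, 1]
score = brier_score(stated, inside)
rows = reliability_table(stated, inside)
```

A lower Brier score is better, and a well-calibrated confidence series has hit rates close to the mean stated probability in every row of the table.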
Governance & review
Marks publish only through the pipeline: every change appends a mark event (created / revised / deprecated / challenged / accepted) with its reason, actor type, and methodology version. Nothing overwrites history. Scorecards pass a gated publish queue (citation coverage, deterministic regeneration, coverage tolerance, sample floor) with a named reviewer before they ship.
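The append-only rule can be illustrated with a small in-memory log. This is a sketch only: `MarkEvent`, `MarkLog`, the field names, and the company name `ExampleCo` are hypothetical, and a real pipeline would persist events and enforce the review queue elsewhere.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Event kinds listed in the governance text.
ALLOWED_KINDS = {"created", "revised", "deprecated", "challenged", "accepted"}


@dataclass(frozen=True)  # frozen: a recorded event cannot be mutated
class MarkEvent:
    company: str
    kind: str
    reason: str
    actor_type: str
    methodology_version: str
    recorded_at: datetime


class MarkLog:
    """In-memory append-only log: events are added, never edited or removed."""

    def __init__(self):
        self._events = []

    def append(self, company, kind, reason, actor_type, methodology_version):
        if kind not in ALLOWED_KINDS:
            raise ValueError(f"unknown mark event kind: {kind!r}")
        if not reason:
            raise ValueError("every mark event needs a reason")
        event = MarkEvent(company, kind, reason, actor_type,
                          methodology_version, datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def history(self, company):
        """Full event history for a company, oldest first, as a tuple."""
        return tuple(e for e in self._events if e.company == company)


log = MarkLog()
log.append("ExampleCo", "created", "first public funding_round event",
           "pipeline", "v0.2.0")
log.append("ExampleCo", "revised", "new tender_offer event",
           "pipeline", "v0.2.0")
```

A revision here is a second event rather than an update to the first, so the history for `ExampleCo` still shows the original `created` entry.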
Model output never publishes marks, runs destructive jobs, alters legal text, bypasses the review queue, or exports client data.
Limitations
- Coverage is limited to companies with publicly reported valuation events; absence of events freezes the mark and widens the band rather than inventing movement.
- IPO prints and terminated acquisition offers are never mark inputs; the former are held-out validation outcomes, the latter chart markers only.
- Calibration pools residuals across companies and time; the conformal guarantee weakens accordingly and empirical coverage is published instead.
- Reported valuations inherit the share-count basis of the cited articles; bases differ across outlets and are recorded in event notes.
- This demo universe is an illustrative backtest, not investment advice and not a live pricing service.
Defined terms
- Mark — an indicative point estimate of company value derived from public events; never a transactable price.
- Interval / band — the conformal range around the mark at a stated coverage level (80% or 95%).
- Conformal quantile (q) — the finite-sample quantile of walk-forward residuals that sizes the band.
- Staleness — days since the most recent mark input; widens the band and lowers confidence, never moves the mark.
- Confidence — the model's stated probability that the next public observation falls inside the 80% band; Brier-scored quarterly.
- Walk-forward residual — the error from predicting a historical event using only strictly earlier events.
- Provenance: backtest — a replay of public events; the label every figure on this surface carries today.
- Provenance: live — marks from the production pipeline with append-only events and review-queue publication; none are published yet.
Change log
| Version | Date | Summary |
|---|---|---|
| v0.1.0 | 2026-05-20 | Initial weighted-mean mark with fixed +/-8% bounds and heuristic confidence. |
| v0.2.0 | 2026-06-11 | Split conformal multiplicative intervals (80/95) on walk-forward residuals; recency half-life selected by disclosed walk-forward error; staleness-widened bands; Brier-scored confidence; held-out IPO validation with published misses. |
Live vs backtested
Every figure on these surfaces is a backtest replay of publicly reported events and carries provenance=backtest. A backtest must never pose as a live mark: live marks (provenance=live) ship only from the production pipeline, with append-only mark events and review-queue publication.