
Citation Value Score Methodology 2026

The public methodology for Citation Value Score: four signals, weights, validation against marketplace pricing data, and the engineering details that make CVS reproducible.

By Ramanath, CTO & Co-Founder at Presenc AI · Last updated: April 2026

Why a Methodology Paper Matters

Pricing AI content without a defensible methodology means pricing on guesses. The marketplaces (Cloudflare Pay-Per-Crawl, TollBit, ProRata, ScalePost) each price differently, the bilateral licensing deals are mostly opaque, and individual publishers have no way to anchor their prices in something other than competitor-pacing. Citation Value Score (CVS) is a methodology designed to give all participants a shared, reproducible value figure that pricing can be calibrated against.

This page is the public methodology paper for CVS as of April 2026. It describes the four signals, the weights, the validation procedure, and the calibration choices. It is intentionally detailed enough to allow third-party reproduction of approximate scores, without disclosing the exact weights and proprietary calibration we use internally for paying customers.

The Four Signals

Crawl activity (25%). Frequency and bot-composition of AI crawler fetches against the cited page during a defined observation window. Captured from server logs, CDN logs, or Cloudflare Worker plus D1 logging. Decomposed by major bot identity (GPTBot, OAI-SearchBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider, Amazonbot, Applebot-Extended, Meta-ExternalAgent, plus the long tail).

Content quality (35%). Structural and editorial properties that influence whether AI assistants ground answers in this content. Schema completeness (Schema.org coverage and validity), citation density (number of references per 1000 words), primary-research indicator (does the page contain original data, methodology, or analysis), freshness (lastmod and content-revision recency).
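Two of these sub-signals can be sketched directly. The following is illustrative only: the 180-day half-life and the exponential decay shape for freshness are invented parameters, not part of the published methodology.

```python
from datetime import date

def citation_density(reference_count: int, word_count: int) -> float:
    """References per 1000 words, as defined for the content-quality signal."""
    if word_count <= 0:
        return 0.0
    return reference_count / (word_count / 1000)

def freshness(lastmod: date, today: date, half_life_days: int = 180) -> float:
    """Recency score in (0, 1] that halves every `half_life_days`.
    The half-life is an illustrative assumption, not a published parameter."""
    age_days = max((today - lastmod).days, 0)
    return 0.5 ** (age_days / half_life_days)

density = citation_density(reference_count=12, word_count=3000)  # 4.0 refs per 1000 words
recency = freshness(lastmod=date(2026, 3, 1), today=date(2026, 4, 30))
```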

Authority (25%). Domain-authority and entity-graph proxies. Wikipedia presence and article quality, primary editorial coverage in established outlets, citation-graph centrality from other authoritative sources, brand entity-linkage strength in major knowledge graphs.

Outcomes (15%). Downstream behaviour signal: whether observed citations actually drove user clicks, sessions, or conversions. Captured where attribution is observable (UTM-tagged citation links, server-log referer patterns from AI assistants, customer-side conversion data with permission).
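The four-signal blend above reduces to a plain weighted sum. This is a sketch, not the production implementation: it assumes each signal has already been normalised into [0, 1], and that normalisation is part of the proprietary calibration.

```python
# Published signal weights for the April 2026 methodology version.
WEIGHTS = {
    "crawl_activity": 0.25,
    "content_quality": 0.35,
    "authority": 0.25,
    "outcomes": 0.15,
}

def cvs_composite(signals: dict[str, float]) -> float:
    """Weighted sum of normalised signal scores, each assumed to lie in [0, 1].
    How each raw signal is normalised is not specified here."""
    if set(signals) != set(WEIGHTS):
        raise ValueError("expected exactly the four published signals")
    return sum(WEIGHTS[name] * value for name, value in signals.items())

score = cvs_composite({
    "crawl_activity": 0.60,
    "content_quality": 0.80,
    "authority": 0.70,
    "outcomes": 0.40,
})  # 0.25*0.60 + 0.35*0.80 + 0.25*0.70 + 0.15*0.40 = 0.665
```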

How the Weights Were Set

Initial weights were anchored against marketplace pricing data. We collected per-fetch and per-citation prices from observable Cloudflare Pay-Per-Crawl, TollBit, ProRata, and ScalePost transactions where the publisher and content were also tracked in Presenc AI's monitoring. The weights were then optimised to maximise the correlation between CVS composite scores and observed market prices for those publisher-content pairs. The optimisation produced the four-signal weighting reported above: content quality dominant at 35%, crawl activity and authority tied at 25% each, and outcomes at 15%.

The 15% weight on outcomes is intentionally lower than its conceptual importance because outcomes attribution is the noisiest of the four signals. As attribution infrastructure matures (especially through ERC-8004 and AP2-style structured payment-and-citation records), the outcomes weight will likely rise. The methodology version reported here is locked for the April 2026 quarterly cycle.
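A minimal sketch of the anchoring procedure: enumerate candidate weight vectors on a coarse grid, score each by Spearman rank correlation between the resulting composites and observed marketplace prices, and keep the best. The data, grid step, and brute-force search are illustrative assumptions; the internal optimisation is not disclosed.

```python
from itertools import product

def ranks(values):
    """Average ranks (1-based); tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy)

def fit_weights(signal_rows, prices, step=0.05):
    """Grid-search four weights summing to 1 that maximise the Spearman
    correlation between composite scores and observed market prices."""
    steps = round(1 / step)
    best, best_r = None, -2.0
    for a, b, c in product(range(steps + 1), repeat=3):
        d = steps - a - b - c
        if d < 0:
            continue
        w = [a * step, b * step, c * step, d * step]
        scores = [sum(wi * si for wi, si in zip(w, row)) for row in signal_rows]
        r = spearman(scores, prices)
        if r > best_r:
            best, best_r = w, r
    return best, best_r

# Hypothetical normalised signal rows (crawl, quality, authority, outcomes)
# and hypothetical observed per-citation prices for the same pages.
signal_rows = [
    [0.2, 0.9, 0.3, 0.1],
    [0.5, 0.4, 0.6, 0.2],
    [0.7, 0.8, 0.5, 0.9],
    [0.1, 0.2, 0.4, 0.3],
    [0.9, 0.6, 0.7, 0.5],
]
prices = [0.80, 0.40, 0.85, 0.20, 0.60]
weights, r = fit_weights(signal_rows, prices)
```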

Validation

CVS is validated against marketplace pricing on a rolling basis. The current validation figure (Spearman rank correlation between CVS scores and observed marketplace per-citation prices, across the publisher-content sample where both are available) is approximately r = 0.78, against a target of r = 0.80 by Q4 2026. The gap is mostly driven by outcomes-attribution noise; we expect it to close as attribution infrastructure improves.

The methodology is also cross-validated against bilateral licensing implied per-citation rates where those are publicly disclosed or shared under NDA. The bilateral correlation is lower (around r = 0.55) because bilateral deals carry large fixed-fee components that dilute the per-citation signal. CVS is calibrated to marketplace prices, not bilateral implied rates, because marketplace prices are the more directly relevant comparator for non-mega-publisher content.

What CVS Is Not

CVS is not a marketplace price. It is the value figure against which marketplace prices can be evaluated. CVS is also not a domain-authority score. Domain authority is one of the inputs to the authority signal, but CVS's composite is multidimensional in ways domain authority is not.

CVS is also not deterministic. The same page in the same observation window can produce slightly different CVS scores under different observation samples because the underlying signals are sampled (crawl events from a sample of logs, citations from a sample of probes, outcomes from a sample of attributable sessions). The reported CVS is a point estimate; the methodology supports producing confidence intervals on demand for use cases where uncertainty quantification matters.
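The on-demand confidence intervals mentioned above could, for example, be produced by a percentile bootstrap over the sampled observations. The resampling scheme, the sample values, and the `bootstrap_ci` helper here are all hypothetical, not the published procedure.

```python
import random

def bootstrap_ci(samples, stat, n_boot=2000, alpha=0.05, seed=42):
    """Percentile-bootstrap confidence interval for a statistic of sampled
    observations (e.g. per-window CVS estimates). Illustrative sketch only."""
    rng = random.Random(seed)
    n = len(samples)
    estimates = []
    for _ in range(n_boot):
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        estimates.append(stat(resample))
    estimates.sort()
    low = estimates[int((alpha / 2) * n_boot)]
    high = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return low, high

def mean(xs):
    return sum(xs) / len(xs)

# Hypothetical CVS point estimates from repeated observation samples.
samples = [0.61, 0.64, 0.59, 0.66, 0.63, 0.60, 0.65, 0.62, 0.58, 0.67]
low, high = bootstrap_ci(samples, mean)
```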

How Publishers and Brands Use CVS

Publishers use CVS to anchor pricing decisions in the marketplaces, to evaluate which content tiers are over- or under-priced relative to their value, and to negotiate bilateral licensing terms with defensible per-citation figures. Brands use CVS to track whether their content investments translate into citation value, decomposed by the four signals so that the diagnostic shows where the gap is. AI labs use CVS as a procurement input when evaluating whether the prices they pay for content are commensurate with the value received.

Versioning and Updates

The methodology is versioned quarterly. The current version is CVS-2026Q2, locked April 30, 2026. Previous versions and changelogs are available on request. Major changes (new signal, weight rebalance, validation procedure update) are pre-announced 30 days before release to allow publishers and brands to recalibrate dependent pricing.

Frequently Asked Questions

Why is content quality the highest-weighted signal?
It is the signal that correlates most strongly with observed marketplace pricing in the validation sample. AI labs and marketplaces pay more for content that is structurally well-organised and editorially substantial because that content grounds answers more reliably. Crawl activity and authority matter, but they matter as inputs to whether the AI lab considers the content worth ingesting in the first place; content quality is what determines its post-ingestion value.

Will CVS scores and marketplace prices converge?
They are converging but will not become identical. Marketplace prices reflect supply-and-demand dynamics within the marketplace, which add noise on top of intrinsic content value. CVS is calibrated to marketplace prices but is intended to be more stable across short-term marketplace volatility.

Can third parties implement CVS themselves?
The methodology is published in enough detail to allow approximate self-implementation. The proprietary calibration weights and validation data are not public, so a third-party implementation will produce similar but not identical scores. Publishers serious about CVS-anchored pricing typically use Presenc AI for the canonical implementation.

How often does the methodology change?
Quarterly. Each release is pre-announced 30 days before the new methodology goes live to allow publishers and brands to recalibrate. The intent is to stabilise the methodology over time as the underlying market matures, with major weight changes becoming rarer past 2027.
