
LLM citability trends for federal profiles

This page is the longitudinal before/after diff, our primary measure of whether the corpus is becoming more citable over time. Each cadenced citability bench run appends one timestamped entry to an append-only log; this page tracks how the average composite, the three deterministic heuristics, and the measured citation rate move across those runs. The trend is free to view and crawlable, and this page never makes a live model call to compute it.
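As an illustration of the append-only behaviour described above, here is a minimal sketch in Python. The JSON Lines file layout, the field names, and the helper names `append_run` and `read_runs` are assumptions made for the example; the page does not say how entries are actually stored.

```python
import json
from pathlib import Path


def append_run(log_path, entry):
    """Append one bench-run entry as a single JSON line.

    Hypothetical helper: opening in mode "a" only ever writes at the end
    of the file, so earlier entries are never rewritten.
    """
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry, sort_keys=True) + "\n")


def read_runs(log_path):
    """Return every recorded run in file order, which is oldest first."""
    with Path(log_path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Reading the log back in file order gives the oldest-first series without any sorting step, which is one reason an append-only layout suits a longitudinal trend.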

Bench runs: 5
First run: 2026-06-07
Latest run: 2026-07-11
Latest avg score: 74.1

What moved this score

Across 5 bench runs (2026-06-07 → 2026-07-11), the average composite fell 2.5 points (76.6 → 74.1).

Content depth held flat (0 pts), structured data fell 7 points (64% → 57%), and source freshness held flat at 100%. These are the three deterministic heuristics, which move when profiles gain narrative, sources, and structured fields.

The measured citation rate held flat at 0% (0 points of change) across the cadenced model panel. It is the empirical signal that the heuristics stand in for.

Composite weights — content depth 30%, structured data 25%, source freshness 15%, measured citation rate 30%. Dataset version 5.20260711.r5.
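The weights above can be read as a weighted sum over 0-100 component scores. The sketch below is one possible reading, not the bench's confirmed formula: the component key names and the `composite` function are hypothetical, and the rule that a missing component has its weight redistributed across the remaining ones is an assumption the page does not state.

```python
# Weights as published on this page (fractions of 1.0).
WEIGHTS = {
    "content_depth": 0.30,
    "structured_data": 0.25,
    "source_freshness": 0.15,
    "citation_rate": 0.30,
}


def composite(scores):
    """Weighted 0-100 composite over the components that are present.

    `scores` maps a component name to a 0-100 value, or to None when the
    component was not measured. ASSUMPTION: unmeasured components are
    dropped and the remaining weights are rescaled to sum to 1.
    """
    present = {k: v for k, v in scores.items() if v is not None}
    total_weight = sum(WEIGHTS[k] for k in present)
    if total_weight == 0:
        raise ValueError("no components to score")
    return sum(WEIGHTS[k] * v for k, v in present.items()) / total_weight
```

Fed the latest roster averages (75, 57, 100, 0), this gives 51.75 with all four components and about 73.9 when the citation rate is treated as unmeasured. Both inputs are rounded roster averages rather than per-profile scores, so neither figure should be expected to reproduce the published 74.1 exactly.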

First run vs latest run

The change in each tracked signal from the earliest recorded bench run to the most recent. Positive values mean the corpus became more citable on that dimension.

Average composite

74.1

-2.5 (down) vs first run (76.6)

Roster-level average of the 0-100 citability composite.

Measured citation rate

0 pts (flat) vs first run (0%)

Share of cadenced model-panel trials that cited a canonical profile URL.

Content depth

75%

0 pts (flat) vs first run (76%)

Narrative length, sources, and structured sections vs the depth floors.

Structured data

57%

-7 pts (down) vs first run (64%)

JSON-LD-enriching fields populated for machine extraction.

Source freshness

100%

0 pts (flat) vs first run (100%)

Recency of the profiles' source-verification timestamps.

Empirical component

0 pts (flat) vs first run (0%)

Average measured citation rate over the profiles that were measured.

Run-by-run series

Every recorded bench run, oldest first. Each row is an append-only longitudinal entry — prior runs are never overwritten.

Citability bench runs over time with average composite, per-component scores, and measured citation rate
Run date	Profiles	Avg composite	Depth	Structured	Freshness	Cite rate
2026-06-07	582	76.6	76%	64%	100%	0%
2026-06-08	582	76.6	76%	64%	100%	0%
2026-07-04	4167	74.3	75%	58%	100%	0%
2026-07-05	4460	74.2	75%	58%	100%	0%
2026-07-11	5004	74.1	75%	57%	100%	0%
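The first-versus-latest deltas can be recomputed from the series above. The following Python sketch parses the table as tab-separated text and subtracts the first row from the last; the function names and dictionary keys are illustrative choices, not part of the bench. Because it works from the rounded values shown in the table, a delta can differ from the one reported on this page. Content depth, for example, comes out as -1 here while the page reports 0 pts, presumably because the page works from unrounded values.

```python
import csv
import io

# The run-by-run series from this page, as tab-separated text.
SERIES_TSV = (
    "Run date\tProfiles\tAvg composite\tDepth\tStructured\tFreshness\tCite rate\n"
    "2026-06-07\t582\t76.6\t76%\t64%\t100%\t0%\n"
    "2026-06-08\t582\t76.6\t76%\t64%\t100%\t0%\n"
    "2026-07-04\t4167\t74.3\t75%\t58%\t100%\t0%\n"
    "2026-07-05\t4460\t74.2\t75%\t58%\t100%\t0%\n"
    "2026-07-11\t5004\t74.1\t75%\t57%\t100%\t0%\n"
)


def parse_series(tsv_text):
    """Parse the table into a list of dicts with numeric fields, oldest first."""
    rows = []
    for rec in csv.DictReader(io.StringIO(tsv_text.strip()), delimiter="\t"):
        rows.append({
            "run_date": rec["Run date"],
            "profiles": int(rec["Profiles"]),
            "avg_composite": float(rec["Avg composite"]),
            "depth": int(rec["Depth"].rstrip("%")),
            "structured": int(rec["Structured"].rstrip("%")),
            "freshness": int(rec["Freshness"].rstrip("%")),
            "cite_rate": int(rec["Cite rate"].rstrip("%")),
        })
    return rows


def first_vs_latest(rows):
    """Latest minus first for each tracked signal; positive means more citable."""
    first, latest = rows[0], rows[-1]
    keys = ("avg_composite", "depth", "structured", "freshness", "cite_rate")
    return {k: round(latest[k] - first[k], 1) for k in keys}


rows = parse_series(SERIES_TSV)
deltas = first_vs_latest(rows)
```

The same rows also show the roster growing from 582 to 5004 profiles over the period, which is worth keeping in mind when comparing averages across runs.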

Latest run — by model

Per-model citation rate from the most recent bench run (2026-07-11). Each model receives the same controlled question set; trials in which a model did not answer are excluded from that model's denominator.

Model	Cite rate	Cites	Trials
anthropic	0%	0	9
openai	0%	0	5
perplexity	0%	0	9
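A per-model rate with unanswered trials left out of the denominator could be computed along the following lines. This is a sketch under stated assumptions: the outcome labels (`cited`, `not_cited`, `no_answer`) and the `citation_rates` function are hypothetical, since the page does not describe how the bench records individual trials.

```python
from collections import defaultdict


def citation_rates(trials):
    """Compute per-model citation rates from (model, outcome) pairs.

    Hypothetical outcome labels: "cited", "not_cited", "no_answer".
    Trials with "no_answer" are skipped, so they count in neither the
    numerator nor the denominator. Returns
    {model: (cites, answered_trials, rate_percent)}.
    """
    cites = defaultdict(int)
    answered = defaultdict(int)
    for model, outcome in trials:
        if outcome == "no_answer":
            continue  # excluded from this model's denominator
        answered[model] += 1
        if outcome == "cited":
            cites[model] += 1
    return {m: (cites[m], n, 100.0 * cites[m] / n) for m, n in answered.items()}
```

A model that answered no trials at all has an empty denominator and therefore does not appear in the result, which avoids dividing by zero.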
