Evaluation

Evaluation v0: ranked feed vs simple baselines

Under-cited family | Distributional checks only

This page calls GET /api/v1/evaluation/compare so you can inspect the same candidate pool under three orderings: materialized ranking, citation-sorted, and date-sorted. Nothing here measures whether researchers would find the papers useful; it only shows distributional checks on short lists.
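A minimal sketch of calling the endpoint from a script, for readers who want the raw payload rather than this page. Only the path /api/v1/evaluation/compare comes from this page. The base URL, the query parameter names, and the JSON response format are assumptions to verify against the actual API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

COMPARE_PATH = "/api/v1/evaluation/compare"  # path as stated on this page


def build_compare_url(base_url: str, **params: object) -> str:
    """Join the base URL, the compare path, and any query parameters.

    Parameter names passed in here are hypothetical; this page does not
    document the endpoint's query string.
    """
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = base_url.rstrip("/") + COMPARE_PATH
    return f"{url}?{query}" if query else url


def fetch_compare(base_url: str, **params: object) -> dict:
    """GET the compare payload and decode it as JSON (assumed format)."""
    with urlopen(build_compare_url(base_url, **params), timeout=10) as resp:
        return json.load(resp)
```

For example, build_compare_url("http://localhost:8000", limit=12) produces a URL with a hypothetical limit parameter. The host is a placeholder.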

Pool size

121

Run label

shadow-generalization-product-candidate-ranking-v1

Embedding

shadow-generalization-text-embedding-v1

Generated at

2026-06-26

Run label filter: shadow-generalization-product-candidate-ranking-v1

Focus paper: https://openalex.org/W4404639702 | compare limit 12

Not visible in the current compare window. The run context is still pinned while you switch families.

Interpretation guardrails

Interpretation notes

These outputs help compare ranking behavior and expose drift; they are not expert-reviewed evidence that papers are useful to researchers.

  • Side-by-side lists share the same candidate pool for the selected recommendation family and corpus snapshot.
  • Recency, citation, and topic summaries are coarse proxies over the short lists shown; they do not measure whether a researcher would find a paper useful.
  • Topic overlap uses Jaccard similarity on topic labels attached to papers in this corpus, not semantic similarity of full text.
  • Use ranked outputs for product behavior; use this endpoint to sanity-check drift against naive orderings.

Run context

Run and pool

Run shadow-generalization-product-candidate-ranking-v1 | rank-83787b91ef | snapshot source-snapshot-shadow-generalization-v1-20260521 | embedding shadow-generalization-text-embedding-v1 | pool size 121

Low-cite candidate pool (revision v0): included core works in this corpus snapshot, year≥2019, citations≤30, non-empty title and abstract. Matches docs/candidate-pool-low-cite.md and the materialized undercited family scope.

Low-cite gate from run config: year at least 2019, citations at most 30 (revision v0).
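The gate described above can be sketched as a simple predicate. The thresholds (year at least 2019, citations at most 30, non-empty title and abstract, included core works) come from this page. The Work record and its field names are hypothetical and may not match the actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    # Hypothetical record shape; field names are illustrative only.
    work_id: str
    title: str
    abstract: str
    year: int
    citation_count: int
    is_core: bool = True


def passes_low_cite_gate(work: Work, min_year: int = 2019,
                         max_citations: int = 30) -> bool:
    """Return True when a work meets the low-cite pool criteria (revision v0)."""
    return (
        work.is_core
        and work.year >= min_year
        and work.citation_count <= max_citations
        and bool(work.title.strip())      # non-empty title
        and bool(work.abstract.strip())   # non-empty abstract
    )
```

Both thresholds are inclusive, so a 2019 paper with exactly 30 citations stays in the pool.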

Topic labels are imported metadata and can be noisy; use them as coarse navigation hints, not authoritative classifications.

List overlap

Topic label overlap between lists

Jaccard index on the set of OpenAlex topic labels appearing in the top tags of each paper in the list. High overlap means similar topic mix, not similar intellectual content.

Ranked vs citation baseline
0.2258
Ranked vs date baseline
0.2121
Citation vs date baseline
0.2000
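A minimal sketch of the Jaccard index on topic label sets, assuming each list is reduced to the set of unique labels across its papers. The function names are illustrative, and returning 0.0 for two empty sets is an assumed convention.

```python
from typing import Iterable, Set


def list_topic_labels(papers: Iterable[Iterable[str]]) -> Set[str]:
    """Collect the unique topic labels across all papers in one list."""
    labels: Set[str] = set()
    for paper_labels in papers:
        labels.update(paper_labels)
    return labels


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard index |A & B| / |A | B|; 0.0 when both sets are empty."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def jaccard_from_sizes(size_a: int, size_b: int, shared: int) -> float:
    """Same index computed from set sizes and the number of shared labels."""
    return shared / (size_a + size_b - shared)
```

Under that assumption, the unique topic counts reported for each list (24 ranked, 14 citation, 16 date) are consistent with 7 shared labels for ranked vs citation (7/31 ≈ 0.2258), 7 for ranked vs date (7/33 ≈ 0.2121), and 5 for citation vs date (5/25 = 0.2000).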

Ranked (family)

List size 12

Focus paper: https://openalex.org/W4404639702 is not visible in this arm.

Materialized ranking run: order by final_score descending, then work_id (stable tie-break). Blend and signals follow this run's persisted family_weights and paper_scores (semantic may be used for Emerging when configured).

Order: final_score DESC, work_id ASC
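The stated ordering can be sketched as a two-key sort. The dictionary row shape and the limit default are assumptions for illustration. Only the sort keys (final_score descending, work_id ascending) come from this page.

```python
from typing import Dict, List


def order_ranked(rows: List[Dict], limit: int = 12) -> List[Dict]:
    """Sort by final_score descending, then work_id ascending as tie-break."""
    ordered = sorted(rows, key=lambda r: (-r["final_score"], r["work_id"]))
    return ordered[:limit]
```

Because work_id breaks ties, two papers with the same final_score come back in the same order on every request.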

Mean year

2025.2

Median cites

0.5

Unique topics

24

Proxy stats (list-only; not relevance)

Recency
mean year 2025.17; min-max 2025-2026; share in latest two years 100.0%
Citations
mean 1.08; median 0.50; range 0-6
Topic mix
24 unique labels in list; top: Music Technology and Sound Studies, Advanced Vision and Imaging, Image and Video Stabilization, Matrix Theory and Algorithms, Authorship Attribution and Profiling
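A sketch of how list-only proxy stats like these could be computed for a non-empty list. This page does not define "share in latest two years". The sketch assumes it means the list's newest year and the year before it, which is only one possible reading. The function name and output keys are hypothetical.

```python
from statistics import mean, median
from typing import Dict, List


def proxy_stats(years: List[int], citations: List[int]) -> Dict[str, object]:
    """Summarize recency and citation spread for one short, non-empty list."""
    latest = max(years)
    # Assumed window: the newest year in the list and the year before it.
    recent = sum(1 for y in years if y >= latest - 1)
    return {
        "mean_year": round(mean(years), 2),
        "year_range": (min(years), latest),
        "share_latest_two_years": round(100.0 * recent / len(years), 1),
        "mean_citations": round(mean(citations), 2),
        "median_citations": median(citations),
        "citation_range": (min(citations), max(citations)),
    }
```

With toy inputs years = [2024, 2025, 2025, 2026] and citations = [0, 1, 3, 6], three of four papers fall in the assumed window, so the share is 75.0%.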

Citation baseline

List size 12

Focus paper: https://openalex.org/W4404639702 is not visible in this arm.

Popularity-style baseline on the same pool: highest citations first (not a relevance judgment).

Order: citation_count DESC, year DESC, openalex_id ASC

Mean year

2024.6

Median cites

5.0

Unique topics

14

Proxy stats (list-only; not relevance)

Recency
mean year 2024.58; min-max 2024-2025; share in latest two years 100.0%
Citations
mean 5.67; median 5.00; range 3-11
Topic mix
14 unique labels in list; top: Music Technology and Sound Studies, Music and Audio Processing, Speech and Audio Processing, Diverse Musicological Studies, Hearing Loss and Rehabilitation

Date baseline

List size 12

Focus paper: https://openalex.org/W4404639702 is not visible in this arm.

Pure recency baseline on the same pool: newest year first (not a relevance judgment).

Order: year DESC, openalex_id ASC
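The two baseline orderings stated on this page (citation_count DESC, year DESC, openalex_id ASC for the citation baseline; year DESC, openalex_id ASC for the date baseline) can be sketched over the same pool as follows. The dictionary row shape and the limit default are assumptions for illustration.

```python
from typing import Dict, List


def order_by_citations(rows: List[Dict], limit: int = 12) -> List[Dict]:
    """Citation baseline: most cited first, then newest, then openalex_id."""
    key = lambda r: (-r["citation_count"], -r["year"], r["openalex_id"])
    return sorted(rows, key=key)[:limit]


def order_by_date(rows: List[Dict], limit: int = 12) -> List[Dict]:
    """Date baseline: newest year first, then openalex_id ascending."""
    key = lambda r: (-r["year"], r["openalex_id"])
    return sorted(rows, key=key)[:limit]
```

Neither function looks at any relevance signal, which is why the page labels both arms as naive orderings.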

Mean year

2026.0

Median cites

0.0

Unique topics

16

Proxy stats (list-only; not relevance)

Recency
mean year 2026.00; min-max 2026-2026; share in latest two years 100.0%
Citations
mean 0.00; median 0.00; range 0-0
Topic mix
16 unique labels in list; top: Music and Audio Processing, Music Technology and Sound Studies, Hearing Loss and Rehabilitation, Multisensory perception and integration, Tactile and Sensory Interactions

Generated at 2026-06-26T21:42:07.989207Z. This page shows citation and date baselines plus distributional checks on short lists. For roadmap-style framing, see /api/v1/evaluation/summary.