
How the library is built

Verasight source reports are the canonical record. The Data Library presents key findings from them.

34 featured topics
204 indexed questions


Source of truth

verasight.io/reports

Verasight source reports are the canonical record. Each one carries the full survey instrument, weighting design, and field methodology in one place.

The Data Library presents key findings from those reports. Every featured topic and indexed question on this site cites back to the source. To see how the full methodology works for any wave, click through to its report.

Open Verasight reports

01 / Vocabulary

The editorial primitives.

Every page in this library resolves to one of four editorial primitives. Categories group, featured topics tell, indexed questions reference, supporting data justifies.

Category
The top-level grouping. The first public set is AI & Tech, Money, Politics, Health, Culture. Categories are an editorial taxonomy on top of the data, not a Verasight module name.
Featured topic
A curated narrative built from multiple supporting questions in one or more waves. Featured topics are the main editorial unit on site.
Indexed question
A canonical question that is not absorbed into a published featured topic. It stays browseable and searchable on its own page, lighter in treatment than a featured topic.
Supporting data
Underlying questions, crosstabs, methodology notes, and source-report links that justify a featured topic's story, charts, and layout.

02 / Pipeline

Six stages from raw wave to canonical library.

The pipeline runs as a fixed sequence of inspectable stages. Each stage owns one job and writes a checkable artifact. A failure halts the run with a descriptive error, not silent drift.

01 Ingest
Raw Verasight files (.sav, .csv, .dta, .rds, .xlsx) become long-format intermediate tables with metadata sidecars.

02 Normalize
Null handling, PII scrub, demographic normalization, and respondent context propagation.

03 Weight
Row-level weight validation and reusable weighted summaries.

04 Crosstab
Banner and extra demographic breakdowns with low-N flags and canonical dimension names.

05 Enrich
Questions package into a wave-scoped bundle with toplines, crosstabs, methodology, citation, and slug.

06 Emit
Canonical question JSON, long-format per-question CSV, per-wave summary, and a site-wide index.

The pipeline run ends at stage 06, Emit. Editorial curation happens afterward, on the committed canonical artifacts, and always passes through human review before publication.
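The fixed sequence above can be sketched as a small runner: each stage owns one job, writes one checkable artifact, and a failure halts the run with a descriptive error. This is a hypothetical sketch; the stage names come from the page, but the function signatures, paths, and artifact format are assumptions.

```python
from pathlib import Path

# Stage names from the page; everything else here is an assumption.
STAGES = ["ingest", "normalize", "weight", "crosstab", "enrich", "emit"]

def run_pipeline(wave: str, out_dir: Path, stage_fns: dict) -> list[Path]:
    """Run the six stages in order; each writes one checkable artifact.
    A failure halts the run with a descriptive error, not silent drift."""
    artifacts: list[Path] = []
    for i, stage in enumerate(STAGES, start=1):
        artifact = out_dir / f"{i:02d}_{stage}_{wave}.json"
        try:
            stage_fns[stage](wave, artifact)  # each stage owns exactly one job
        except Exception as exc:
            raise RuntimeError(
                f"stage {i:02d} {stage} failed for wave {wave}: {exc}") from exc
        artifacts.append(artifact)
    return artifacts
```

The halt-on-failure behavior is the point of the design: a broken stage stops the run before later stages can build on bad intermediate data.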

03 / Publication

How canonical data becomes a published page.

Canonical data is regenerable. Editorial decisions are committed as artifacts. The site is the read view over both.

01 Canonical layer
One JSON and one CSV per question, plus a per-wave summary and a site-wide index. Stable contract, regenerable from raw inputs.

02 Curation layer
Category mappings, topic proposals, and curated featured topics commit on top of the canonical layer. Editorial decisions are tracked as artifacts, not implicit.

03 Publication layer
The site reads committed artifacts. Published featured topics drive home, category, and search surfaces. Absorbed questions never duplicate as standalone pages.

A canonical question absorbed into a published featured topic does not duplicate as a standalone page. Indexed-question pages are exactly the canonical questions that are not absorbed.
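The rule above is a set difference: indexed-question pages are exactly the canonical questions not absorbed into a published featured topic. A minimal sketch, with illustrative data shapes that are assumptions rather than the site's real schema:

```python
# Sketch of the publication rule: indexed pages = canonical - absorbed.
# The dict shapes for featured topics are illustrative assumptions.

def indexed_questions(canonical_slugs: set[str],
                      featured_topics: list[dict]) -> set[str]:
    """Return the slugs that get standalone indexed-question pages."""
    absorbed = {slug
                for topic in featured_topics
                if topic.get("published")           # drafts absorb nothing
                for slug in topic.get("question_slugs", [])}
    return canonical_slugs - absorbed
```

Note that an unpublished topic proposal absorbs nothing, so its questions remain browseable as indexed-question pages until the topic ships.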

04 / Sources

The canonical source is the report.

The Data Library reads, presents, and cites. The full source is upstream at Verasight. Every featured topic links back to the report and, where available, to the question anchor inside it.

Canonical destination
The full underlying source report at verasight.io/reports. The Data Library is a key findings library, not a replacement for the source.
Question-level citation
Each canonical question links to its question anchor in the source report when available, and falls back to the report root when not.
Raw data on site
V1 does not surface raw downloads on site. The pipeline already emits per-question CSVs; publishing them on site is planned as a future surface.
Transparency
Methodology is presented under AAPOR transparency standards, with field methodology and weighting attributed to Verasight upstream.

05 / Citations

Built to be cited accurately by machine and human.

Accurate citation requires structure on the page, structure in the source, and a stable path back to canonical. The library publishes all three.

llms.txt
Root-level index intended for LLM crawlers.
.md mirrors
Every page has a markdown mirror for clean LLM citation.
JSON-LD
Dataset and FAQPage schema on every relevant page.
robots.txt
Allow-all. The site actively wants to be cited.
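The Dataset schema mentioned above is schema.org JSON-LD. A minimal sketch of what emitting it might look like; the `@type` and `@context` values are standard schema.org, but the specific properties and values the site actually publishes are assumptions:

```python
import json

# Illustrative Dataset JSON-LD; the exact properties the site emits
# are assumptions, not its published schema.
def dataset_jsonld(name: str, page_url: str, report_url: str) -> str:
    doc = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "url": page_url,        # the library page being cited
        "isBasedOn": report_url,  # canonical Verasight source report
        "creator": {"@type": "Organization", "name": "Verasight"},
    }
    return json.dumps(doc, indent=2)
```

On the page this would sit in a `<script type="application/ld+json">` tag, which is what crawlers read.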

06 / Upstream

Polling methodology lives upstream at Verasight.

Field methodology, weighting design, and verbatim instrument text are authored by Verasight and cited from the source report. The Data Library carries enough methodology inline to read a topic, and points to the source for everything else.

On a topic page
Mode, population, field dates, base size, margin of error, weight variable, and module name appear inline with each featured topic.
In the source report
Full field methodology, exact weighting variables and targets, sponsor and design context, and verbatim instrument text live at verasight.io/reports.
In the canonical JSON
Per-question methodology, citation, and crosstab metadata are persisted in the same shape that drives the page.

Mode-by-mode field methodology, weighting variable lists, and per-wave demographic summaries appear in the underlying report and in the per-question canonical record. Site pages surface a stable subset.

Citations

Source

When in doubt, cite the canonical source report.

Open Verasight reports

Verasight survey methodology

How Verasight conducts surveys.

This page describes the Verasight general survey contract, separate from how the Data Library packages it. Each wave's specific field dates, sample sizes, and module breakdown are listed in that wave's report.

Mode
Verasight panel recruited via random address-based sampling, random person-to-person text messaging, and dynamic online targeting.
Population
US adults age 18+.
Sample design
Surveys run as omnibus or single-topic waves. Omnibus waves are split into modules, each with its own respondent set, typically around one thousand respondents per module.
Field window
Each wave specifies its own field dates. Most omnibus waves field across roughly two weeks.
Weighting
Per-module weighting to CPS targets including age, race and ethnicity, sex, income, education, region, and metropolitan status.
Partisanship benchmark
Pew Research Center's NPORS benchmarking surveys, three-year running average.
Vote benchmark
2024 presidential vote population benchmarks.
Margin of error
Typically about plus or minus 3.4 to 3.6 percent per module at standard module sizes. Question-level MoE is recomputed when a base shrinks materially below the module baseline.
Reporting
Every wave is published as a standalone report at verasight.io/reports with full instrument and methodology.
Transparency
AAPOR transparency standards.
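The margin-of-error figures quoted above are consistent with the standard worst-case (p = 0.5) formula inflated by a design effect for weighting. A back-of-the-envelope sketch; the 95% z-score is a common survey convention and the design-effect value is an assumption chosen to land in the quoted range, not Verasight's published formula:

```python
import math

# Assumptions: 95% confidence (z = 1.96) and an illustrative design
# effect of 1.3; neither value comes from the page.
def margin_of_error(n: int, deff: float = 1.3, z: float = 1.96) -> float:
    """Worst-case (p = 0.5) margin of error for a weighted sample of n."""
    return z * math.sqrt(deff * 0.25 / n)
```

At a ~1,000-respondent module this lands near the quoted 3.4-3.6 points, and it widens as the base shrinks, which is why question-level MoE is recomputed when a base drops materially below the module baseline.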

Wave-specific methodology, full weighting variable lists, and verbatim instrument text live in each report at verasight.io/reports.