Schemas
Purpose
Schemas define the public JSON structure used by artifact manifests and benchmark evidence records. They are intended for indexing, citation tooling, reproducible audits, and external readers that need stable field names.
Available Schemas
| # | Schema | Description |
| --- | --- | --- |
| 1 | Paper manifest v1 | JSON Schema for per-paper artifact manifests, including work metadata, public artifacts, quality assessment, provenance, and limitations. |
| 2 | Benchmark evidence v1 | JSON Schema for model-benchmark score evidence clusters, including reported ranges, spread, severity, source coverage, source profile, entries, interpretation, and limitations. |
| 3 | Quality dimensions v1 | JSON Schema for multidimensional quality assessment records, including evidence strength, citation grounding, numerical consistency, benchmark validity, novelty, uncertainty disclosure, reproducibility status, and safety/impact. |
| 4 | Quality tiers v1 | JSON Schema for descriptive public quality tier records, including tier, basis, inputs, thresholds, enforcement note, and limitations. |
| 5 | Review monitor v1 | JSON Schema for public re-review trigger records, including monitor state, active triggers, monitored triggers, enforcement note, and limitations. |
| 6 | Verification levels v1 | JSON Schema for paper-level public verification status, including level, basis, claim summary, unsupported upper levels, enforcement note, and limitations. |
| 7 | Math result v1 | JSON Schema for public mathematical result manifests, including verification state, computational budget summary, proof-source availability, artifacts, interpretation, and limitations. |
| 8 | Source provenance v1 | JSON Schema for public paper provenance records, including artifact inspectability, external archival record state, aggregate claim lineage, source-list boundary, interpretation, and limitations. |
| 9 | Correction record v1 | JSON Schema for public correction and manifest-version records, including correction state, additive correction events, superseded manifest references, policy notes, interpretation, and limitations. |
| 10 | API envelope v1 | JSON Schema for additive API envelope responses, including contract metadata, query echo, pagination, links, records, and compatibility notes. |
| 11 | Status record v1 | JSON Schema for the public status surface, including public counts, contract versions, public routes, data change policy, verification policy, and limitations. |
| 12 | Public export v1 | JSON Schema for public dataset export records, including dataset identifier, media type, download URL, fields, limits, update policy, privacy boundary, and limitations. |
Quality Dimension Contract
Quality dimensions separate public assessment into inspectable signals instead of relying on a single aggregate score. Dimension fields are designed for triage and auditability; they do not replace source papers, independent replication, or domain expert review.
Paper manifests may include a quality_dimensions object derived
conservatively from public manifest fields. Dimensions marked
UNASSESSED mean no public dimension-specific scorer is attached;
they are not negative findings.
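As a rough illustration of how a client might honor this rule, the sketch below partitions a quality_dimensions object into assessed and unassessed entries instead of treating UNASSESSED as a low score. The per-dimension status key and the sample values are assumptions made for illustration; the published schema defines the actual field layout.

```python
UNASSESSED = "UNASSESSED"


def split_dimensions(quality_dimensions):
    """Partition dimension records into assessed records and unassessed names.

    Assumes (hypothetically) that each dimension maps to a dict with a
    "status" key; adjust to the real schema before use.
    """
    assessed = {}
    unassessed = []
    for name, record in quality_dimensions.items():
        if record.get("status") == UNASSESSED:
            # No public scorer attached: record the gap, do not penalize it.
            unassessed.append(name)
        else:
            assessed[name] = record
    return assessed, sorted(unassessed)


# Hypothetical sample object, not taken from a real manifest.
example = {
    "evidence_strength": {"status": "ASSESSED", "value": "moderate"},
    "novelty": {"status": "UNASSESSED"},
}
assessed, unassessed = split_dimensions(example)
```

Keeping the unassessed names in a separate list lets downstream tooling report them as gaps rather than folding them into any ranking.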
Quality Tier Contract
Paper manifests include a quality_tier object that maps public
aggregate signals into one of five descriptive inspection bands:
FLAGSHIP_CANDIDATE, DOI_GRADE,
PUBLIC_RECORD, WATCHLIST, or
QUARANTINE_CANDIDATE.
Quality tiers are generated from public manifest fields such as review score, verified claim count, artifact types, and reproducibility level. They are descriptive public triage labels and do not alter current publication or DOI behavior.
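A client that consumes these labels might model the five bands as a closed enumeration, as in the minimal sketch below. The band names come from this page; the assumption that the label lives under a tier key of the quality_tier object, and the choice to return None for unknown labels, are illustrative only.

```python
from enum import Enum


class QualityTier(Enum):
    """The five descriptive inspection bands listed on this page."""

    FLAGSHIP_CANDIDATE = "FLAGSHIP_CANDIDATE"
    DOI_GRADE = "DOI_GRADE"
    PUBLIC_RECORD = "PUBLIC_RECORD"
    WATCHLIST = "WATCHLIST"
    QUARANTINE_CANDIDATE = "QUARANTINE_CANDIDATE"


def parse_tier(quality_tier):
    """Return the QualityTier for a quality_tier object, or None if unrecognized.

    The "tier" key is an assumed field name; check the schema for the real one.
    """
    try:
        return QualityTier(quality_tier.get("tier"))
    except ValueError:
        # Unknown or missing label: treat as unrecognized rather than guessing.
        return None


known = parse_tier({"tier": "WATCHLIST"})
unknown = parse_tier({"tier": "SOMETHING_ELSE"})
```

Returning None for an unrecognized label keeps the client tolerant of labels it does not know, since the tiers are descriptive and do not gate any behavior.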
Review Monitor Contract
Paper manifests include a review_monitor object that records
public re-review trigger state. The current public states are
NO_PUBLIC_TRIGGER and REVIEW_RECOMMENDED.
The monitor contract defines trigger names for benchmark contradictions, source
availability, extraction-signal changes, corrected manifests, and low quality
tiers. Triggers marked NOT_EVALUATED_PUBLICLY are not public
findings; they indicate that no public trigger signal is attached to the
manifest.
Verification Level Contract
Paper manifests include a verification_status object that reports
the highest paper-level public evidence status currently supported by manifest
fields. The contract uses levels L0 through L7.
The public manifest does not claim sandbox execution, independent reproduction, external review, or formal verification unless supporting public artifacts or event records are attached. Unsupported upper levels are listed explicitly so readers do not infer stronger evidence than the manifest provides.
The claim_summary field exposes aggregate public counts such as
source_grounded_claims, aggregate_claim_records, and
per_claim_public_record_count. Current public manifests do not
expose individual claim text.
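The sketch below shows one way a reader might pull the aggregate counts out of a verification_status object and turn a level label into a comparable rank. The three count names are taken from this page; the level key, the "L3"-style string format, and the sample numbers are assumptions for illustration.

```python
import re

CLAIM_COUNT_FIELDS = (
    "source_grounded_claims",
    "aggregate_claim_records",
    "per_claim_public_record_count",
)


def read_claim_counts(verification_status):
    """Return the aggregate counts, using None for any count that is absent.

    None keeps "not reported" distinct from a reported count of zero.
    """
    summary = verification_status.get("claim_summary", {})
    return {field: summary.get(field) for field in CLAIM_COUNT_FIELDS}


def level_rank(level):
    """Convert a label such as "L3" to the integer 3 (assumed label format)."""
    match = re.fullmatch(r"L([0-7])", level)
    if match is None:
        raise ValueError(f"not a level in L0..L7: {level!r}")
    return int(match.group(1))


# Hypothetical sample object, not taken from a real manifest.
status = {
    "level": "L2",
    "claim_summary": {"source_grounded_claims": 12, "aggregate_claim_records": 1},
}
counts = read_claim_counts(status)
rank = level_rank(status["level"])
```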
Source Provenance Contract
Paper manifests include a source_provenance object using the
source-provenance-v1 contract. The contract separates public
metadata, generated report artifacts, aggregate claim-lineage records, and
external archival records into explicit provenance levels.
The current public paper provenance contract intentionally exposes aggregate source-grounding signals, artifact availability, and DOI or external-record state. It does not expose private extraction traces, local file paths, individual claim text, or a full cited-source list unless such a list is attached as a public artifact in a future contract.
Correction Record Contract
Paper manifests include a correction_record object using the
correction-record-v1 contract. The public states are
CURRENT, CORRECTED, and SUPERSEDED.
The contract also exposes manifest contract_version and
contract_updated fields so clients can detect public contract
drift without scraping HTML.
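A drift check of this kind can be as small as comparing the two fields against values the client pinned when it was last validated. The following is a minimal sketch under that assumption; the version string and dates are placeholders, since this page does not state the actual value formats.

```python
def detect_contract_drift(record, pinned_version, pinned_updated):
    """Return a list describing how observed contract metadata differs from pinned values."""
    drift = []
    for field, pinned in (
        ("contract_version", pinned_version),
        ("contract_updated", pinned_updated),
    ):
        observed = record.get(field)
        if observed != pinned:
            drift.append(f"{field}: pinned {pinned!r}, observed {observed!r}")
    return drift


# Placeholder values; real manifests define their own version and date formats.
manifest = {"contract_version": "example-v1", "contract_updated": "2000-01-02"}
unchanged = detect_contract_drift(manifest, "example-v1", "2000-01-02")
changed = detect_contract_drift(manifest, "example-v1", "2000-01-01")
```

An empty list means the pinned contract still matches; any entry is a signal to re-read the schema before trusting field semantics.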
Corrections are additive public records. A work marked CURRENT
has no public correction event attached; this is not a claim that the synthesis
is error-free. Private review notes, local storage paths, and operational logs
are intentionally omitted from correction records.
API Envelope Contract
The api-envelope-v1 contract provides stable machine-readable
metadata for newer public API surfaces. Envelope responses include
contract_version, contract_updated, query echo,
offset pagination, navigation links, and a records array.
Legacy list endpoints such as /api/v1/papers keep their existing
list response shape for compatibility. New clients should prefer
/api/v1/papers/envelope and can read
/api/v1/api-contract for endpoint-level compatibility metadata.
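To make the offset pagination concrete, the sketch below walks successive envelope pages through a caller-supplied fetch function and stops when a page comes back short. The records array is named on this page; the offset and limit parameters passed to the fetch function, and the stop-on-short-page rule, are assumptions, so a real client should follow the pagination and links fields the envelope actually returns.

```python
def iter_envelope_records(fetch_page, limit=50):
    """Yield records from successive envelope pages.

    fetch_page(offset, limit) is a hypothetical callable that returns one
    parsed envelope (a dict with a "records" list). Iteration stops at the
    first page holding fewer than `limit` records.
    """
    offset = 0
    while True:
        envelope = fetch_page(offset, limit)
        records = envelope.get("records", [])
        yield from records
        if len(records) < limit:
            return
        offset += len(records)


# In-memory stand-in for an HTTP call, used only to exercise the loop.
DATA = [{"id": i} for i in range(5)]


def fake_fetch(offset, limit):
    return {"records": DATA[offset:offset + limit]}


ids = [record["id"] for record in iter_envelope_records(fake_fetch, limit=2)]
```

Passing the fetch function in keeps the paging logic independent of any particular HTTP library.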
Status Record Contract
The status-record-v1 contract powers /status and
/api/v1/status. It reports public data counts, public contract
versions, public routes, and change policy without exposing private
infrastructure or operational logs.
Public Export Contract
The public-export-v1 contract powers /exports,
/api/v1/exports, and public JSONL dataset downloads such as
/api/v1/exports/papers.jsonl. Export records are public metadata
records suitable for indexing, citation tooling, and reproducible audits.
Exports are additive public data surfaces. They do not expose private operational logs, local storage paths, credentials, or internal worker state.
Math Result Contract
Mathematical result manifests use the math-result-v1 contract.
The contract distinguishes four verification states: FALSIFIED,
COMPUTATIONAL_EVIDENCE, FORMAL_PROOF_ATTEMPTED, and
FORMAL_PROOF_VERIFIED.
Computational evidence is not proof. Public Lean 4 source is linked only when the manifest can support a formal verification claim. Python check code, local file paths, and private execution logs are intentionally omitted from public math manifests.
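A consumer that must not overstate results could encode the rule that computational evidence is not proof as an explicit check, as in this minimal sketch. The four state names come from this page; the helper name is hypothetical, and reading FORMAL_PROOF_VERIFIED as the only state that supports a proof claim is this sketch's interpretation of the text above.

```python
MATH_STATES = (
    "FALSIFIED",
    "COMPUTATIONAL_EVIDENCE",
    "FORMAL_PROOF_ATTEMPTED",
    "FORMAL_PROOF_VERIFIED",
)


def counts_as_proof(state):
    """Return True only for a verified formal proof; reject unknown states."""
    if state not in MATH_STATES:
        raise ValueError(f"unknown math result state: {state!r}")
    return state == "FORMAL_PROOF_VERIFIED"


proof_flags = {state: counts_as_proof(state) for state in MATH_STATES}
```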
Benchmark Evidence Contract
Benchmark evidence records are audit surfaces, not verdicts. The
source_coverage field reports how many extracted records and
distinct public sources support the cluster. The source_profile
field reports public source URL coverage, source domains, and available
publication-year bounds. These fields describe source breadth and provenance;
they do not assert correctness or independent replication.
The benchmark evidence index API at /api/v1/benchmark-evidence
supports additive public filters: limit, min_spread, severity, and
either coverage or coverage_level. It also supports public source
filters: source_domain, year_min, and year_max.
Year filters match clusters whose public source-year range overlaps the
requested bounds. Filters operate only on public evidence-cluster metadata and
preserve the default unfiltered response shape.
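The filter behavior described above can be sketched as a query builder plus the range-overlap test used for year filters. The filter names and endpoint path come from this page; treating the year bounds as inclusive and the numeric sample values are assumptions. Allowed enum values should be read from the metadata endpoint rather than hard-coded.

```python
from urllib.parse import urlencode

BASE_PATH = "/api/v1/benchmark-evidence"
ALLOWED_FILTERS = {
    "limit", "min_spread", "severity", "coverage", "coverage_level",
    "source_domain", "year_min", "year_max",
}


def build_query(**filters):
    """Build a filtered index URL path; no filters yields the default path."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    params = {name: value for name, value in filters.items() if value is not None}
    return f"{BASE_PATH}?{urlencode(params)}" if params else BASE_PATH


def years_overlap(cluster_min, cluster_max, year_min=None, year_max=None):
    """Return True when [cluster_min, cluster_max] overlaps the requested bounds.

    Bounds are treated as inclusive here, which is an assumption.
    """
    if year_min is not None and cluster_max < year_min:
        return False
    if year_max is not None and cluster_min > year_max:
        return False
    return True


url = build_query(min_spread=5, year_min=2023)
```

Under this reading, a cluster whose sources span 2019 to 2021 matches year_min=2021 but not year_min=2022, because overlap needs only one shared year.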
Machine clients can read /api/v1/benchmark-evidence/meta for the
current public filter contract, allowed enum values, response fields, and
example query URLs. Benchmark evidence JSON and metadata expose
contract_version and contract_updated so clients can
detect public contract drift without scraping HTML.
Stability
Versioned schema URLs are stable public contracts. Additive fields may appear in future records, but incompatible structure changes should use a new versioned schema URL.
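Because additive fields may appear at any time, a client written against a given schema version is safest when it reads only the fields it knows and ignores the rest. The sketch below shows one such tolerant reader; the function name and the sample field names are hypothetical.

```python
def read_known_fields(record, known_fields):
    """Split a record into the fields this client understands and the extras.

    Extras are reported, not rejected, so additive fields never break parsing.
    """
    known = {name: record[name] for name in known_fields if name in record}
    extra = sorted(set(record) - set(known_fields))
    return known, extra


# Hypothetical record containing one field the client does not know yet.
record = {"title": "Example", "limitations": [], "future_field": 1}
known, extra = read_known_fields(record, ("title", "limitations"))
```

Logging the extra names is a cheap way to notice additive changes without treating them as errors.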
Privacy Boundary
Public schemas describe only public artifact fields. They do not expose private infrastructure, local storage paths, non-public operational metadata, or private credentials.