Methodology
Purpose
Assignee Research publishes machine-assisted literature syntheses, benchmark intelligence, and selected machine-verifiable research artifacts. The public record is designed to be inspectable: each output should make clear what was claimed, where the claim came from, how it was assessed, and what limitations remain.
The literature synthesis pipeline does not run new experiments or create new datasets. Its reports summarize and compare existing published work. Mathematical reports are presented separately because their status depends on computational search or formal verification rather than literature synthesis.
Literature Synthesis Workflow
| 1 | Question selection. A specific research question is selected from the active research agenda. Broad or redundant questions are de-prioritized. |
| 2 | Source retrieval. The system searches public scientific indexes and collects candidate papers relevant to the question. |
| 3 | Reading and extraction. When full text is available, the system uses sections and tables in addition to abstracts. Candidate claims and benchmark scores are extracted from source material. |
| 4 | Grounding check. Claims are compared against retrieved sources. Claims that cannot be grounded are excluded from publication or treated as lower-confidence material. |
| 5 | Quality assessment. Drafts are evaluated for source coverage, consistency, evidence strength, uncertainty disclosure, and publication suitability. |
| 6 | Artifact generation. Approved reports are rendered as public pages, citation exports, and, when available, PDF artifacts and external records. |
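The grounding check in step 4 can be illustrated with a minimal sketch. The page does not say how grounding is implemented, so everything here is an assumption made for illustration: the `Claim` fields, the `partition_claims` helper, and the rule that a claim counts as grounded when its supporting passage appears in the retrieved source text after whitespace and case normalization.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    text: str         # the extracted claim
    source_id: str    # hypothetical identifier of the cited source
    quoted_span: str  # hypothetical: the passage the claim is said to rest on


def normalize(s):
    """Lowercase and collapse whitespace so layout differences do not matter."""
    return " ".join(s.lower().split())


def partition_claims(claims, sources):
    """Split claims into (grounded, ungrounded).

    `sources` maps source_id -> retrieved text. A claim is treated as grounded
    only if its quoted span occurs in the text of the source it cites. This
    substring rule is an assumption, not the published method.
    """
    grounded, ungrounded = [], []
    for claim in claims:
        source_text = normalize(sources.get(claim.source_id, ""))
        span = normalize(claim.quoted_span)
        if span and span in source_text:
            grounded.append(claim)
        else:
            ungrounded.append(claim)
    return grounded, ungrounded
```

Under this sketch, the ungrounded list corresponds to the material that step 4 excludes from publication or treats as lower-confidence. A substring match is deliberately strict, and a production system would likely need something more tolerant of paraphrase.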
Quality Assessment
Quality scores are automated assessments of source grounding, internal consistency, and evidence coverage. They are not human peer review and should not be read as endorsement by domain experts. A higher score indicates only that a report passed the system's internal checks more strongly than a lower-scored report.
The score is useful for triage, but the source papers remain authoritative. Readers should cite original sources for specific factual claims and cite Assignee Research only when referring to the synthesis, comparison, or benchmark audit itself.
Benchmark Discrepancy Detection
Benchmark pages aggregate model-performance claims extracted from papers. A discrepancy is flagged when different sources report divergent scores for the same model and benchmark. A spread of at least three percentage points is treated as notable; larger spreads are given higher severity.
Discrepancies are not automatically accusations of error. They may arise from different prompts, dataset versions, evaluation protocols, scoring rules, preprocessing, fine-tuning, or reporting conventions. The purpose of the tracker is to make ambiguity visible and auditable.
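The three-point rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tracker's actual code: `ScoreClaim` and `find_discrepancies` are hypothetical names, spread is assumed to mean the highest reported score minus the lowest, and severity is represented only by sorting larger spreads first, since the page does not define severity tiers.

```python
from collections import defaultdict
from dataclasses import dataclass

# Threshold stated in the text: at least three percentage points.
NOTABLE_SPREAD_PP = 3.0


@dataclass(frozen=True)
class ScoreClaim:
    model: str
    benchmark: str
    score: float  # percentage on a 0-100 scale (assumed)
    source: str   # hypothetical source identifier


def find_discrepancies(claims):
    """Flag (model, benchmark) pairs whose reported scores diverge notably."""
    grouped = defaultdict(list)
    for claim in claims:
        grouped[(claim.model, claim.benchmark)].append(claim)

    flagged = []
    for (model, benchmark), group in grouped.items():
        sources = {c.source for c in group}
        if len(sources) < 2:
            continue  # a discrepancy needs at least two different sources
        scores = [c.score for c in group]
        spread = max(scores) - min(scores)  # assumed definition of spread
        if spread >= NOTABLE_SPREAD_PP:
            flagged.append({
                "model": model,
                "benchmark": benchmark,
                "spread": spread,
                "sources": sorted(sources),
            })
    # Larger spreads come first, standing in for higher severity.
    return sorted(flagged, key=lambda d: d["spread"], reverse=True)
```

A flagged entry records only that the sources disagree. Deciding whether the cause is a different prompt, dataset version, or scoring rule still requires reading the sources.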
Mathematical Results
Mathematical outputs are separated from literature syntheses. A formal proof is treated differently from computational evidence. Computational evidence means that a search found no counterexample within the stated conditions; it is not a proof. Formal claims require machine-checkable proof artifacts before they are presented as proven.
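The difference between computational evidence and proof can be made concrete with a small bounded search. The example statement (every even integer from 4 up to a limit is the sum of two primes) is chosen here purely for illustration and is not an Assignee Research result; the function names and status strings are likewise hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small bounds."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def find_counterexample(limit):
    """Return the first even n in [4, limit] that is not a sum of two primes.

    Returns None if the search finds no such n within the bound.
    """
    for n in range(4, limit + 1, 2):
        has_pair = any(
            is_prime(p) and is_prime(n - p) for p in range(2, n // 2 + 1)
        )
        if not has_pair:
            return n
    return None


def report_status(limit):
    """Describe the outcome without overstating it."""
    counterexample = find_counterexample(limit)
    if counterexample is not None:
        return f"refuted: counterexample {counterexample}"
    # No counterexample within the bound is evidence only, never a proof.
    return f"computational evidence: no counterexample for even n in [4, {limit}]"
```

The status string always carries the bound that was searched, which mirrors the requirement that computational evidence be reported "within the stated conditions". Moving a statement to "proven" would require a separate machine-checkable proof artifact, which no search of this kind can supply.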
Limitations
| L-1 | Automated inference can be wrong. Retrieval, extraction, scoring, and summarization can miss sources, misread tables, or overstate agreement. |
| L-2 | Coverage is incomplete. The system can only reason over material it retrieves and parses successfully. |
| L-3 | Benchmark comparability is fragile. Scores with the same benchmark name may still differ because the underlying protocol differs. |
| L-4 | Public pages are summaries. The source papers and cited artifacts should be consulted before relying on any scientific conclusion. |
Corrections
Correction requests should include the affected URL, the exact claim or score, the reason it appears wrong, and a source that supports the correction. Send requests to contact@assignee.net. Substantive corrections may result in updated public text, hidden or withdrawn entries, changed discrepancy status, or a note in the methodology changelog.
Methodology Changelog
| 2026-05-31 | Published public methodology page, correction policy, route verification, and public leak checks for the web surface. |