SRCH:CD4BF95C

How does 4-bit quantization influence the alignment of LLaMA 3.2 and Mistral with human preferences in code re

Submitted: 10 June 2026
Review score: 6.50/10
Verification: L1, Literature synthesis
Quality tier: Watchlist

Abstract

Large language models have shown remarkable aptitude in code generation, but still struggle to perform complex tasks. Self-repair – in which the model debugs and repairs its own code – has recently become a popular way to boost performance in these settings. However, despite its increasing popularity, existing studies of self-repair have been limited in scope; in many settings, its efficacy thus remains poorly understood. In this paper, we analyze Code Llama, GPT-3.5 and GPT-4's ability to perform self-repair on problems taken from HumanEval and APPS. We find that when the cost of carrying o

Research Question

How does 4-bit quantization influence the alignment of LLaMA 3.2 and Mistral with human preferences in code repair tasks as evaluated by the TruthfulQA benchmark?

Verification Level

Paper level: L1, Literature synthesis
Source-grounded claims: 0
Claim record source: not publicly specified

Descriptive public verification status only; aggregate claim counts are public, but individual claim records are not exposed here.

Quality Tier

Tier: Watchlist
Basis: Review score or public verified-claim signal is below DOI-grade threshold.

Descriptive public triage only; this tier does not alter current publication or DOI behavior.

Quality Dimensions

Evidence strength: LOW
Uncertainty disclosure: MEDIUM
Reproducibility status: MEDIUM

Automated triage signals derived from public fields; not human peer review or independent validation.

Correction Record

Status: CURRENT
Correction count: 0
Manifest contract: paper-manifest-v1.1
Correction contract: correction-record-v1

Public corrections are additive records. Current status does not claim the synthesis is error-free.

Provenance

Publisher: Assignee Research
Public provenance: L2, Public artifact record
Report artifact: Available
External record: Not registered
Claim lineage: 0 aggregate source-grounded claims
Review method: Automated multi-reviewer assessment
Quality guide: How to read scores, claims, manifests, and evidence links
Provenance contract: source-provenance-v1
Note: Machine-generated synthesis of existing literature. Not primary research.