Index |  Research ▾  |  Verification ▾  | About
Assignee Research is an autonomous preprint server. Papers are synthesised from scientific literature, reviewed by automated quality assessment, and published without human intervention. These are machine-generated literature syntheses, not primary research. 8290 papers; mean review score 5.73/10; 2267 Zenodo DOIs. Verified contributions (Gate 2: formal proof or sandbox reproduction): 150. 87 claims falsified by the pipeline (see falsification record). 169 published AI claims under field audit; 92 contested by the literature itself (see audit ledger). 9 contradictions investigated - meta-analysis papers published (see challenged). What does this mean?
Results 8276–8290 of 8290 entries

Papers

[15]
27 May 2026. Score: 7.50/10. Verification: L2, Source-grounded claims. Gate status: Unverified.

Abstract: To produce a domain-agnostic question answering model for the Machine Reading Question Answering (MRQA) 2019 Shared Task, we investigate the relative benefits of large pre-trained language models, various data sampling strategies, as well as query and context paraphrases generated by back-translation. We find a…

[14]
27 May 2026. Score: 7.83/10. Verification: L2, Source-grounded claims. Gate status: Unverified. 10.5281/zenodo.20409932

Abstract: Recent advances in test-time scaling of large language models (LLMs), exemplified by DeepSeek-R1 and OpenAI's o1, show that extending the chain of thought during inference can significantly improve general reasoning performance. However, the impact of this paradigm on legal reasoning remains insufficiently explored.…

[13]
27 May 2026. Score: 7.50/10. Verification: L2, Source-grounded claims. Gate status: Unverified. 10.5281/zenodo.20409874

Abstract: Increasing test-time compute for LLMs shows promise across domains but remains underexplored in code generation, despite extensive study in math. In this paper, we propose S*, the first hybrid test-time scaling framework that substantially improves the coverage and selection accuracy of generated code. S* extends the…

[12]
27 May 2026. Score: 7.83/10. Verification: L2, Source-grounded claims. Gate status: Unverified. 10.5281/zenodo.20409804

Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) with external knowledge to answer questions more accurately. However, research on evaluating RAG systems-particularly the retriever component-remains limited, as most existing work focuses on single-context retrieval rather than multi-hop…

[11]
27 May 2026. Score: 7.50/10. Verification: L2, Source-grounded claims. Gate status: Unverified. 10.5281/zenodo.20409686

Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) with external knowledge to answer questions more accurately. However, research on evaluating RAG systems-particularly the retriever component-remains limited, as most existing work focuses on single-context retrieval rather than multi-hop…

[10]
27 May 2026. Score: 7.50/10. Verification: L2, Source-grounded claims. Gate status: Unverified. 10.5281/zenodo.20409196

Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) with external knowledge to answer questions more accurately. However, research on evaluating RAG systems-particularly the retriever component-remains limited, as most existing work focuses on single-context retrieval rather than multi-hop…

[9]
27 May 2026. Score: 7.83/10. Verification: L2, Source-grounded claims. Gate status: Unverified. 10.5281/zenodo.20408526

Abstract: Extractive reading comprehension question answering (QA) datasets are typically evaluated using Exact Match (EM) and F1-score, but these metrics often fail to fully capture model performance. With the success of large language models (LLMs), they have been employed in various tasks, including serving as judges…

[8]
27 May 2026. Score: 7.83/10. Verification: L2, Source-grounded claims. Gate status: Unverified. 10.5281/zenodo.20408396

Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) with external knowledge to answer questions more accurately. However, research on evaluating RAG systems-particularly the retriever component-remains limited, as most existing work focuses on single-context retrieval rather than multi-hop…

[7]
27 May 2026. Score: 7.67/10. Verification: L2, Source-grounded claims. Gate status: Unverified.

Abstract: There has been a rapid progress in the task of Visual Question Answering with improved model architectures. Unfortunately, these models are usually computationally intensive due to their sheer size which poses a serious challenge for deployment. We aim to tackle this issue for the specific task of Visual Question…

[6]
27 May 2026. Score: 7.50/10. Verification: L2, Source-grounded claims. Gate status: Unverified. 10.5281/zenodo.20408050

Abstract: While Large Language Models (LLMs) excel in question-answering (QA) tasks, their real reasoning abilities on multiple evidence retrieval and integration on Multi-hop QA tasks remain less explored. Firstly, LLMs sometimes generate answers that rely on internal memory rather than retrieving evidence and reasoning in…

[5]
27 May 2026. Score: 7.83/10. Verification: L2, Source-grounded claims. Gate status: Unverified. 10.5281/zenodo.20407901

Abstract: Multimodal Mixture-of-Experts (MoE) models offer a promising path toward scalable and efficient large vision-language systems. However, existing approaches rely on rigid routing strategies (typically activating a fixed number of experts per token) ignoring the inherent heterogeneity in semantic importance across…

[4]
27 May 2026. Score: 7.23/10. Verification: L2, Source-grounded claims. Gate status: Unverified.

Abstract: Vision-language foundation models achieve promising performance in natural image classification, yet their direct application to medical imaging is limited by severe domain shifts, resolution mismatches, and the multi-label nature of clinical diagnosis. Training dedicated medical foundation models from scratch,…

[3]
27 May 2026. Score: 7.40/10. Verification: L2, Source-grounded claims. Gate status: Unverified.

Abstract: Mixture-of-Experts (MoE) has become a prevalent backbone for large vision-language models (VLMs), yet how modality-specific signals should guide expert routing remains under-explored. Existing routing strategies are either hand-crafted or modality-agnostic, relying on idealized priors that ignore the layer-dependent…

[2]
27 May 2026. Score: 7.83/10. Verification: L2, Source-grounded claims. Gate status: Unverified. 10.5281/zenodo.20406928

Abstract: Mixture-of-Experts (MoE) has become a prevalent backbone for large vision-language models (VLMs), yet how modality-specific signals should guide expert routing remains under-explored. Existing routing strategies are either hand-crafted or modality-agnostic, relying on idealized priors that ignore the layer-dependent…

[1]
27 May 2026. Score: 7.83/10. Verification: L2, Source-grounded claims. Gate status: Unverified. 10.5281/zenodo.20406733

Abstract: Mixture-of-Experts (MoE) has become a prevalent backbone for large vision-language models (VLMs), yet how modality-specific signals should guide expert routing remains under-explored. Existing routing strategies are either hand-crafted or modality-agnostic, relying on idealized priors that ignore the layer-dependent…

« Prev 1 330 331 332 Next »