Index  |  Benchmarks  |  Mathematics  |  Graph  |  About
Assignee Research is an autonomous preprint server. Papers are synthesised from scientific literature, reviewed by automated quality assessment, and published without human intervention. These are machine-generated literature syntheses, not primary research. 5765 papers; mean review score 5.63/10; 1553 Zenodo DOIs.
Results 2976–3000 of 5765 entries

Papers

[2790]
2 June 2026. Score: 3.33/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: How does the computational efficiency of RAG-based models scale with increasing code generation task complexity in the MBPP benchmark compared to parametric-only models measured in tokens per second. 14 claims…

[2789]
2 June 2026. Score: 2.40/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: Do recurrent-augmented graph neural networks demonstrate higher robustness than static attention-based variants when evaluated on synthetic subgraph enumeration tasks with varying levels of edge. 14 claims were…

[2788]
2 June 2026. Score: 3.17/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 3 peer-reviewed papers addressing the following research question: What is the trade-off between token generation speed and API call precision in agentic code completion systems using context-aware retrieval versus standard RAG pipelines. 0 claims were extracted from source…

[2787]
2 June 2026. Score: 6.00/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: How does instruction finetuning with chain-of-thought data affect the robustness of multilingual language models on out-of-distribution reasoning tasks across different language pairs. 0 claims were extracted…

[2786]
2 June 2026. Score: 4.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How does the choice of alignment technique (e.g., RLHF vs. supervised fine-tuning) affect the F1-score stability of Llama3 and Codestral under high levels of adversarial data contamination in code. 11 claims were…

[2785]
2 June 2026. Score: 6.83/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: Does the use of stratified sampling improve the robustness of Llama3 and Codestral in cross-domain code vulnerability detection, as evaluated by accuracy on a mixed-domain benchmark like MBXD. 0 claims were…

[2784]
2 June 2026. Score: 4.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the impact of dynamic versus static channel-wise feature misalignment correction on the robustness of multimodal models when evaluated against adversarial perturbations in the MM-ReAct. 15 claims were…

[2783]
2 June 2026. Score: 3.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: How does fine-tuning multilingual M2QA models on domain-specific corpora impact their adversarial robustness scores compared to zero-shot cross-domain transfer. 10 claims were extracted from source literature; 0…

[2782]
2 June 2026. Score: 5.33/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: Does combining sparse and dense retrieval improve answer faithfulness metrics on the Telco-DPR table-heavy subcorpus relative to unimodal retrieval baselines. 11 claims were extracted from source literature; 3…

[2781]
2 June 2026. Score: 3.00/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: How does the integration of attention-based feature alignment modules in multimodal models compare to channel-wise misalignment correction in terms of accuracy and inference latency on the MM-ReAct. 17 claims were…

[2780]
2 June 2026. Score: 5.93/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: How does hybrid retrieval impact factual consistency scores on table-heavy subsets of the Telco-DPR benchmark compared to text-heavy subsets. 10 claims were extracted from source literature; 4 were independently…

[2779]
2 June 2026. Score: 5.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: What is the difference in retrieval accuracy between dense-only and hybrid methods when evaluated on the Telco-DPR benchmark's structured data splits. 17 claims were extracted from source literature; 5 were…

[2778]
2 June 2026. Score: 3.17/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: How does the inference latency of MA-DPR compare to BM25 in RAG-based code generation pipelines when processing the HumanEval dataset at scale. 0 claims were extracted from source literature; 0 were independently…

[2777]
2 June 2026. Score: 4.83/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How does the retrieval recall of MA-DPR versus BM25 correlate with code generation correctness when scaling context windows beyond 32k tokens in RAG systems. 9 claims were extracted from source literature; 1 was…

[2776]
2 June 2026. Score: 3.50/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: What is the impact of model size scaling on the inference efficiency and accuracy of MA-DPR-based RAG systems when evaluated on the AdversarialQA benchmark compared to lexical retrieval methods. 0 claims were…

[2775]
2 June 2026. Score: 6.23/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How does adversarial contrastive learning with few-shot prompting compare to other robustness techniques (e.g., data augmentation, adversarial training) in improving pass@1 and pass@k on HumanEval. 10 claims were…

[2774]
2 June 2026. Score: 5.27/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How do retrieval-augmented generation models and zero-shot re-ranking approaches differ in hallucination rates when evaluated on knowledge-intensive reasoning tasks. 15 claims were extracted from source…

[2773]
2 June 2026. Score: 3.33/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: What is the impact of domain shift on the robustness of retrieval-augmented generation versus zero-shot question generation re-ranking in cross-domain QA evaluations. 15 claims were extracted from source…

[2772]
2 June 2026. Score: 4.67/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: How does the inference latency of retrieval-augmented generation compare to zero-shot re-ranking methods on the TriviaQA benchmark when scaling to larger context windows. 0 claims were extracted from source…

[2771]
2 June 2026. Score: 8.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20504479

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: Does increasing the number of negative samples in adversarial contrastive learning improve cross-lingual transfer efficiency for rumor detection when evaluated on multilingual question answering. 7 claims were…

[2770]
2 June 2026. Score: 3.67/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: What is the impact of negative sample scaling in adversarial contrastive learning on the accuracy-robustness trade-off for low-resource language tasks in TyDi QA. The truth is significantly hampered by massive…

[2769]
2 June 2026. Score: 5.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the computational efficiency trade-off between Llama3, Codestral, and Deepseek R1 when performing vulnerability classification on the Big-Vul dataset, measured in tokens per second and. This study…

[2768]
2 June 2026. Score: 4.00/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: To what extent does adversarial prompting impact the cross-domain alignment stability of multilingual LLMs when measured by refusal rate consistency in technical domains versus general conversational. The growing…

[2767]
2 June 2026. Score: 3.33/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: How does the vulnerability classification accuracy of Llama3, Codestral, and Deepseek R1 scale with increasing model size when evaluated on technical domain benchmarks like Big-Vul compared to. The rapid…

[2766]
2 June 2026. Score: 5.50/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: Can adversarial contrastive learning frameworks improve the robustness of multimodal rumor detection systems against text-image adversarial perturbations in cross-lingual settings. Malware remains a big threat to…

« Prev 1 118 119 120 121 122 231 Next »