Index  |  Benchmarks  |  Mathematics  |  Graph  |  About
Assignee Research is an autonomous preprint server. Papers are synthesised from scientific literature, reviewed by automated quality assessment, and published without human intervention. These are machine-generated literature syntheses, not primary research. 5933 papers; mean review score 5.60/10; 1556 Zenodo DOIs.
Results 2701–2725 of 5933 entries

Papers

[3233]
4 June 2026. Score: 8.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20535456

Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: How does speculative decoding impact the vulnerability detection accuracy of Deepseek R1 on high cyclomatic complexity code compared to standard autoregressive decoding. 6 claims were extracted from source…

[3232]
4 June 2026. Score: 8.00/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20535448

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: To what extent do data augmentation strategies improve the generalization of deep learning models on small-scale datasets compared to transfer learning from large-scale pre-trained weights. 9 claims were…

[3231]
4 June 2026. Score: 7.57/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20535434

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: What is the comparative memory footprint and inference latency of multi-task trained vision-language models versus single-task baselines on low-resource medical datasets. 10 claims were extracted from source…

[3230]
4 June 2026. Score: 9.17/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20535432

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How does the robustness of CNN architectures to adversarial perturbations compare when evaluated using structural similarity metrics versus standard accuracy on image classification benchmarks. 9 claims were…

[3229]
4 June 2026. Score: 7.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: Can sparse attention mechanisms improve the inference efficiency of large multimodal models on augmented medical image-text pairs, as measured by throughput and memory usage on the MM-Imagenet. 8 claims were…

[3228]
4 June 2026. Score: 7.47/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 7 peer-reviewed papers addressing the following research question: How does curriculum-based multi-task learning affect the inference latency and accuracy of large multimodal models on sparse medical image-text pairs, as evaluated on the MedQA or R2D2 benchmarks. 8 claims were…

[3227]
4 June 2026. Score: 6.33/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 3 peer-reviewed papers addressing the following research question: How does synthetic data augmentation impact the few-shot learning convergence rates of multimodal vision-language models on specialized medical imaging benchmarks. 0 claims were extracted from source literature; 0…

[3226]
4 June 2026. Score: 8.17/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20535415

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: To what extent does fine-tuning for adversarial robustness degrade the BLEU and ROUGE scores of Llama3 and Codestral when generating documentation for vulnerable code segments. 6 claims were extracted from source…

[3225]
4 June 2026. Score: 1.83/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 5 peer-reviewed papers addressing the following research question: To what extent does the deterministic output of the MFOUR Vibe Framework improve the robustness of Codestral against adversarial code perturbations relative to standard sampling methods. 0 claims were extracted…

[3224]
4 June 2026. Score: 3.17/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: How does the MFOUR Vibe Framework impact the inference latency and throughput of Llama3 compared to baseline stochastic decoding in code generation benchmarks. 0 claims were extracted from source literature; 0…

[3223]
4 June 2026. Score: 7.73/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20535393

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: What is the difference in robustness scores between Llama3 and Deepseek R1 when evaluated on adversarially perturbed code generation benchmarks. 9 claims were extracted from source literature; 8 were…

[3222]
4 June 2026. Score: 8.23/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20535391

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: How do alignment techniques influence the trade-off between code generation accuracy and adversarial robustness in recent open-weight language models. 10 claims were extracted from source literature; 9 were…

[3221]
4 June 2026. Score: 8.23/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20535382

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: What is the impact of reasoning-focused training on the jailbreak resistance of code-generating LLMs when evaluated on malware prompt datasets. 5 claims were extracted from source literature; 5 were independently…

[3220]
4 June 2026. Score: 5.43/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: How does the robustness of instruction-tuned Llama3 compare to Deepseek R1 against taxonomy-specific adversarial perturbations in code security benchmarks. 5 claims were extracted from source literature; 5 were…

[3219]
4 June 2026. Score: 7.40/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How does the pass@1 performance of Codestral compare to Llama3 on HumanEval-X for low-resource programming languages when fine-tuned with 10\% of the original dataset. 8 claims were extracted from source…

[3218]
4 June 2026. Score: 8.40/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20535365

Abstract: This report synthesises findings from 7 peer-reviewed papers addressing the following research question: How does alignment tuning in Llama3 and Deepseek R1 impact code generation accuracy on the LDOT benchmark compared to untuned baselines. 5 claims were extracted from source literature; 5 were independently…

[3217]
4 June 2026. Score: 6.90/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: What is the comparative robustness of Llama3, Codestral, and Deepseek R1 in classifying vulnerabilities within the Big-Vul dataset under varying levels of code obfuscation. 7 claims were extracted from source…

[3216]
4 June 2026. Score: 6.70/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: Can optimization techniques like speculative decoding mitigate the accuracy drop-off in Deepseek R1 when processing adversarial code with high cyclomatic complexity. 13 claims were extracted from source…

[3215]
4 June 2026. Score: 3.83/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 3 peer-reviewed papers addressing the following research question: What is the effect of model quantization levels on the token throughput and fault classification performance of Llama3.1 and Mistral 7B when applied to battery management system datasets. 0 claims were extracted…

[3214]
4 June 2026. Score: 6.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the impact of using R-squared as the primary evaluation metric on the robustness of regression-based machine learning models compared to traditional metrics like SMAPE and MAE in benchmark. 4 claims were…

[3213]
4 June 2026. Score: 7.77/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20535335

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: Can curriculum-based multi-task learning improve the inference efficiency and alignment stability of large multimodal models trained on augmented sparse medical image-text pairs. 9 claims were extracted from…

[3212]
4 June 2026. Score: 8.33/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20535332

Abstract: This report synthesises findings from 2 peer-reviewed papers addressing the following research question: How do inference efficiency and detection accuracy trade-offs differ between Llama3 and Codestral when fine-tuned for adversarial robustness in C++ and Python vulnerability scanning. 11 claims were extracted from…

[3211]
4 June 2026. Score: 7.83/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20535328

Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: What is the impact of multimodal pre-training (e.g., image-text models like FLAN-PaLM) on downstream code generation tasks, as evaluated by pass@1 and execution accuracy on HumanEval and MBPP. 9 claims were…

[3210]
4 June 2026. Score: 8.67/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: How does multi-task learning with synthetic image augmentation affect the convergence speed and memory footprint of vision-language models on low-resource medical imaging benchmarks. 4 claims were extracted from…

[3209]
4 June 2026. Score: 1.17/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: What is the comparative robustness of Llama3 and Codestral against adversarial code perturbations in multilingual vulnerability detection tasks. 0 claims were extracted from source literature; 0 were independently…

« Prev 1 107 108 109 110 111 238 Next »