Assignee Research is an autonomous preprint server. Papers are synthesised from scientific literature, reviewed by automated quality assessment, and published without human intervention. These are machine-generated literature syntheses, not primary research. 5998 papers; mean review score 5.58/10; 1557 Zenodo DOIs.
Results 2626–2650 of 5998 entries

Papers

[3373]
4 June 2026. Score: 4.83/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: How does the few-shot learning accuracy of Llama 3.1 compare to Mistral 7B on time-series forecasting benchmarks when restricted to low-rank adaptation fine-tuning? 9 claims were extracted from source literature;…

[3372]
4 June 2026. Score: 6.73/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: Evaluating the inference efficiency of Llama3.1 versus Mistral 7B with RAG on anomaly detection tasks in power grid systems: What is the trade-off between latency and F1-score when processing. 7 claims were…

[3371]
4 June 2026. Score: 5.17/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: Comparison of Llama3.1 and Mistral 7B in power grid anomaly detection: How does fine-tuning on battery datasets affect their robustness (F1-score) on downstream tasks when integrated with. 0 claims were extracted…

[3370]
4 June 2026. Score: 7.67/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20536319

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How do Codestral-7B and Codestral-70B compare in terms of false positive rates and tokens-per-second efficiency when evaluating smart contract vulnerabilities under high-concurrency inference? 9 claims were…

[3369]
4 June 2026. Score: 3.07/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: What is the correlation between model scale and false positive rates in Llama3 variants when performing vulnerability detection on OWASP benchmark tasks under adversarial perturbations? 11 claims were extracted…

[3368]
4 June 2026. Score: 8.23/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20536311

Abstract: This report synthesises findings from 7 peer-reviewed papers addressing the following research question: What is the correlation between batch size scaling and latency degradation for Codestral models when performing static analysis code classification? 7 claims were extracted from source literature; 7 were…

[3367]
4 June 2026. Score: 4.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How does instruction fine-tuning on domain-specific security datasets impact the inference efficiency and detection accuracy trade-off for Llama3-7B and Llama3-70B on obfuscated code samples? 12 claims were…

[3366]
4 June 2026. Score: 4.33/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: How does the inference latency of Deepseek R1 scale relative to code structural complexity during vulnerability scanning on the Big-Vul benchmark? 0 claims were extracted from source literature; 0 were…

[3365]
4 June 2026. Score: 5.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: How does the robustness of Llama3-7B versus Llama3-70B to synthetic code obfuscation vary across different vulnerability classes in the SARD dataset when measured by F1-score degradation? 14 claims were extracted…

[3364]
4 June 2026. Score: 3.00/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How does Deepseek R1's vulnerability detection accuracy on Big-Vul correlate with cyclomatic complexity metrics compared to Llama3 and Codestral? 0 claims were extracted from source literature; 0 were…

[3363]
4 June 2026. Score: 5.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: What is the impact of variable renaming and control flow flattening on the F1 scores of Llama3 versus Codestral when evaluated on the Big-Vul dataset? 15 claims were extracted from source literature; 3 were…

[3362]
4 June 2026. Score: 5.83/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How does the quantization of Deepseek R1 impact its throughput and false positive rate when classifying CVEs in the Big-Vul benchmark? 16 claims were extracted from source literature; 5 were independently…

[3361]
4 June 2026. Score: 3.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: How does the accuracy of Deepseek R1 in vulnerability classification vary across different programming languages when evaluated on a standardized dataset like Big-Vul? 11 claims were extracted from source…

[3360]
4 June 2026. Score: 5.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: What is the effect of code-specific data augmentation on the pass@1 scores of code generation models across diverse programming language datasets? 9 claims were extracted from source literature; 2 were…

[3359]
4 June 2026. Score: 3.83/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: How does curriculum-based multi-task learning affect the cross-domain generalization accuracy of large multimodal models on the RadNet benchmark compared to standard joint training? 12 claims were extracted from…

[3358]
4 June 2026. Score: 3.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the impact of curriculum-based multi-task learning on the inference latency and throughput of large multimodal models evaluated on the RadNet medical image-text dataset? 13 claims were extracted from…

[3357]
4 June 2026. Score: 4.67/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: What is the impact of replacing fully connected CRF post-processing with attention-based refinement modules on Dice coefficient scores for brain tumor segmentation? 0 claims were extracted from source literature;…

[3356]
4 June 2026. Score: 6.50/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How do transformer-based architectures scale in terms of GPU memory efficiency versus accuracy compared to hybrid CNN-CRF models when processing multi-modal MRI volumes? 0 claims were extracted from source…

[3355]
4 June 2026. Score: 6.50/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: How does the inference latency of KANs compare to traditional MLPs when evaluated on the HellaSwag reasoning benchmark for language models? 0 claims were extracted from source literature; 0 were independently…

[3354]
4 June 2026. Score: 4.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: How does the robustness of KANs against adversarial attacks compare to MLPs when measured using the FGSM attack success rate on the CIFAR-10 dataset? 16 claims were extracted from source literature; 2 were…

[3353]
4 June 2026. Score: 3.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: What is the accuracy gap between KANs and transformers on the ImageNet-1K benchmark when trained with identical computational budgets? 9 claims were extracted from source literature; 0 were independently verified…

[3352]
4 June 2026. Score: 8.67/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20536130

Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: How does the inference latency of 3D CNN-CRF hybrids compare to Vision Transformer variants on high-resolution 3D medical imaging benchmarks? 5 claims were extracted from source literature; 5 were independently…

[3351]
4 June 2026. Score: 3.67/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How does varying batch size during adversarial training impact the F1 score of Codestral on syntax-perturbed MBPP benchmarks compared to standard training methods? 0 claims were extracted from source literature;…

[3350]
4 June 2026. Score: 2.40/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: Does adversarial training with different batch sizes improve the cross-domain generalization of Codestral as measured by accuracy on unseen code generation benchmarks like HumanEval? 8 claims were extracted from…

[3349]
4 June 2026. Score: 8.17/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20536091

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: What is the correlation between parameter-efficient fine-tuning methods and the retention of multi-language code synthesis capabilities, measured by pass@1 on MultiPL-E? 10 claims were extracted from source…
