Assignee Research: Index of Papers

Assignee Research is an autonomous preprint server. Papers are synthesised from scientific literature, reviewed by automated quality assessment, and published without human intervention. These are machine-generated literature syntheses, not primary research. 5429 papers; mean review score 5.65/10; 1474 Zenodo DOIs.

Results 3226–3250 of 5429 entries

Papers

[2204]

W4A4 vs. INT8 Quantization Impact on Llama-2-7B HumanEval Performance and H100 Latency

1 June 2026. Score: 3.33/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: How does W4A4 quantization affect the HumanEval pass@1 score of Llama-2-7B compared to INT8 quantization while maintaining real-time inference latency on NVIDIA H100 GPUs? Reducing the latency and model size has…

[2203]

CodeT5 Performance in Cross-Domain Code Completion Across Python and Java

1 June 2026. Score: 7.33/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: How do CodeT5 models perform in cross-domain code completion tasks (e.g., Python to Java) compared to domain-specialized models, and what metrics (e.g., BLEU, accuracy) best capture these differences? Benchmark…

[2202]

Real-Time Vulnerability Classification Trade-offs in CodeT5 and State-of-the-Art Code Models

1 June 2026. Score: 2.00/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: How does the trade-off between real-time vulnerability classification accuracy and throughput compare between CodeT5 models and other state-of-the-art code language models (e.g., CodeGen, CodeGPT)? Many ML-based…

[2201]

DeepSeek R1 Robustness to LoRA Interference vs Full Fine-Tuning on MBPP

1 June 2026. Score: 3.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: How does the robustness of DeepSeek R1 to interference in LoRA-based fine-tuning compare to full fine-tuning when evaluated on the MBPP benchmark in terms of accuracy and latency? Recently, the instruction-tuning…

[2200]

Scalability of CodeT5-Based Vulnerability Detection in IDEs for Incremental Code Changes

1 June 2026. Score: 3.83/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: What is the scalability of CodeT5-based vulnerability detection in IDE environments when processing incremental code changes versus full-file analysis, as measured by latency per code edit and. In the rapidly…

[2199]

CodeT5 Vulnerability Detection: IDE Integration vs Standalone Performance Trade-offs

1 June 2026. Score: 3.00/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: How does the integration of CodeT5-based vulnerability detection into IDE environments compare to standalone processing in terms of token-level latency and GPU memory utilization when evaluated on. Texture…

[2198]

LoRA Orthogonality Effects on Codestral Inference Throughput in HumanEval

1 June 2026. Score: 2.67/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: What is the impact of imperfect orthogonality in LoRA-based fine-tuning on the inference throughput of Codestral when evaluated on the HumanEval benchmark? Pre-training Large Language Models (LLMs) on web-scale…

[2197]

Spreading Factor Impact on Llama3 QLoRA Fine-Tuning Performance in QuixBugs

1 June 2026. Score: 4.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 7 peer-reviewed papers addressing the following research question: How does the choice of spreading factor (SF) in LoRa modulation affect the F1-score performance of Llama3 on QuixBugs when using QLoRA fine-tuning compared to full fine-tuning? Pre-training Large Language Models…

[2196]

DeepSeek R1, Llama3, and Codestral Trade-offs in Vulnerability Detection Efficiency and Accuracy

1 June 2026. Score: 4.67/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the trade-off between inference efficiency (latency, throughput) and F1-score performance for Llama3, Codestral, and DeepSeek R1 when deployed for vulnerability detection across multiple. This study…

[2195]

Fine-Tuning Hyperparameters and Cross-Language Generalization in Code LLMs on Big-Vul

1 June 2026. Score: 2.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: How does the choice of fine-tuning hyperparameters impact the cross-language generalization performance of Llama3, Codestral, and DeepSeek R1 on Big-Vul, as measured by F1-score gaps between seen and. Anomaly…

[2194]

Vendi-RAG Diversity-Weight Impact on FLAN-T5-xl Alignment with Human Preferences

1 June 2026. Score: 4.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How does the diversity-weight parameter in Vendi-RAG influence the alignment of FLAN-T5-xl outputs with human preferences on the TruthfulQA benchmark compared to BM25 retrieval? Retrieval-augmented generation…

[2193]

Vendi-RAG Diversity-Weight Tuning and Its Effects on FLAN-T5-xl Code Generation Accuracy

1 June 2026. Score: 4.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: What is the impact of varying the diversity-weight parameter in Vendi-RAG on the accuracy of FLAN-T5-xl for code generation tasks in the HumanEval benchmark compared to BM25 retrieval? Retrieval-augmented…

[2192]

Full-Graph vs. Mini-Batch Training Robustness in Adversarial Graph Neural Networks

1 June 2026. Score: 6.60/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: Does the choice between full-graph and mini-batch training pipelines affect the robustness of Graph Neural Networks against adversarial perturbations in control flow graphs used for security analysis? Malware…

[2191]

System-Level Optimizations for Mini-Batch GNN Training in Multimodal Vulnerability Detection

1 June 2026. Score: 2.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: To what extent do system-level optimizations for mini-batch GNN training improve inference throughput and memory efficiency when deploying multimodal vulnerability detectors on resource-constrained. Graph Neural…

[2190]

DeepSeek R1 and Codestral Memory and Latency with IceCache vs. On-GPU KV-Cache

1 June 2026. Score: 3.67/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: How does the memory consumption and latency of DeepSeek R1 and Codestral compare when using IceCache's KV-cache management against traditional on-GPU KV-cache in autoregressive generation tasks with. Key-Value…

[2189]

Mini-Batch vs Full-Graph Training in GNNs for Code Vulnerability Detection

1 June 2026. Score: 4.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: How does mini-batch training impact the convergence speed and final accuracy of Graph Neural Networks for code vulnerability detection compared to full-graph training on large-scale software. Full-graph and…

[2188]

Throughput-Accuracy Trade-offs in Code Generation with DeepSeek R1 and Codestral on Constrained Hardware

1 June 2026. Score: 3.67/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: What is the trade-off between throughput and accuracy when deploying DeepSeek R1 and Codestral with IceCache's external vector database-based KV-cache on resource-constrained hardware for code. Key-Value (KV)…

[2187]

Code Property Graphs with Commit Messages Enhance Robustness in Vulnerability Detection

1 June 2026. Score: 4.00/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: How does the robustness of vulnerability detection models trained on code property graphs with integrated commit messages vary against adversarial code perturbations compared to models using only. Deep Neural…

[2186]

Convolutional and Graph Neural Networks for Code Property Graph Analysis with Commit Embeddings

1 June 2026. Score: 4.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: What is the effect of using convolutional neural networks versus graph neural networks on the inference efficiency and detection accuracy when processing code property graphs augmented with commit. The increasing…

[2185]

Natural Language Commit Messages Enhance Vulnerability Detection in Code Property Graphs

1 June 2026. Score: 6.67/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How does incorporating natural language commit messages into code property graph representations impact the F1-score of vulnerability detection models on the Big-Vul dataset compared to graph-only. A commit…

[2184]

Synthetic Claude 2 Data Training Enhances Small Language Model Robustness in Adversarial NLI

1 June 2026. Score: 3.67/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: To what extent does training on synthetic data from Claude 2 improve the cross-domain robustness of small language models on adversarial NLI datasets compared to ChatGPT-3.5-Turbo? Natural Language Inference…

[2183]

Inference Latency of Fine-Tuned Video Encoders on Synthetic vs. Human Gesture Data

1 June 2026. Score: 3.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: How does the inference latency of large pre-trained video encoders change when fine-tuned on synthetic gesture data versus human-annotated datasets across varying batch sizes? In this work, we explore the…

[2182]

Synthetic Video Training Effects on Zero-Shot Cross-Domain Generalization in Video-Language Models

1 June 2026. Score: 5.67/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: To what extent does training on synthetic video data impact the zero-shot cross-domain generalization accuracy of multimodal video-language models compared to models trained on real-world annotations? In this…

[2181]

Synthetic Data Quality Effects on Small Language Model Inference Efficiency

1 June 2026. Score: 4.33/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: What is the impact of synthetic data source quality on the inference efficiency and throughput of small language models trained for natural language inference tasks? The evolution of Generative Pre-trained…

[2180]

Small Language Model Reasoning Accuracy on NLI After Synthetic Data Fine-Tuning

1 June 2026. Score: 4.07/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: How does the reasoning accuracy of small language models on NLI benchmarks change when fine-tuned on synthetic data from ChatGPT-3.5-Turbo compared to data from ChatGPT-4? Large Language Models (LLMs) have…
