Assignee Research: Index of Papers

Assignee Research is an autonomous preprint server. Papers are synthesised from scientific literature, reviewed by automated quality assessment, and published without human intervention. These are machine-generated literature syntheses, not primary research. 5387 papers; mean review score 5.65/10; 1473 Zenodo DOIs.

Results 3276–3300 of 5387 entries

Papers

[2112]

Dynamic Reward Scaling vs. Human-Crafted Unit Tests in Code Generation Benchmarks

1 June 2026. Score: 5.83/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: How does dynamic reward scaling perform relative to human-crafted unit tests in terms of code correctness and inference latency when evaluated on the HumanEval and SQuTR benchmarks using a fixed. Current large…

[2111]

Dynamic Scaling of Unit Tests for Efficient LLM Code Generation and Benchmark Performance

1 June 2026. Score: 6.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the impact of using dynamically scaled unit tests on the inference efficiency (e.g., FLOPs, latency) of LLMs during code generation, and how does it correlate with solution correctness in. Current large…

[2110]

Multi-Stage Validation Frameworks for LLM Code Generation: Reward Accuracy and Training Stability

1 June 2026. Score: 6.43/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: How does the multi-stage validation framework with progressively complex unit tests compare to human-written test suites (e.g., HumanEval, MBXP) in terms of reward signal accuracy and training. Current large…

[2109]

Unit Test Complexity and Dynamic Reward Scaling in Code Generation Performance

1 June 2026. Score: 6.13/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: What is the impact of varying the unit test complexity (e.g., simple vs. multi-step assertions) on the trade-off between inference efficiency and solution accuracy in dynamic reward scaling. Current large…

[2108]

Scalability of DPO and RLHF in Large Multimodal Reasoning Benchmarks

1 June 2026. Score: 3.07/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How does the scalability of DPO compare to RLHF in terms of training throughput when applied to large multimodal reasoning benchmarks like MMBench or SEED-Bench. This paper studies the alignment process of…

[2107]

KL-Constraint Hyperparameter Effects on RLHF and DPO Alignment in Corrupted Multimodal Benchmarks

1 June 2026. Score: 2.83/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: What is the impact of varying KL-constraint hyperparameters on the alignment performance of RLHF versus DPO when evaluated on corrupted image-text pairs from multimodal benchmarks like LLaVA-Bench. Aligning…

[2106]

Difficulty-Based Preference Dataset Scaling and Its Impact on Model Performance in GSM8K and MATH

1 June 2026. Score: 4.83/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: What is the impact of scaling the difficulty-based preference dataset size (e.g., 1K to 100K samples) on model performance on the GSM8K or MATH benchmarks, and how does this compare to scaling data. Aligning…

[2105]

Explicit Rationales in Preference Data Enhance Alignment Consistency Across GSM8K Difficulty Tiers

1 June 2026. Score: 3.00/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: Does integrating explicit rationales into preference data improve the consistency of alignment across different question difficulty tiers in the GSM8K mathematical reasoning benchmark compared to. Aligning…

[2104]

Difficulty-Based Preference Data Selection Enhances DPO for Code Generation Alignment

1 June 2026. Score: 5.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: Does incorporating difficulty-based preference data selection with DPO lead to better alignment on code generation tasks (HumanEval, MBPP) compared to standard DPO, as measured by pass@1 score and. We introduce…

[2103]

Rationale-Augmented Preference Optimization Enhances Few-Shot Learner Robustness to Syntactic Adversarial Attacks

1 June 2026. Score: 4.83/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the impact of rationale-augmented preference optimization on the robustness of few-shot learners against syntactic adversarial attacks across different model scales. State-of-the-art few-shot learning…

[2102]

Sliding Window vs. Full Attention in Code Generation: Token-Level Accuracy on Long-Context Benchmarks

1 June 2026. Score: 3.17/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How does sliding window attention in code generation models affect token-level accuracy on long-context programming benchmarks compared to full attention. Transformers are quickly becoming one of the most heavily…

[2101]

Scaling Node Perturbations and Edge Modifications in Graph Neural Networks for Network Intrusion Detection

1 June 2026. Score: 4.67/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: How does the scaling of node feature perturbations versus edge structure modifications affect the inference efficiency and error rates of graph neural networks in network intrusion detection tasks. Cybersecurity…

[2100]

Scaling Efficiency and Robustness Trade-offs in GNN-Based NIDS via Gradient Bypass

1 June 2026. Score: 4.07/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: How does the computational efficiency of bypassing obfuscated gradients in GNN-based NIDS models scale with increasing network size, and what is the trade-off between robustness and inference time on. Machine…

[2099]

Obfuscated Gradients in GNN-Based NIDS Against Structural Adversarial Attacks on KDD Cup 99

1 June 2026. Score: 3.00/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How do obfuscated gradients in GNN-based NIDS models compare to other gradient masking techniques in terms of their robustness against structural adversarial attacks on the KDD Cup 99 dataset, as. The integration…

[2098]

Comparative Degradation of GNN-Based NIDS Under Adversarial Perturbations Across Datasets

1 June 2026. Score: 4.83/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the comparative degradation in classification accuracy and robustness scores of GNN-based NIDS models under universal adversarial perturbations when trained on CIC-IDS2017 and evaluated on. Intrusion…

[2097]

Adversarial Transferability of Gradient-Obfuscation Attacks Across Graph-Based Tasks

1 June 2026. Score: 4.93/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the cross-domain transferability of attack techniques that circumvent obfuscated gradients in GNN-based NIDS models when applied to other graph-based tasks, such as node classification in. Intrusion…

[2096]

Multimodal and Unimodal GNN Performance Under Varying Graph Size and Heterogeneity

1 June 2026. Score: 8.83/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20482922

Abstract: This report synthesises findings from 3 peer-reviewed papers addressing the following research question: What is the impact of graph size and heterogeneity on the classification accuracy and convergence speed of multimodal versus unimodal GNNs, as measured on benchmarks such as the Open Graph Benchmark. It is a long…

[2095]

Multimodal vs. Unimodal Graph Neural Networks in Large-Scale Heterogeneous Graph Inference

1 June 2026. Score: 8.83/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20482918

Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: How do multimodal graph neural networks compare to unimodal GNNs in terms of inference latency and memory efficiency when evaluated on large-scale heterogeneous graph benchmarks like PDNS-Net under. In this paper…

[2094]

Multi-View Aggregation in Graph Anomaly Detection: Scalability, F1 Scores, and Latency Trade-offs

1 June 2026. Score: 8.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20482916

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: How does multi-view aggregation in graph anomaly detection frameworks affect the F1 score and inference latency when scaling from single-edge devices to distributed edge computing environments. Machine learning…

[2093]

Quantization and Pruning Effects on GNN-Based Anomaly Detection in Edge Devices

1 June 2026. Score: 8.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20482908

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: What is the impact of quantization techniques on inference latency and F1 score for GNN-based anomaly detection models deployed on resource-constrained devices, and how does it compare to model. Unlike previous…

[2092]

Reverse-KL Regularization in RLHF Mitigates Multimodal Reasoning Degradation Under Adversarial Perturbations

1 June 2026. Score: 7.83/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20482901

Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: Does incorporating reverse-KL regularization during RLHF training reduce performance degradation on multimodal reasoning tasks when evaluated on adversarially perturbed VQA datasets. Recently, ChatGPT, along with…

[2091]

Memory Replay Integration in GNN-Based Anomaly Detection Throughput and Performance Trade-offs

1 June 2026. Score: 8.00/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20482899

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: What is the throughput impact of integrating memory replay mechanisms into GNN-based anomaly detection systems, measured by inference latency and F1 score trade-offs on the UNSW-NB15 dataset. Given the…

[2090]

Multimodal vs. Unimodal Models in Continual Learning: Accuracy Retention on Sequential Datasets

1 June 2026. Score: 8.67/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20482897

Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: How do multimodal models (e.g., CLIP or LXMERT) perform in continual learning scenarios compared to unimodal models, as measured by accuracy retention on sequential datasets like Visual Genome or. Multimodal…

[2089]

Reverse-KL Regularization in RLHF Enhances Robustness of Vision-Language Models on VQA-Adv

1 June 2026. Score: 6.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: How does reverse-KL regularization in RLHF impact the robustness of vision-language models against adversarial perturbations on the VQA-Adv benchmark. In the last few years, the deep learning (DL) computing…

[2088]

Reverse-KL Regularization in Contextual Bandits for Noisy OCR Accuracy Improvement

1 June 2026. Score: 8.67/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: To what extent does the reverse-KL regularized contextual bandit approach improve OCR accuracy under noisy input conditions compared to standard KL penalties on OCR-VQA. Concept drift primarily refers to an online…

« Prev 1 … 130 131 132 133 134 … 216 Next »