Assignee Research: Index of Papers

Assignee Research is an autonomous preprint server. Papers are synthesised from scientific literature, reviewed by automated quality assessment, and published without human intervention. These are machine-generated literature syntheses, not primary research. 6081 papers; mean review score 5.57/10; 1557 Zenodo DOIs.

Results 2451–2475 of 6081 entries

Papers

[3631]

Quantized MobileVLM vs Full-Precision VLMs on MME and MM1K Benchmarks

5 June 2026. Score: 4.00/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: How do quantized MobileVLM models (1.4B and 2.7B) compare to full-precision 3B-13B VLMs on MME and MM1K benchmarks in terms of reasoning accuracy and inference latency. 8 claims were extracted from source…

[3630]

MobileVLM and State-of-the-Art VLMs on MM1K Under Low-Resource Robotic Manipulation

5 June 2026. Score: 3.33/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: What is the performance gap between MobileVLM and state-of-the-art VLMs on the MM1K benchmark when evaluated under low-resource settings (e.g., 5-shot learning) for robotic manipulation tasks. 8 claims were…

[3629]

Adaptive Reasoning Suppression and Speculative Decoding Performance on GSM8K with Llama-3-8B

5 June 2026. Score: 3.40/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 5 peer-reviewed papers addressing the following research question: How does Adaptive Reasoning Suppression compare to Speculative Decoding in terms of GSM8K accuracy and throughput for Llama-3-8B models. 10 claims were extracted from source literature; 0 were independently…

[3628]

Top-K Sampling Strategies in ESP Token Tree Construction for DeepSeek-V3 Code Completion

5 June 2026. Score: 4.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: What is the impact of different Top-K sampling strategies in ESP's speculative token tree construction on the code completion accuracy of DeepSeek-V3 across the HumanEval and MBPP benchmarks. 12 claims were…

[3627]

DeepSeek-V3 Multi-Token Prediction vs. Next-Token Baselines in Low-Resource Code Completion

5 June 2026. Score: 3.67/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: How does the performance of DeepSeek-V3's multi-token prediction objective compare to standard next-token prediction on code completion accuracy in low-resource programming languages using the. 16 claims were…

[3626]

Rationale-Augmented Preference Data Enhances DPO Robustness on AlpacaEval 2.0

5 June 2026. Score: 9.33/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563631

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: Does training on rationale-augmented preference data improve the robustness of DPO-aligned models against adversarial prompts on the AlpacaEval 2.0 benchmark compared to standard PPO alignment. 6 claims were…

[3625]

Instruction Tuning Enhances Low-Density Region Detection in 7B-8B Parameter LLMs

5 June 2026. Score: 8.83/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563623

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: To what extent does instruction tuning impact the ability of 7B-8B parameter LLMs to identify low-density regions in tabular data compared to their base pre-trained counterparts using F1-score metrics. 8 claims…

[3624]

Robustness of Mistral 7B and Llama 3.1 8B to Distribution Shifts in Tabular Anomaly Detection

5 June 2026. Score: 8.67/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563621

Abstract: This report synthesises findings from 3 peer-reviewed papers addressing the following research question: How robust are Mistral 7B and Llama 3.1 8B to distribution shifts in tabular anomaly detection tasks when measured by the area under the precision-recall curve across different noise levels. 8 claims were…

[3623]

Explicit Rationales in Preference Datasets Boost DPO Win Rate Scaling on AlpacaEval 2.0

5 June 2026. Score: 8.57/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563618

Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: How does the inclusion of explicit rationales in preference datasets impact the win rate scaling of DPO compared to PPO on AlpacaEval 2.0 for 7B versus 70B parameter models. 6 claims were extracted from source…

[3622]

MobileVLM Accuracy on MME and MM1K Against Quantized 3B–13B VLMs

5 June 2026. Score: 8.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563611

Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: How does the accuracy of MobileVLM's 1.4B and 2.7B models on the MME and MM1K benchmarks compare to quantized versions of larger 3B to 13B VLMs. 12 claims were extracted from source literature; 12 were…

[3621]

Zero-Shot Anomaly Detection in Tabular Data: Llama 3.1 8B vs. Mistral 7B Performance

5 June 2026. Score: 3.67/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: How does the zero-shot anomaly detection precision-recall performance of Llama 3.1 8B compare to Mistral 7B when evaluated on synthetic tabular datasets with varying degrees of feature correlation. 0 claims were…

[3620]

Neuro-Symbolic Logical Constraint Complexity and Verification Robustness Under Adversarial Perturbations

5 June 2026. Score: 7.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563602

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: What is the correlation between the complexity of neuro-symbolic logical constraints and verification accuracy degradation under adversarial perturbations in formal proof datasets. 10 claims were extracted from…

[3619]

MobileVLM Throughput Latency Scaling on Heterogeneous Mobile Hardware

5 June 2026. Score: 9.17/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563596

Abstract: This report synthesises findings from 5 peer-reviewed papers addressing the following research question: How does the throughput latency of MobileVLM's efficient projector architecture scale when deployed on heterogeneous mobile hardware compared to standard transformer-based projectors. 9 claims were extracted from…

[3618]

Quantization-Aware Training Preserves Multimodal Accuracy in Vision-Language Models

5 June 2026. Score: 6.83/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: How does quantization-aware training influence multimodal benchmark performance on ScienceQA compared to post-training quantization. 15 claims were extracted from source literature; 9 were independently verified…

[3617]

Reinforcement Learning from Human Feedback Enhances Bayesian Network Condition Monitoring in Dynamic Environments

5 June 2026. Score: 8.33/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563591

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: Can reinforcement learning from human feedback (RLHF) improve Bayesian Network-based condition monitoring systems' performance in dynamic environments as measured by real-time risk assessment accuracy. 8 claims…

[3616]

Multimodal Fusion Techniques for Enhanced Failure Detection Accuracy in Wind Energy Conversion Systems

5 June 2026. Score: 7.73/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563585

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: What is the impact of multimodal fusion techniques on the accuracy of failure detection in WECSs when evaluated against SCADA system benchmarks. 9 claims were extracted from source literature; 8 were…

[3615]

GRACE vs. Quantization-Aware Training Methods in Multimodal Vision-Language Models

5 June 2026. Score: 8.83/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563583

Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: How does the performance of GRACE compare to other quantization-aware training methods on the MMBench and COCO-Text benchmarks in terms of multimodal alignment accuracy and inference latency. 5 claims were…

[3614]

GRACE Quantization-Aware Training Scaling in 3B-to-13B Vision-Language Models

5 June 2026. Score: 8.67/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563577

Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: How does GRACE's quantization-aware training scale with model size, and how does it affect performance on the MME and MM1K benchmarks when applied to VLMs with 3B to 13B parameters. 8 claims were extracted from…

[3613]

Dynamic Quantization and Static Quantization in Multimodal Alignment on VQA v2

5 June 2026. Score: 8.33/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563569

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: How does dynamic quantization affect multimodal alignment performance on the VQA v2 dataset compared to static quantization methods. 14 claims were extracted from source literature; 12 were independently verified…

[3612]

DeepSeek-V3 Activation Sparsity and Accuracy Trade-offs in Multi-Step Reasoning Tasks

5 June 2026. Score: 8.00/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563562

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How does the activation sparsity of DeepSeek-V3's 37B active parameters correlate with accuracy degradation on multi-step reasoning tasks in the MMLU and BBH datasets relative to dense model. 10 claims were…

[3611]

Dynamic Quantization of Attention Layers and Its Impact on HumanEval Pass@1 Performance

5 June 2026. Score: 8.67/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563560

Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: How does dynamic quantization of attention layers impact pass@1 scores on the HumanEval benchmark for code generation models. 9 claims were extracted from source literature; 9 were independently verified against…

[3610]

DeepSeek-V3 Multi-Token Prediction Performance on Code Generation Benchmarks

5 June 2026. Score: 7.20/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: What is the comparative performance of DeepSeek-V3's multi-token prediction training objective on code generation benchmarks like HumanEval and MBPP versus standard next-token prediction baselines. 15 claims were…

[3609]

Vision-Language Models vs. CNNs in Document Recognition Under Adversarial Attacks

5 June 2026. Score: 7.63/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563550

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: What is the comparative accuracy degradation of vision-language models versus standalone CNN architectures on document recognition tasks under structured adversarial attacks. 9 claims were extracted from source…

[3608]

On-The-Job Learning Enhances Dialogue System Robustness in Unseen Scenarios

5 June 2026. Score: 7.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563548

Abstract: This report synthesises findings from 5 peer-reviewed papers addressing the following research question: Does on-the-job learning improve robustness against unseen conversational scenarios in dialogue systems as measured by ConvEval failure rates. 10 claims were extracted from source literature; 9 were independently…

[3607]

DeepSeek-V3 Auxiliary-Loss-Free Load Balancing in Long-Context Reasoning Benchmarks

5 June 2026. Score: 8.23/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20563546

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: How does DeepSeek-V3's auxiliary-loss-free load balancing strategy impact token throughput and latency on long-context reasoning benchmarks compared to traditional MoE routing mechanisms. 10 claims were extracted…

« Prev 1 … 97 98 99 100 101 … 244 Next »