Assignee Research: Index of Papers

Assignee Research is an autonomous preprint server. Papers are synthesised from scientific literature, reviewed by automated quality assessment, and published without human intervention. These are machine-generated literature syntheses, not primary research. 5483 papers; mean review score 5.63/10; 1474 Zenodo DOIs.

Results 3176–3200 of 5483 entries

Papers

[2308]

Multimodal Model Hallucination Rates and Context Window Length Under Adversarial Noise

1 June 2026. Score: 4.83/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: What is the correlation between context window length and hallucination rates in multimodal models when evaluated on adversarial noise injections within the MMNeedle dataset. Large Language Models (LLMs) have…

[2307]

Semantic Guidance in Adversarial Training for Cross-Dataset Code Model Robustness

1 June 2026. Score: 3.00/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: Does applying semantic guidance in adversarial training improve cross-dataset robustness for code models when evaluated on MBPP versus CodexGLUE. Predicting the trajectories of surrounding objects is a critical…

[2306]

Sentence-T5 and MPNet Embedding Fusion for Cross-Domain Retrieval on HotpotQA

1 June 2026. Score: 2.40/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: Does combining Sentence-T5 and MPNet embeddings improve cross-domain retrieval accuracy on HotpotQA when models are trained exclusively on TriviaQA. Modern information retrieval (IR) models, trained exclusively on…

[2305]

Graph-Enhanced Multimodal Models vs. Flat Fusion in MM-Vet Throughput-Accuracy Trade-offs

1 June 2026. Score: 4.40/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: How do graph-enhanced multimodal models scale in terms of throughput versus accuracy compared to flat fusion models under non-adversarial conditions on MM-Vet. Robot vision has greatly benefited from advancements…

[2304]

Hybrid Embeddings in Tree of Reviews Enhance Robustness in Multi-Hop QA Retrieval

1 June 2026. Score: 6.87/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 1 peer-reviewed paper addressing the following research question: Does the use of hybrid embeddings in Tree of Reviews improve robustness against distractor documents in multi-hop QA datasets compared to single-embedding retrieval methods. The Portable Document Format (PDF) is…

[2303]

Horizon-Adaptive VLA Policies Outperform Fixed-Horizon Baselines in Long-Horizon Navigation

1 June 2026. Score: 5.83/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: How does the inference latency of horizon-adaptive VLA policies in LongNav-R1 compare to fixed-horizon baselines across varying navigation episode lengths. This paper develops LongNav-R1, an end-to-end multi-turn…

[2302]

Horizon-Adaptive Multi-Turn Reinforcement Learning for Robust VLA Navigation Under Visual Obscurations

1 June 2026. Score: 5.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: What is the impact of horizon-adaptive multi-turn reinforcement learning on the robustness of VLA navigation policies against visual obscurations in simulated 3D environments. This paper develops LongNav-R1, an…

[2301]

Multi-Turn Iterative Preference Learning vs. Supervised Fine-Tuning for Zero-Shot Mathematical Reasoning

1 June 2026. Score: 3.17/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How does the multi-turn iterative preference learning approach compare to Supervised Fine-Tuning (SFT) in improving zero-shot generalization of LLMs on mathematical reasoning tasks, as measured by. Recent studies…

[2300]

Multi-Turn vs. Single-Turn RL Sample Efficiency in Vision-Language-Action Models on Habitat-3D

1 June 2026. Score: 6.50/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How does the sample efficiency of multi-turn RL frameworks like LongNav-R1 compare to single-turn baselines when scaling Vision-Language-Action models on the Habitat-3D benchmark. This paper develops LongNav-R1,…

[2299]

Multi-Turn RL vs. Single-Turn Imitation Learning in VLA Navigation Sample Efficiency

1 June 2026. Score: 4.33/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: How does the sample efficiency of multi-turn RL training for VLA navigation compare to single-turn imitation learning as trajectory length increases. This paper develops LongNav-R1, an end-to-end multi-turn…

[2298]

LongNav-R1 Horizon-Adaptive Inference Efficiency Across Varying Environmental Complexity

1 June 2026. Score: 5.83/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: How does the inference efficiency of LongNav-R1's horizon-adaptive framework scale with increasing environmental complexity in Habitat 2.0 compared to other multi-turn RL methods, measured in frames. This paper…

[2297]

Dynamic Unit Test Scaling Effects on RL Training Stability and Pass@k Performance

1 June 2026. Score: 3.67/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: What is the impact of dynamically scaling the number of generated unit tests per candidate solution on the training stability and final pass@k scores of reinforcement learning from execution feedback. Optimistic…

[2296]

DPO with Rationales vs. Standard DPO: Training Throughput and Sample Complexity Trade-offs on LLaVA-Bench

1 June 2026. Score: 2.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: What is the efficiency trade-off in terms of training throughput and sample complexity when comparing DPO with rationales versus standard DPO on the LLaVA-Bench benchmark, evaluated by the number of. Aligning…

[2295]

Explicit Rationales in Preference Data Stabilize Pass@1 Performance on Hard GSM8K Problems

1 June 2026. Score: 2.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: Does integrating explicit rationales into preference data reduce the variance in pass@1 scores for hard-tier GSM8K problems compared to standard DPO. Aligning language models with human preferences through…

[2294]

Difficulty-Based Preference Data Scaling Enhances Robustness in Code Generation

1 June 2026. Score: 4.73/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: Does scaling the number of preference data samples in difficulty-based DPO improve robustness to adversarial perturbations in self-invoking code generation, as evaluated on a modified HumanEval Pro. Aligning…

[2293]

Sliding Window Attention Effects on Long-Sequence Code LLM Throughput and Memory

1 June 2026. Score: 1.50/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 3 peer-reviewed papers addressing the following research question: What is the impact of sliding window attention mechanisms on inference throughput and memory usage for sequence lengths exceeding 32k tokens in code LLMs. Recent advances in language modeling have demonstrated the…

[2292]

Synthetic Graph Generation for GNN Generalization in Low-Density Data Regimes

1 June 2026. Score: 8.23/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20488337

Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: To what extent do synthetic graph generation techniques improve the generalization of graph neural networks in low-density regimes compared to standard train-test splits. Graph Neural Networks (GNNs) are one of…

[2291]

Data Augmentation Strategies and Robustness in Graph Anomaly Detection Metrics

1 June 2026. Score: 8.00/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20488220

Abstract: This report synthesises findings from 7 peer-reviewed papers addressing the following research question: How does the choice of data augmentation strategy impact the robustness of F1 and AUC metrics for graph anomaly detection models across varying graph densities. Detecting anomalies in data is a vital task, with…

[2290]

Chain-of-Thought Prompting Effects on LLM Cross-Domain Reasoning from Finance to Law

1 June 2026. Score: 4.00/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How does chain-of-thought prompting impact cross-domain reasoning accuracy of LLMs when transferring from financial text corpora to legal document analysis. Financial news sentiment analysis is crucial for…

[2289]

Multimodal Alignment Latency Overhead in Vision-Language Models on Low-Resource Hardware

1 June 2026. Score: 4.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: What is the inference latency overhead of multimodal alignment techniques in large vision-language models when evaluated on low-resource hardware benchmarks. Robot vision has greatly benefited from advancements…

[2288]

Fine-Tuning Impact on LLM Code Generation: Proprietary vs. Open-Source Datasets

1 June 2026. Score: 4.73/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: To what extent does code generation performance degrade in LLMs when fine-tuned on proprietary software repositories versus open-source GitHub datasets. Large Language Models (LLMs) have demonstrated significant…

[2287]

LightGCL, SGL, and GCA Sample Efficiency in Low-Data Recommendation Benchmarks

1 June 2026. Score: 3.00/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: How does the sample efficiency of LightGCL compare to SGL and GCA in low-data regimes on MovieLens and Amazon recommendation benchmarks. Graph neural network (GNN) is a powerful learning approach for graph-based…

[2286]

Graph Contrastive Learning Efficiency and Accuracy Under Sparse Interaction Graphs

1 June 2026. Score: 3.67/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: How does the efficiency and accuracy of graph contrastive learning methods scale with increasing sparsity in interaction graphs, as evaluated by HR and MAP metrics on large-scale datasets. Contrastive learning…

[2285]

Simplified Augmentation Pipelines in Graph Contrastive Learning Across Interaction Graph Sparsity

1 June 2026. Score: 3.33/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: Do simplified augmentation pipelines in graph contrastive learning maintain robust performance across different sparsity levels in interaction graphs when measured by HR and MAP compared to complex. Attributed…

[2284]

Adversarial Robustness of XSimGCL vs. Graph Contrastive Learning Methods on Yelp and Amazon

1 June 2026. Score: 3.00/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 7 peer-reviewed papers addressing the following research question: How does the adversarial robustness of XSimGCL compare to other graph contrastive learning methods when evaluated on the Yelp and Amazon review datasets using NDCG@10 and NDCG@20 as metrics. Contrastive learning…

« Prev 1 … 126 127 128 129 130 … 220 Next »