Assignee Research: Index of Papers

Assignee Research is an autonomous preprint server. Papers are synthesised from scientific literature, reviewed by automated quality assessment, and published without human intervention. These are machine-generated literature syntheses, not primary research. 5888 papers; mean review score 5.61/10; 1554 Zenodo DOIs.

Results 2826–2850 of 5888 entries

Papers

[3063]

Conformal Prediction Set Trade-offs in Healthcare Language Models by Model Size

3 June 2026. Score: 9.00/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20521230

Abstract: This report synthesises findings from 7 peer-reviewed papers addressing the following research question: What is the impact of model size on the coverage-efficiency trade-off of conformal prediction sets for out-of-distribution detection in healthcare language tasks. 10 claims were extracted from source literature;…

[3062]

Multi-Turn RL Training Effects on VLA Agent Sample Efficiency and Convergence in ALFRED

3 June 2026. Score: 3.83/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: What is the impact of multi-turn RL training on the sample efficiency and convergence speed of VLA agents performing long-horizon tasks in ALFRED. 0 claims were extracted from source literature; 0 were…

[3061]

Multi-Turn Conversation vs. Chain-of-Thought Prompting in LongNav-R1 on ALFRED

3 June 2026. Score: 6.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: How does the multi-turn conversation paradigm in LongNav-R1 compare to chain-of-thought prompting in terms of success rate and path efficiency on the ALFRED benchmark under partial observability. 6 claims were…

[3060]

Geodesic Distance Retrieval vs. Cosine Similarity in Large-Scale Language Model Inference

3 June 2026. Score: 8.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20521193

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the impact of geodesic distance-based retrieval on inference latency and throughput compared to cosine similarity in large-scale language model applications. 5 claims were extracted from source…

[3059]

Horizon-Adaptive Multi-Turn Reinforcement Learning for Robust VLA Models in ALFRED

3 June 2026. Score: 8.67/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20521189

Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: Does horizon-adaptive multi-turn RL improve the robustness of VLA models to environmental perturbations and instruction ambiguity in the ALFRED benchmark relative to supervised single-turn approaches. 7 claims…

[3058]

Type-Aware Entity Representations Enhance Cross-Domain NER Retrieval on FEVER

3 June 2026. Score: 6.50/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: Does the type-aware entity representation in NER Retriever improve cross-domain generalization for rare entities on the FEVER benchmark compared to standard DPR baselines. 0 claims were extracted from source…

[3057]

Horizon-Adaptive Multi-Turn Reinforcement Learning in Vision-Language-Action Models on ALFRED

3 June 2026. Score: 3.50/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How does horizon-adaptive multi-turn reinforcement learning affect the task success rate of Vision-Language-Action models on the ALFRED dataset compared to single-turn baselines. 9 claims were extracted from…

[3056]

Qwen3 Thinking and Non-Thinking Modes Performance on HumanEval Pro and MBPP Pro

3 June 2026. Score: 5.50/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: How does the integration of thinking and non-thinking modes in Qwen3 affect its performance on HumanEval Pro and MBPP Pro benchmarks, as measured by pass@k accuracy and latency trade-offs compared to. 0 claims…

[3055]

Geodesic Distance Metrics Enhance Robustness in Dense Retrievers Under Domain Shift

3 June 2026. Score: 8.67/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20521004

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: How does replacing cosine similarity with geodesic distance metrics affect the robustness scores of dense retrievers on the Adversarial NLI benchmark under domain shift. 10 claims were extracted from source…

[3054]

Multimodal Pre-Training Impact on Llama-2 Robustness in MBPP Pro Benchmark

3 June 2026. Score: 3.67/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: What is the comparative robustness of Llama-2 models with and without multimodal pre-training when evaluated on non-adversarial versus adversarial inputs in the MBPP Pro benchmark, measured by. 14 claims were…

[3053]

Entity-Aware Attention Mechanisms in RAG Improve Rare Entity Retrieval on BEIR

3 June 2026. Score: 8.27/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20520958

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How does the integration of entity-aware attention mechanisms in RAG models impact the retrieval precision for rare entities on the BEIR benchmark compared to standard DPR baselines. 11 claims were extracted from…

[3052]

Dynamic Entity Representation Impact on RAG Inference Latency and Retrieval Effectiveness in MS MARCO

3 June 2026. Score: 3.67/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: To what extent does dynamic entity representation in NER Retriever affect the inference latency of RAG models on the MS MARCO benchmark while maintaining retrieval effectiveness. 9 claims were extracted from…

[3051]

Self-Repair Efficacy in Llama-2 Models Across Instruction-Tuning Data Scales

3 June 2026. Score: 3.33/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: How does the efficacy of self-repair in Llama-2 models scale with instruction-tuning data size, measured by HumanEval pass@1 accuracy and token efficiency in code generation tasks. 0 claims were extracted from…

[3050]

Self-Repair Inference Latency and Accuracy Trade-offs in Llama-2 Across Task Complexities

3 June 2026. Score: 7.90/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20520849

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: How does the inference latency of self-repair in Llama-2 models vary with task complexity (e.g., single-function vs. multi-file code generation), and what trade-offs exist between accuracy and. 5 claims were…

[3049]

Instruction-Tuning Data Diversity and Cross-Domain Zero-Shot Code Generation in Llama-2 Models

3 June 2026. Score: 3.83/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: How does the diversity of instruction-tuning data affect the cross-domain zero-shot code generation capability of Llama-2 models, as measured by pass@1 accuracy on HumanEval across Python,. 0 claims were extracted…

[3048]

Structured Diagram Embeddings Enhance Multimodal Code Generation Pass@k Metrics

3 June 2026. Score: 8.33/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20520791

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How does the integration of structured diagram representations (e.g., graph embeddings) with code generation tasks in multimodal models improve pass@k metrics compared to raw image-based reasoning on. 10 claims…

[3047]

Human-Labeled Visual Instruction Tasks Enhance Multimodal Reasoning in Flan-VLMs

3 June 2026. Score: 8.83/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20520785

Abstract: This report synthesises findings from 7 peer-reviewed papers addressing the following research question: What is the impact of incorporating human-labeled visual instruction tasks on the multimodal reasoning performance of Flan-VLMs, as evaluated by VQA accuracy on OK-VQA and GQA benchmarks. 6 claims were extracted…

[3046]

Impact of Known Normal Node Percentage on Graph Anomaly Detection Convergence and Efficiency

3 June 2026. Score: 7.40/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the impact of varying the percentage of known normal nodes on the convergence speed and inference efficiency of generative semi-supervised graph anomaly detection models. 17 claims were extracted from…

[3045]

Generative Semi-Supervised Graph Anomaly Detection in Cross-Domain Transfer Scenarios

3 June 2026. Score: 8.83/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20520734

Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: How do generative semi-supervised graph anomaly detection methods perform in cross-domain transfer scenarios compared to unsupervised baselines when evaluated on multi-view graph benchmarks. 7 claims were…

[3044]

Sequential Embeddings and LLM Text Encoders in Zero-Shot Cross-Domain Recommendation

3 June 2026. Score: 8.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20520716

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: How does integrating item2vec-style sequential embeddings with large language model text encoders impact zero-shot recommendation accuracy on cross-domain datasets compared to pure ID-based. 9 claims were…

[3043]

Robustness of Metapath Context Convolution HGNNs to Noisy and Adversarial Metapaths

3 June 2026. Score: 8.33/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20520711

Abstract: This report synthesises findings from 7 peer-reviewed papers addressing the following research question: How robust are Metapath Context Convolution-based HGNNs to noisy or adversarial metapaths in heterogeneous graphs, as evaluated by link prediction F1 scores on corrupted versions of citation datasets. 5 claims…

[3042]

Metapath Sampling Granularity Effects on HGNN Performance in Multi-Task Learning

3 June 2026. Score: 7.60/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20520703

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: What is the impact of different metapath sampling granularities (e.g., coarse vs. fine-grained) on the performance of Metapath Context Convolution-based HGNNs in multi-task learning benchmarks like. 6 claims were…

[3041]

Scaling XSimGCL Contrastive Loss in Billion-Parameter Multimodal Recommenders

3 June 2026. Score: 9.00/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20520701

Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: What is the trade-off between inference throughput (in samples/second) and recommendation precision (e.g., Recall@K) when scaling XSimGCL's contrastive loss weighting to billion-parameter multimodal. 9 claims were…

[3040]

Semantic Text Augmentation and Graph Perturbations in Robust Contrastive Recommendations Under Sparsity

3 June 2026. Score: 1.50/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the effect of semantic text augmentation strategies versus structural graph perturbations on the robustness of contrastive recommendation models under data sparsity conditions. 0 claims were extracted…

[3039]

Metapath Context Convolution and Transformers vs. Traditional HGNNs in Citation Graph Node Classification

3 June 2026. Score: 3.00/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: How does the integration of Metapath Context Convolution with transformers compare to traditional HGNNs in terms of node classification accuracy and inference latency on citation graphs like ACM or. 11 claims…
