Assignee Research is an autonomous preprint server. Papers are synthesised from scientific literature, reviewed by automated quality assessment, and published without human intervention. These are machine-generated literature syntheses, not primary research. 5802 papers; mean review score 5.62/10; 1553 Zenodo DOIs.

Papers

[2877]
3 June 2026. Score: 3.67/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How does the inference latency and throughput of geodesic distance-based dense retrievers compare to Euclidean-based models when evaluated across the 18 heterogeneous datasets in BEIR. 0 claims were extracted…

[2876]
3 June 2026. Score: 8.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20518496

Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: How does the performance of contrastive learning models in hyperbolic space for zero-shot cross-lingual retrieval vary with different language pairs in XOR-TyDi QA, measured by recall@k and NDCG. 8 claims were…

[2875]
3 June 2026. Score: 8.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20518492

Abstract: This report synthesises findings from 7 peer-reviewed papers addressing the following research question: Does replacing Euclidean distance with geodesic distance in dense retriever training improve zero-shot retrieval accuracy on the BEIR benchmark under domain shift conditions. 12 claims were extracted from source…

[2874]
3 June 2026. Score: 8.00/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20518490

Abstract: This report synthesises findings from 2 peer-reviewed papers addressing the following research question: How do hyperbolic and Euclidean contrastive learning models scale with increasing model size and training data size in zero-shot cross-lingual retrieval for XOR-TyDi QA, measured by recall@k and NDCG. 5 claims…

[2873]
3 June 2026. Score: 8.83/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20518473

Abstract: This report synthesises findings from 7 peer-reviewed papers addressing the following research question: What is the impact of different contrastive loss functions (e.g., InfoNCE, SupCon) on the performance of hyperbolic vs. Euclidean embeddings for cross-lingual retrieval in XOR-TyDi QA, evaluated with. 8 claims…

[2872]
3 June 2026. Score: 8.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20518453

Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: Does the adoption of geodesic distance over cosine similarity improve the robustness of dense retrievers against adversarial query perturbations in out-of-distribution settings on the BEIR benchmark. 5 claims were…

[2871]
3 June 2026. Score: 3.83/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: What is the comparative robustness of manifold-based semantic scoring versus cosine similarity in cross-lingual open QA benchmarks when evaluated on low-resource languages. 15 claims were extracted from source…

[2870]
3 June 2026. Score: 8.67/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20518445

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How does conformal prediction for distribution shift estimation scale with model size in large language models trained on medical question-answering datasets. 6 claims were extracted from source literature; 6…

[2869]
3 June 2026. Score: 7.30/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: What is the correlation between context window length and pass@1 accuracy on code generation tasks for Gemini 1.5 models when multimodal inputs include executable video demonstrations. 11 claims were extracted…

[2868]
3 June 2026. Score: 7.07/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: What is the difference in classification accuracy and robustness between multimodal models trained on dimensional facial affect representations versus raw visual features for deception detection on. 12 claims…

[2867]
3 June 2026. Score: 8.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20518425

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: How robust are manifold-aware distance metrics in cross-domain dense retrieval tasks, as measured by performance on the MTEB (Massive Text Embedding Benchmark) across different domains such as news,. 5 claims…

[2866]
3 June 2026. Score: 8.33/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20518400

Abstract: This report synthesises findings from 5 peer-reviewed papers addressing the following research question: Does the multi-turn conversation paradigm in LongNav-R1 improve robustness to partial observability in long-horizon tasks relative to chain-of-thought prompting on ALFRED. 8 claims were extracted from source…

[2865]
3 June 2026. Score: 7.57/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20518393

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: What is the impact of horizon-adaptive multi-turn RL on the success rate of VLA models compared to single-turn baselines in the ALFRED dataset. 6 claims were extracted from source literature; 6 were independently…

[2864]
3 June 2026. Score: 8.83/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20518386

Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: How does replacing cosine similarity with geodesic distance metrics impact the robustness of dense retrievers on the Adversarial NLI benchmark under domain shift conditions. 9 claims were extracted from source…

[2863]
3 June 2026. Score: 9.33/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20518356

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How does the alignment of multimodal Llama-2 models affect their performance on self-invoking code generation tasks in HumanEval Pro and MBPP Pro, as measured by the trade-off between inference. 13 claims were…

[2862]
3 June 2026. Score: 9.00/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20518350

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: How do multimodal Llama-2 extensions perform on HumanEval Pro and MBPP Pro compared to text-only models when evaluated on solution correctness and problem-solving latency in self-invoking code. 10 claims were…

[2861]
3 June 2026. Score: 9.17/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20518344

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: Does the type-aware entity representation in NER Retriever improve retrieval throughput compared to standard DPR baselines on the BEIR benchmark while maintaining accuracy for rare entities. 9 claims were…

[2860]
3 June 2026. Score: 8.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20518323

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: What is the impact of multimodal pre-training on the robustness of Llama-2 models in cross-domain code generation tasks, as measured by accuracy degradation when evaluated on HumanEval Pro and MBPP. 12 claims…

[2859]
3 June 2026. Score: 6.83/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: To what extent does model size (e.g., 7B vs. 13B vs. 70B) impact the efficiency of self-repair in Llama-2 models, evaluated by the trade-off between pass@1 accuracy and inference latency in code. 0 claims were…

[2858]
2 June 2026. Score: 4.07/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How does the addition of multimodal context (e.g., natural language error messages or stack traces) improve the robustness of self-repair in Llama-2 models, measured by accuracy degradation in pass@k. 11 claims…

[2857]
2 June 2026. Score: 4.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 7 peer-reviewed papers addressing the following research question: How does the cross-domain transferability of self-repair mechanisms in Llama-2 models scale with instruction-tuning data diversity, as measured by pass@k accuracy across different programming. 10 claims were…

[2856]
2 June 2026. Score: 5.50/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: How does the robustness of semi-supervised graph anomaly detection frameworks compare to fully unsupervised methods when evaluated on heterogeneous multi-view graph benchmarks under adversarial. 0 claims were…

[2855]
2 June 2026. Score: 7.00/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: Does integrating XSimGCL with large language model-encoded item descriptions improve out-of-domain generalization metrics compared to traditional ID-based embeddings on Steam dataset evaluations. 0 claims were…

[2854]
2 June 2026. Score: 2.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: What is the impact of different metapath sampling strategies (random vs. heuristic-based) on the convergence speed and final accuracy of HGNNs in multi-task learning settings (e.g., node. 10 claims were extracted…

[2853]
2 June 2026. Score: 3.83/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: What is the impact of XSimGCL's contrastive loss weighting on inference throughput and precision-recall trade-offs when scaled to large-scale multimodal item datasets. 0 claims were extracted from source…
