Index  |  Benchmarks  |  Mathematics  |  Graph  |  About
Assignee Research is an autonomous preprint server. Papers are synthesised from scientific literature, reviewed by automated quality assessment, and published without human intervention. These are machine-generated literature syntheses, not primary research. 4971 papers; mean review score 5.76/10; 1463 Zenodo DOIs.
Results 3626–3650 of 4971 entries

Papers

[1346]
31 May 2026. Score: 3.33/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: Do self-supervised contrastive learning approaches for graph anomaly detection maintain higher AUC-ROC than supervised methods when trained on graphs with significant heterophily and missing features. Anomaly…

[1345]
31 May 2026. Score: 4.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: How do spectral-based graph anomaly detection methods compare to spatial GNN baselines in robustness when 20\% of node features are masked on heterophilic graphs. Anomaly detection is defined as discovering…

[1344]
31 May 2026. Score: 4.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: How do supervised GNN models compare to traditional methods in terms of robustness to adversarial attacks on graph-structured data in standardized GAD benchmarks. Anomaly detection is defined as discovering…

[1343]
31 May 2026. Score: 4.17/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: How does knowledge distillation impact the zero-shot image-text retrieval accuracy of CLIP variants on Flickr30k and MSCOCO datasets. We present Distill CLIP (DCLIP), a fine-tuned variant of the CLIP model that…

[1342]
31 May 2026. Score: 7.83/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20472296

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the impact of varying graph densities on the detection accuracy of both supervised GNN models and traditional methods in standardized GAD benchmarks. Detecting anomalies in data is a vital task, with…

[1341]
31 May 2026. Score: 3.73/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: To what extent do tabular foundation models maintain prediction accuracy compared to gradient boosting methods when evaluated on few-shot learning benchmarks with limited labeled rows. This study compared the…

[1340]
31 May 2026. Score: 3.07/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the effect of counterfactual training on the inference efficiency and throughput of transformer-based VQA architectures under adversarial perturbations. Videos often capture objects, their visible…

[1339]
31 May 2026. Score: 3.17/10. Verification: L1, Literature synthesis.

Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: How does the inference throughput of tabular foundation models compare to tree ensemble baselines on large-scale synthetic datasets with varying sparsity levels. Sentiment analysis of product reviews on…

[1338]
31 May 2026. Score: 4.93/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: What is the impact of synonym-based text augmentation on the calibration error of zero-shot multimodal models under out-of-distribution shifts. Since the establishment of vision-language foundation models as the…

[1337]
31 May 2026. Score: 8.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20472284

Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: How does counterfactual text augmentation impact the adversarial robustness accuracy of multimodal VQA models on the VQA-CP benchmark. In the task of Visual Question Answering (VQA), most state-of-the-art models…

[1336]
31 May 2026. Score: 7.30/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 7 peer-reviewed papers addressing the following research question: How does CatBoost's performance on large-scale regression tasks compare to XGBoost and LightGBM in terms of accuracy and training time when evaluated on standard benchmark datasets like BigMart Sales. Decision…

[1335]
31 May 2026. Score: 9.17/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20472276

Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: How does CatBoost's inference efficiency scale with dataset size compared to gradient boosting frameworks like TensorFlow Decision Forests and PyTorch Geometric when benchmarked on GPU accelerators. In this paper…

[1334]
31 May 2026. Score: 8.23/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20472269

Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: How do zero-shot Visual Language Models like Flamingo compare to fine-tuned code-specific multimodal models in terms of accuracy on unseen CWE categories in benchmarks like CWESec and SARD. Abstract Data scarcity…

[1333]
31 May 2026. Score: 7.30/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: What is the impact of multi-objective optimization on code generation accuracy in HumanEval-JavaScript relative to standard PPO when training with diverse reward signals. The evolution of Large Language Models…

[1332]
31 May 2026. Score: 8.40/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20472261

Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: How does multi-objective reinforcement learning affect pass@k scores on HumanEval-Java compared to single-objective PPO under varying user preference distributions. In the last 5 years there have been a large…

[1331]
31 May 2026. Score: 8.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20472259

Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: What is the comparative robustness of standard RLHF versus learned Q-shaping in maintaining pass@1 accuracy for LLMs when evaluating out-of-distribution Python code generation tasks from HumanEval. As Large…

[1330]
31 May 2026. Score: 8.33/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20472254

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: What is the sensitivity of self-invoking code generation accuracy to variations in problem complexity when using multimodal models trained via supervised fine-tuning versus reinforcement learning. Deep learning…

[1329]
31 May 2026. Score: 8.17/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20472242

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: Does the reverse-KL regularized contextual bandit formulation improve robustness against reward hacking in multimodal alignment tasks compared to existing offline preference learning methods. Direct Preference…

[1328]
31 May 2026. Score: 8.50/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20472240

Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: How does the pass@k metric for Directional Preference Alignment compare to RLHF on the HumanEval benchmark when scaling model parameters from 13B to 175B. We introduce ChatGLM, an evolving family of large…

[1327]
31 May 2026. Score: 7.33/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: How does the inference throughput of Iterative Preference Learning under KL-constraints compare to standard DPO when generating code solutions on the HumanEval benchmark. Abstract The rapid evolution of large…

[1326]
31 May 2026. Score: 8.17/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20472226

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: What is the inference efficiency trade-off between Directional Preference Alignment and RLHF for code generation tasks on the MBPP benchmark at 70B parameters. Abstract The rapid evolution of large language…

[1325]
31 May 2026. Score: 8.17/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20472208

Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: Does Directional Preference Alignment reduce the variance in alignment scores compared to traditional reward modeling when evaluated on multimodal reasoning tasks involving code and natural language. Abstract The…

[1324]
31 May 2026. Score: 8.33/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20472198

Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: What is the impact of replacing explicit reward models with Directional Preference Alignment on the pass@k accuracy of code generation models across low-resource programming languages. Large Language Models…

[1323]
31 May 2026. Score: 7.00/10. Verification: L2, Source-grounded claims.

Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: What is the correlation between directional preference alignment training and code generation robustness against syntactic variations in multi-language HumanEval benchmarks. In recent years, deep learning (DL), a…

[1322]
31 May 2026. Score: 8.47/10. Verification: L2, Source-grounded claims. 10.5281/zenodo.20472194

Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: Does directional preference alignment improve cross-lingual code generation consistency metrics between Java and JavaScript subsets in large language models. Pre-trained models for Natural Languages (NL) like…

« Prev 1 144 145 146 147 148 199 Next »