Papers
Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: What is the impact of diverse human feedback types on reward model accuracy when trained using configurable interfaces like RLHF-Blender? 9 claims were extracted from source literature; 9 were independently…
Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: How does the integration of diagram-based visual reasoning tasks in HumanEval-V impact the performance accuracy of multimodal code generation models compared to text-only benchmarks like HumanEval? 11 claims were…
Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the inference latency trade-off when scaling multimodal models to handle complex diagram interpretations in HumanEval-V tasks with increasing diagram complexity? 9 claims were extracted from source…
Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: How do MGAT and DGAT models trained on PDNS-Net generalize to other large-scale heterogeneous network datasets in terms of both accuracy preservation and computational efficiency degradation? 10 claims were…
Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: What is the impact of self-invoking code generation tasks on the adversarial robustness of large language models when evaluated using pass@1 metrics on extended HumanEval datasets? 6 claims were extracted from…
Abstract: This report synthesises findings from 2 peer-reviewed papers addressing the following research question: What is the latency performance difference between MGAT/DGAT and baseline GNN models on PDNS-Net across varying batch sizes and node feature dimensionalities for real-time network security? 6 claims were extracted…
Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: How does the pass@1 performance of GCN-enhanced code generation models degrade under PGD attacks compared to transformer baselines on the HumanEval and MBPP benchmarks? 17 claims were extracted from source…
Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: What is the impact of fine-tuning Vision-Language Models on specialized scientific datasets versus general corpora on their hallucination rates and factual consistency in image-captioning tasks? 12 claims were…
Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: What is the impact of replacing heavy vision encoders with lightweight architectures on COCO Captioning performance when integrated with GCN-based fusion layers? 11 claims were extracted from source literature;…
Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: How do GCN-enhanced multimodal models compare to transformer-only baselines in terms of throughput and robustness against adversarial text-image perturbations on the Hateful Memes dataset? 12 claims were extracted…
Abstract: This report synthesises findings from 2 peer-reviewed papers addressing the following research question: What is the impact of scaling dataset size on the generalization performance of CAGN-GAT Fusion versus traditional graph neural networks (e.g., GAT, GCN) in detecting unseen attack types in intrusion. 17 claims…
Abstract: This report synthesises findings from 5 peer-reviewed papers addressing the following research question: How does the robustness of CAGN-GAT Fusion compare to autoencoder-based models in intrusion detection when evaluated under adversarial graph perturbations using the accuracy drop ratio metric on GIN,. 12 claims…
Abstract: This report synthesises findings from 1 peer-reviewed paper addressing the following research question: How does the inference efficiency of CAGN-GAT Fusion compare to standard graph attention networks (GAT) in intrusion detection tasks, as measured by throughput benchmarks on GPU and CPU environments? 12 claims were…
Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: How do multimodal motion capture systems combining sparse IMUs and stereo cameras (e.g., Stereo-Inertial Poser) compare to monocular-visual-inertial systems in terms of pose estimation accuracy (MSE)? 14 claims…
Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: How does the use of Quaternion GANs to generate synthetic IMU data affect the training efficiency of Deep Inertial Poser compared to real-valued GANs, as measured by convergence speed and final MSE? 8 claims were…
Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: What is the impact of biased evaluation protocols on the reported alignment performance of multimodal models across different dataset contamination levels? 9 claims were extracted from source literature; 9 were…
Abstract: This report synthesises findings from 5 peer-reviewed papers addressing the following research question: What is the impact of combining Quaternion GANs with diffusion models for synthetic IMU data generation on the robustness of Deep Inertial Poser to adversarial perturbations in input motion? 10 claims were…
Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: What is the impact of applying MSCR-based data augmentation on the inference efficiency and alignment stability of code generation models under adversarial prompt perturbations? 14 claims were extracted from…
Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: Can the minimal class separation distance framework be adapted to evaluate the reasoning robustness of multimodal models against noisy image-text pair annotations? 11 claims were extracted from source literature;…
Abstract: This report synthesises findings from 5 peer-reviewed papers addressing the following research question: How does the MSCR robustness distance metric correlate with accuracy degradation in large language models under synthetic token corruption benchmarks? 8 claims were extracted from source literature; 7 were…
Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: Does the unified temporal-structural approach in DyTEGNN maintain accuracy gains over separate temporal models when evaluated on dynamic link prediction tasks with varying graph densities? 9 claims were extracted…
Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: How does the memory efficiency of MECCH compare to GIN when processing large-scale heterogeneous graphs like OGB-LSC in terms of GPU memory usage per edge? 6 claims were extracted from source literature; 6 were…
Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: What is the performance trade-off between MECCH and GAT in terms of F1-score on node classification tasks in MAG240M when varying the metapath length from 1 to 5? 9 claims were extracted from source literature; 9…
Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: How does the inference efficiency (latency, FLOPs) of GNNs pre-trained with MoCL compare to those pre-trained with traditional graph autoencoders (e.g., VGAE, GAE) when evaluated on multi-task. 12 claims were…
Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: Does the introduction of large-scale heterogeneous graph data mitigate alignment issues in multimodal models that integrate structural graph information with text embeddings? 9 claims were extracted from source…