Papers
Abstract: This report synthesises findings from 1 peer-reviewed paper addressing the following research question: How does the choice of LoRA rank (e.g., 4, 8, 16) impact the cross-domain generalization of Wan2.1 I2V-14B when evaluated on FVD and LPIPS across diverse human video synthesis datasets like HuVAE or. Similarity…
Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How does the LoRA rank scaling in cross-attention layers affect the inference efficiency (in tokens/second) of Wan2.1 I2V-14B compared to full fine-tuning on downstream video synthesis tasks. With the…
Abstract: This report synthesises findings from 1 peer-reviewed paper addressing the following research question: How does the joint latent space compression in W.A.L.T's causal encoder compare to specialized latent diffusion models like Stable Diffusion Video in terms of Fréchet Inception Distance (FID) and KL. Video…
Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: How does the causal encoder design in W.A.L.T impact downstream performance when integrated with state-of-the-art multimodal models like Flamingo or PaLI on video captioning benchmarks (e.g.,. Multimodal learning…
Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How do quantized DeepCoNN models perform relative to full-precision alternatives in cross-domain recommendation scenarios (e.g., e-commerce vs. social media) under strict latency constraints. In recent years,…
Abstract: This report synthesises findings from 16 peer-reviewed papers addressing the following research question: What is the trade-off between inference efficiency (tokens/sec) and alignment performance when joint modeling user reviews in recommendation tasks using LLMs. Effectively modeling the dynamic nature of user…
Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How robust are LLM-based recommendation agents to noisy or adversarial review data when evaluated on cross-domain benchmarks like HUMAN-EVAL-R for code generation. Current evaluation frameworks and benchmarks for…
Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: How does the synthetic data composition in phi-3-mini's training affect its performance degradation on out-of-distribution benchmarks compared to models trained primarily on natural web data. In this work, we…
Abstract: This report synthesises findings from 1 peer-reviewed paper addressing the following research question: How does 4-bit versus 8-bit quantization impact the HumanEval pass@1 scores of code generation models when evaluated on different programming languages. Democratization of AI is an important topic within the…
Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: How does the adversarial robustness of GADT3 compare to other graph diffusion models like GDM or GDE under targeted node feature perturbations, measured by AUC-ROC on synthetic and real-world traffic. Timely…
Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: How do Deepseek R1 and Codestral compare in inference latency and token generation accuracy when evaluated on the Qiskit HumanEval benchmark across different quantum circuit complexity levels. Large Language…
Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: What is the computational efficiency trade-off between GADT3 and traditional GCN-based traffic prediction models when defending against adversarial graph structure attacks, measured by inference. We trained a…
Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: To what extent can knowledge distillation from large language models improve the inference efficiency of small language models in code generation tasks, as evaluated by latency and pass@k metrics on. In the last…
Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: What is the relationship between activation sparsity ratios and code generation accuracy degradation in state-spaces/lm-eval-harness when pruning to cold neurons only in PowerInfer's pipeline. This paper…
Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: How does the performance of GADT3 scale with increasing graph size and complexity, measured by detection accuracy and training time, compared to other self-supervised GAD methods. Multilayer neural networks…
Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: What is the impact of varying the ratio of labeled to unlabeled anomalies on the detection accuracy of GADT3 when applied to multimodal graph data. Detecting anomalies in data is a vital task, with numerous…
Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: What is the computational efficiency and scalability of Mul-GAD compared to other state-of-the-art GNN-based anomaly detection models when applied to large-scale graph datasets like Reddit and Twitter. Anomaly…
Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: How does the inference efficiency of GADT3 compare to other test-time training frameworks in cross-domain graph anomaly detection across different graph densities and sizes. Anomaly detection is defined as…
Abstract: This report synthesises findings from 11 peer-reviewed papers addressing the following research question: How robust is Mul-GAD to noisy or incomplete graph data compared to other cross-domain graph anomaly detection models when evaluated on perturbed versions of the Reddit and Twitter benchmarks. Graph Anomaly…
Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: What is the trade-off between visual fidelity and inference efficiency when quantizing LLaVA-UHD with INT4/INT8 compared to FP16, as measured by SEED-Bench scores and memory footprint reduction. Principal…
Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: What is the impact of model scaling on quantization noise sensitivity for vision-language models during inference, measured through CLIP and ALIGN benchmark performance. Contrastive language-image pretraining…
Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: How does the performance of Mul-GAD compare to other semi-supervised graph anomaly detection models on the Reddit and Twitter datasets in terms of precision, recall, and F1-score. Anomaly detection is defined as…
Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: How does the inference throughput of LLaVA-UHD compare to LLaVA-1.5-7B and LLaVA-1.5-13B when processing 4K images on MMBench, and how does this scalability impact latency per token in. Visual encoding constitutes…
Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: How do different LLaVA model versions compare in terms of quantization-aware training effectiveness on standard multimodal reasoning benchmarks like VQA and GQA. Recent advances in multimodal vision-language…
Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: To what extent does few-shot prompting enable Llama3 to match the RMSE of domain-specific transformers like Temporal Fusion Transformers on unseen renewable energy datasets. Short-term load forecasting (STLF) is…