Papers
Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: How does the integration of W.A.L.T's causal encoder design with Flamingo's visual tokenizer impact inference latency and downstream video captioning performance on ActivityNet when compared to. Video description…
Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: What is the quantitative trade-off between NDCG@10 recommendation accuracy and RLHF alignment scores when jointly modeling short-term and long-term user preferences using instruction-tuned LLMs. The…
Abstract: This report synthesises findings from 4 peer-reviewed papers addressing the following research question: To what extent does scaling the number of Indonesian video-text training samples in MSVD-Indonesian affect the zero-shot cross-lingual transfer performance of Flamingo on non-Indonesian video. Multimodal learning…
Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: What metrics (e.g., BLEU, CIDEr, METEOR) demonstrate the robustness of Indonesian video-text models like MSVD-Indonesian when fine-tuned with PaLI versus Flamingo on MSRVTT, and how does this compare. While…
Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: What is the degradation in out-of-distribution robustness for video encoders pretrained on synthetic datasets when evaluated on diverse human motion benchmarks. Deep convolutional neural networks have performed…
Abstract: This report synthesises findings from 13 peer-reviewed papers addressing the following research question: What is the inference efficiency-latency trade-off between Spatio-Temporal Graph Convolutional Networks and modern graph diffusion models for real-time traffic forecasting. Long-term traffic prediction is highly…
Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: What is the trade-off between inference latency and code generation accuracy when applying 4-bit quantization to transformer-based models on the MBPP dataset. The rapid evolution of large language models…
Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How do graph diffusion models scale in parameter count and prediction accuracy compared to STGCN when applied to large-scale multimodal traffic datasets. Timely accurate traffic forecast is crucial for urban…
Abstract: This report synthesises findings from 9 peer-reviewed papers addressing the following research question: To what extent does Mul-GAD's semi-supervised approach improve anomaly detection accuracy over fully unsupervised GNN models like OCSVM-GNN on cross-domain datasets such as Amazon and DBLP, using. Machine learning…
Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: How does the inference latency of GADT3 compare to traditional GCN-based models under varying degrees of adversarial graph structure perturbations, measured using the OGB-LSC traffic prediction. Cyberattacks…
Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: How does the adversarial robustness of graph diffusion models compare to STGCN under targeted node feature perturbations measured by AUC-ROC on traffic datasets. Traffic forecasting plays a critical role in…
Abstract: This report synthesises findings from 5 peer-reviewed papers addressing the following research question: To what extent can multimodal knowledge distillation from code-text pairs improve the robustness of small language models in code generation tasks, as measured by pass@k and latency metrics on. Recently, ChatGPT,…
Abstract: This report synthesises findings from 12 peer-reviewed papers addressing the following research question: How does the trade-off between model size and inference efficiency vary when distilling code generation capabilities from large language models to smaller models, as measured by latency and pass@k. The…
Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: What is the impact of different GNN architectures (e.g., GCN, GAT, GraphSAGE) on the cross-domain generalization capability of GADT3 in graph anomaly detection tasks, as measured by accuracy and. In order to use…
Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: How robust is the Mul-GAD framework to adversarial attacks on graph structures, and how does its robustness compare to other test-time training frameworks in terms of anomaly detection accuracy and. Machine…
Abstract: This report synthesises findings from 3 peer-reviewed papers addressing the following research question: How does INT4 quantization of LLaVA-UHD affect its performance on SEED-Bench compared to FP16 precision across different visual reasoning subtasks. In the past years, multimodal large language models…
Abstract: This report synthesises findings from 7 peer-reviewed papers addressing the following research question: What is the impact of quantization-aware training on the inference latency and memory requirements of LLaVA-UHD when deployed on edge devices. Large foundation models, including large language models (LLMs),…
Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: How robust is Mul-GAD's performance against adversarial attacks on graph structures compared to models like GAS and GCN-AE, as measured by anomaly detection accuracy on perturbed versions of the. Anomaly detection…
Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: What is the impact of feature dimensionality reduction on GADT3's cross-domain anomaly detection performance on the ACM and DBLP graph benchmarks. Deep convolutional neural networks have performed remarkably well…
Abstract: This report synthesises findings from 6 peer-reviewed papers addressing the following research question: To what extent does domain adaptation in CLIP-TD improve cross-domain robustness compared to standard CLIP, as measured by accuracy on ImageNet-to-Sketchy and ImageNet-to-ClipArt domain adaptation. Multi-Task…
Abstract: This report synthesises findings from 7 peer-reviewed papers addressing the following research question: How does GADT3's homophily-guided self-supervision approach scale to billion-parameter LLMs on the Reddit and Twitter perturbed graph datasets. In the last few years, the deep learning (DL) computing paradigm has…
Abstract: This report synthesises findings from 8 peer-reviewed papers addressing the following research question: How does GADT3's test-time training framework compare to supervised GAD baselines in detecting anomalies on the Amazon and Yelp datasets when 20% of node features are randomly masked. Cyber-attacks are becoming…
Abstract: This report synthesises findings from 15 peer-reviewed papers addressing the following research question: What is the impact of model distillation techniques on inference efficiency in CLIP-based vision-language models, measured by throughput and accuracy trade-offs on Flickr30k and MSCOCO benchmarks. The…
Abstract: This report synthesises findings from 14 peer-reviewed papers addressing the following research question: How does the performance of CLIP-TD compare to ALIGN in low-shot settings when evaluated on VQA and COCO text-to-image retrieval benchmarks. Natural Language Processing (NLP) is one of the most captivating…
Abstract: This report synthesises findings from 10 peer-reviewed papers addressing the following research question: How does the performance of Mul-GAD scale with increasing graph size and sparsity compared to other GNN-based semi-supervised anomaly detection models like GANomaly and DeepSVM when evaluated on. With a long…