Publications

(*) denotes equal contribution

2026

  1. Why is A+B Better Than B? A Simple Graph Perspective on Task Transfer
    Dang Nguyen*, Jianhao Huang*, Ali Payani, and 1 more author
    1st Workshop on Foundations of Deep Generative Models (FoGen) at ICML 2026
  2. Data Selection for Fine-tuning Vision Language Models via Cross Modal Alignment Trajectories
    Nilay Naharas*, Dang Nguyen*, Neslihan Bulut, and 3 more authors
    3rd Workshop on Navigating and Addressing Data Problems for Foundation Models at ICLR 2026
    International Conference on Machine Learning (ICML), 2026
  3. Do We Need All the Synthetic Data? Towards Targeted Synthetic Image Augmentation via Diffusion Models
    Dang Nguyen*, Jiping Li*, Jinghao Zheng*, and 1 more author
    arXiv preprint arXiv:2505.21574, 2026

2025

  1. Beyond Semantic Entropy: Boosting LLM Uncertainty Quantification with Pairwise Semantic Similarity
    Dang Nguyen, Ali Payani, and Baharan Mirzasoleiman
    In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025
  2. Synthetic Text Generation for Training Large Language Models via Gradient Matching
    Dang Nguyen*, Zeman Li*, Mohammadhossein Bateni, and 3 more authors
    International Conference on Machine Learning (ICML), 2025
  3. Mini-batch Coresets for Memory-efficient Language Model Training on Data Mixtures
    Dang Nguyen, Wenhan Yang, Rathul Anand, and 2 more authors
    International Conference on Learning Representations (ICLR), 2025

2024

  1. Changing the Training Data Distribution to Reduce Simplicity Bias Improves In-distribution Generalization
    Dang Nguyen, Paymon Haddad, Eric Gan, and 1 more author
    Advances in Neural Information Processing Systems (NeurIPS), 2024
  2. Understanding the Robustness of Multi-modal Contrastive Learning to Distribution Shift
    Yihao Xue, Siddharth Joshi, Dang Nguyen, and 1 more author
    Data-centric Machine Learning Research (DMLR) Workshop at ICLR 2024
    International Conference on Learning Representations (ICLR), 2024

2023

  1. Self-Attention Amortized Distributional Projection Optimization for Sliced Wasserstein Point-Cloud Reconstruction
    Khai Nguyen*, Dang Nguyen*, and Nhat Ho
    International Conference on Machine Learning (ICML), 2023
  2. On Cross-Layer Alignment for Model Fusion of Heterogeneous Neural Networks
    Dang Nguyen, Trang Nguyen, Khai Nguyen, and 3 more authors
    IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023 (Top 3%)

2022

  1. Improving Mini-batch Optimal Transport via Partial Transportation
    Khai Nguyen*, Dang Nguyen*, The-Anh Vu-Le, and 2 more authors
    International Conference on Machine Learning (ICML), 2022
  2. On Transportation of Mini-batches: A Hierarchical Approach
    Khai Nguyen, Dang Nguyen, Quoc Nguyen, and 5 more authors
    International Conference on Machine Learning (ICML), 2022