Dang Nguyen

Hi, I’m a CS Ph.D. candidate at UCLA under the supervision of Professor Baharan Mirzasoleiman. My research focuses on data-centric methods for building efficient and reliable agentic AI systems based on large (vision-)language models. I develop techniques for synthetic data generation and data selection to construct high-quality training signals under limited or noisy data. More recently, I have explored reasoning and decision-making, including test-time scaling, RL training, and uncertainty-aware inference, with the goal of enabling reliable multi-step agent behavior in high-stakes settings.

Before joining UCLA, I was an AI Resident at VinAI (now Qualcomm AI). Prior to that, I received my BS degree, summa cum laude, from Toyo University. Going further back in time, I graduated from the High School for Gifted Students (Hanoi University of Science) and was a Maths Olympian (IMO 2015 Silver).

news

Apr 30, 2026 Our paper Data Selection for Fine-tuning Vision Language Models via Cross Modal Alignment Trajectories is accepted to ICML 2026.
Jan 26, 2026 Our paper Do We Need All the Synthetic Data? Targeted Image Augmentation via Diffusion Models is accepted to ICLR 2026.
Jun 23, 2025 I have officially advanced to Ph.D. candidacy! Looking forward to the next stage of my research journey.
May 15, 2025 Our paper Beyond Semantic Entropy: Boosting LLM Uncertainty Quantification with Pairwise Semantic Similarity is accepted to ACL Findings 2025.
May 01, 2025 Our paper Synthetic Text Generation for Training Large Language Models via Gradient Matching is accepted to ICML 2025.