Dang Nguyen

Hi, I’m a CS Ph.D. student at UCLA under the supervision of Professor Baharan Mirzasoleiman. My research focuses on improving data quality to enhance the performance and efficiency of large (vision-)language models. Specifically, I work on synthetic data generation and data selection to optimize training, making these models more effective and accessible.
Before joining UCLA, I was an AI Resident at VinAI. Prior to that, I received my BS degree, summa cum laude, from Toyo University. Going further back, I graduated from the High School for Gifted Students (Hanoi University of Science) and was a Maths Olympian (silver medal, IMO 2015).
news
Date | News |
---|---|
Jan 22, 2025 | Our paper Mini-batch Coresets for Memory-efficient Language Model Training on Data Mixtures was accepted to ICLR 2025. |
Sep 25, 2024 | Our paper Changing the Training Data Distribution to Reduce Simplicity Bias Improves In-distribution Generalization was accepted to NeurIPS 2024. |
Sep 23, 2024 | I joined Google Research as a Student Researcher. |
Jun 17, 2024 | I joined Cisco as a Ph.D. research intern. |
Mar 03, 2024 | Our paper Understanding the Robustness of Multi-modal Contrastive Learning to Distribution Shift was accepted to DMLR @ ICLR 2024. |