Yiming Yang
Professor, Language Technologies Institute
Machine Learning
- 6717 Gates & Hillman Centers
- 412-268-1364
Yiming Yang is a Professor at Carnegie Mellon University’s Language Technologies Institute (LTI) and Machine Learning Department. Her research focuses on advancing machine learning (ML), natural language processing (NLP), and AI for Science (AI4Science), with expertise in enhancing large language models (LLMs) through advanced training, self-alignment, self-refinement, and inference-time reasoning.
Dr. Yang has made significant contributions to transformer-based LLMs, neural network architecture optimization, knowledge-enhanced graph neural networks, and extreme multi-label text classification, among other areas. Her work appears regularly at top conferences such as ICML, NeurIPS, and ICLR; influential publications include:
- "XLNet: Generalized Autoregressive Pretraining for Language Understanding"
- "DARTS: Differentiable Architecture Search"
- "Principle-Driven Self-Alignment of Language Models"
- "Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision"
- "DIFUSCO: Graph-based Diffusion Solvers for Combinatorial Optimization"
Dr. Yang earned her Ph.D. in Computer Science from Kyoto University and joined Carnegie Mellon University in 1986. She has held academic and research leadership positions, including work in medical informatics at the Mayo Clinic and major research initiatives funded by NSF, DARPA, and IARPA. Her groundbreaking contributions have earned her numerous accolades, including Best Paper Awards at KDIR, SIGKDD, and IJCAI, as well as induction into the ACM SIGIR Academy in 2023.
With over 73,000 citations and an h-index of 81 (Google Scholar, January 2025), Dr. Yang remains a leading figure in machine learning and information retrieval. At CMU, she teaches "Machine Learning with Graphs" and actively mentors graduate students and postdoctoral researchers, guiding them to prestigious achievements such as the Google Fellowship. She also serves as an area chair for top ML conferences (NeurIPS, ICLR, ICML) and plays an integral role in CMU’s admissions and hiring committees.
Dr. Yang’s research continues to bridge the gap between cutting-edge AI advancements and real-world applications, driving innovation in both theoretical and applied machine learning.
Research Statement
My research has centered on statistical learning methods/algorithms and their impactful applications.
Most Recent Projects/Advances:
- Enhancing Large Language Models (LLMs) via self-alignment (NeurIPS 2023, NeurIPS 2024, ICLR 2024), self-correction (EMNLP 2024), self-play (ICLR 2025), inference scaling laws and compute-optimal inference (ICLR 2025), and Transformer enhancements (AISTATS 2024, ICLR 2024); a toy best-of-n sketch appears after this list
- LLM-based Mathematical Reasoning with Easy-to-Hard Generalization (NeurIPS 2024), Step-by-Step Reasoning via Twisted Sequential Monte Carlo (ICLR 2025), and Learning to Interleave Thinking and Proving (ICLR 2025)
- Multi-modal Alignment across text, images, and video (ACL 2024, NAACL 2025)
- ML-based Combinatorial Optimization & AI for Code Generation
  - DIMES (Differentiable Meta Solver) (NeurIPS 2022), a non-autoregressive RL model that scaled RL-based CO solvers from 100-node graphs to 10,000-node graphs with SOTA performance on benchmarks at the time.
  - DIFUSCO (NeurIPS 2023), the first graph-based diffusion solver for NP-hard problems, which outperformed DIMES in both scalability and accuracy; a heatmap-decoding sketch follows this list.
  - Neural Solvers for PDEs (Partial Differential Equations) and Molecular Design (ICML 2023, ICLR 2025)
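On the inference-scaling line of work above: one simple strategy such studies analyze is best-of-n sampling, where a fixed compute budget buys n independent decoding passes and a verifier keeps the best one. The following is a minimal, purely illustrative sketch of that idea; `sample_answer` and its simulated verifier score are hypothetical stand-ins, not models or results from the papers.

```python
import random

random.seed(0)

def sample_answer(problem: str) -> tuple[str, float]:
    """Stand-in for one stochastic LLM decoding pass.

    Returns a candidate answer plus a scalar score from a hypothetical
    verifier/reward model; both are simulated here with random noise.
    """
    score = random.gauss(0.0, 1.0)
    return f"candidate(score={score:.2f})", score

def best_of_n(problem: str, n: int) -> str:
    """Spend a budget of n samples and keep the best-scoring candidate."""
    candidates = [sample_answer(problem) for _ in range(n)]
    best_answer, _ = max(candidates, key=lambda c: c[1])
    return best_answer

# Sweeping the sample budget n traces out an inference scaling curve:
# the quality of the selected answer grows with compute spent.
for n in (1, 4, 16, 64):
    print(n, best_of_n("prove the claim", n))
```

The compute-optimal question is then which n, or which mix of strategies (more samples versus longer reasoning chains), yields the most accuracy per unit of inference compute.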
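On the combinatorial-optimization line: both DIMES and DIFUSCO predict scores for all edges of a graph in a single non-autoregressive pass (an edge "heatmap") and then decode a solution from those scores, which is what lets them scale to large instances. Below is a minimal sketch of just the decoding step on a toy TSP instance, assuming an inverse-distance `heatmap` as a stand-in for a trained network (RL-trained in DIMES, diffusion-denoised in DIFUSCO); `greedy_decode` is a generic greedy decoder, not the papers' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy TSP instance: random 2-D cities and their pairwise distances.
n = 10
cities = rng.random((n, 2))
dist = np.linalg.norm(cities[:, None, :] - cities[None, :, :], axis=-1)

# Hypothetical edge "heatmap" (higher = edge more likely in a good tour).
# A trained model would produce this; inverse distance is a crude proxy.
heatmap = 1.0 / (dist + 1e-6)
np.fill_diagonal(heatmap, -np.inf)

def greedy_decode(heatmap: np.ndarray) -> list[int]:
    """Decode a tour by repeatedly following the highest-scoring edge.

    The heatmap is produced in one parallel (non-autoregressive) pass,
    so this cheap decoding loop is the only sequential part.
    """
    n = heatmap.shape[0]
    tour, visited = [0], {0}
    while len(tour) < n:
        scores = heatmap[tour[-1]].copy()
        scores[list(visited)] = -np.inf  # never revisit a city
        nxt = int(np.argmax(scores))
        tour.append(nxt)
        visited.add(nxt)
    return tour

tour = greedy_decode(heatmap)
length = sum(dist[tour[i], tour[(i + 1) % n]] for i in range(n))
print(tour, f"tour length = {length:.3f}")
```

Greedy decoding is the cheapest option; in practice such heatmaps are typically paired with stronger decoders (sampling or local search) when solution quality matters more than speed.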