
Liyuan Liu

Principal Researcher @ MSR

Hi there!

Welcome to Liyuan Lucas Liu's (刘力源) webpage! I am a Principal Researcher at Microsoft Research. My Ph.D. advisor was Prof. Jiawei Han, and my undergraduate advisor was Prof. Linli Xu. My research aims to understand the underlying mechanisms of pretraining heuristics.

If you are going to visit Redmond, please let me buy you an ice cream. Molly Moon and Salt and Straw are pretty good.

Experience

  • Principal Researcher

    MSR, 2025 → Present

  • Senior Researcher

    MSR, 2022 → 2025

Education

  • Ph.D. in Computer Science

    University of Illinois at Urbana-Champaign

  • B.Eng. in Computer Science

    University of Science and Technology of China

Things I do

... and want to do

I mainly work on understanding the mechanisms behind common heuristics in machine learning (a.k.a. tricks).


Fun Facts about Me

Love skiing & met my amazing wife during a ski trip; DJI fan; play Sheng Ji (升级), Ark Nova, and Mafia (狼人杀) with family & friends.

Selected Publications

List of all publications »

(2024). Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs. Proceedings of the Twelfth International Conference on Learning Representations (ICLR 2024). Outstanding Paper Honorable Mention.

PDF

(2023). Bridging Discrete and Backpropagation: Straight-Through and Beyond. Proceedings of the Thirty-seventh Annual Conference on Neural Information Processing Systems (NeurIPS 2023). Selected as Oral.

PDF

(2022). Label Noise in Adversarial Training: A Novel Perspective to Study Robust Overfitting. Proceedings of the Thirty-sixth Annual Conference on Neural Information Processing Systems (NeurIPS 2022). Selected as Oral.

PDF

(2020). Understanding the Difficulty of Training Transformers. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020). Selected as Oral.

PDF Code Slide

(2020). On the Variance of the Adaptive Learning Rate and Beyond. Proceedings of the Eighth International Conference on Learning Representations (ICLR 2020).

PDF Code

(2018). Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018). Selected as Oral.

PDF Code Blog Doc

(2018). Empower Sequence Labeling with Task-Aware Neural Language Model. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI 2018).

PDF Code (new) Code (old) Blog

Highlighted Honors

List of all honors »

Outstanding Paper Honorable Mention

Shortlisted as 1 of 16 papers among 4922 submissions.

Winner of the Topcoder Arabic NER Challenge

Ranked 1st among 137 registrants and 220 submissions.

Guo Moruo Scholarship

Highest honor for USTC undergraduate students.

Google Excellence Scholarship

Only 58 graduate and undergraduate students were shortlisted nationwide.

Contact

… stay in touch!