Liyuan Liu

Senior Researcher @ MSR

Hi there!

Welcome to Liyuan Lucas Liu (刘力源)‘s webpage! I am a Senior Researcher at Microsoft Research. My Ph.D. advisor is Prof. Jiawei Han, and my undergraduate advisor is Prof. Linli Xu. My research is about to understand the underlying mechanism of pretraining heuristics.

If you are going to visit Redmond, please let me buy you a bubble tea.

Interests

Pretraining Heuristics
Training Stability & Dynamics
Structures in Deep Learning

Education

Ph.D. in Computer Science, 2024

University of Illinois at Urbana-Champaign
B.Eng. in Computer Science, 2016

University of Science and Technology of China

Things I do

... and want to do

The success of large-scale pretraining hinges on intricate engineering heuristics. While the empirical benefits of these heuristics are evident, their underlying mechanisms remain elusive. My research endeavors to demystify the mathematical principles underlying these pretraining heuristics, aiming to illuminate their mechanisms and potentially guide future algorithm developments.

Fun Facts

Received more than 3,000 GitHub stars in total! It is ranked 2,041 among all github users (according to Gitstar).
- Although I doubt the Gitstar ranking is incomplete and outdated , it is still nice to have such an encouragement .
Torch-Scope has been downloaded by more than 32,000 times.
- Although this number cannot reflect the actual user number , there must be someone other than myself using this package .
Won the topcoder Arabic NER challenge.
- Since I know nothing about Arabic, my model has surely surpassed me on that . On second thought, why I am happy for being worse than PC…
Love skiing (can do black run, but still a rookie); DJI fans (proudly own Mavic Pro, Mavic Mini & Spark); love to watch Texas Hold’em (but seldom play); play Sheng Ji (双扣) and Mafia (狼人杀) with families & friends.

Selected Publications

List of all publications >>

Suyu Ge, Yunan Zhang, Liyuan Liu , Minjia Zhang, Jiawei Han, and Jianfeng Gao (2023). Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs. Proceedings of the Twelfth International Conference on Learning Representations (ICLR 2024). Selected as Oral.

PDF

Liyuan Liu , Chengyu Dong, Xiaodong Liu, Bin Yu, and Jianfeng Gao (2023). Bridging Discrete and Backpropagation: Straight-Through and Beyond. Proceedings of the Proceeding of Thirty-seventh Annual Conference on Neural Information Processing Systems (NeurIPS 2023). Selected as Oral.

PDF

Chengyu Dong, Liyuan Liu , and Jingbo Shang (2022). Label Noise in Adversarial Training: A Novel Perspective to Study Robust Overfitting. Proceedings of the Thirty-sixth Annual Conference on Neural Information Processing Systems (NeurIPS 2022). Selected as Oral.

PDF

Liyuan Liu , Xiaodong Liu, Jianfeng Gao, Weizhu Chen, and Jiawei Han (2020). Understanding the Difficulty of Training Transformers. the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020). Selected as Oral.

PDF Code Slide

Liyuan Liu , Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, and Jiawei Han (2020). On the Variance of the Adaptive Learning Rate and Beyond. the Eighth International Conference on Learning Representations (ICLR 2020).

PDF Code

Liyuan Liu , Xiang Ren, Jingbo Shang, Jian Peng, and Jiawei Han (2018). Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling. the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018). Selected as Oral.

PDF Code Blog Doc

Liyuan Liu , Jingbo Shang, Xiang Ren, Frank Fangzheng Xu, Huan Gui, Jian Peng, and Jiawei Han (2018). Empower Sequence Labeling with Task-Aware Neural Language Model. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI 2018). Selected as Oral.

PDF Code (new) Code (old) Blog