About me
I am currently a final-year Ph.D. student at Tsinghua Shenzhen International Graduate School, Tsinghua University, supervised by Prof. Xiu Li. I serve as group leader of the Reinforcement Learning Group of the Intelligent Computing Lab (ICLAB). I received my bachelor’s degree from the Department of Engineering Physics, Tsinghua University, in 2020.
I am fortunate to work closely with Prof. Zongqing Lu from Peking University. I am currently an intern at the Game AI Research Center, Tencent IEG (Interactive Entertainment Group), supervised by Mr. Le Wan and Mr. Jingwen Yang. Before that, I was an intern at Pengcheng Lab (PCL), supervised by Prof. Zongqing Lu.
My research interests lie in efficient decision-making with deep reinforcement learning (RL), including offline RL, sample-efficient general online model-free RL, and model-based RL. I am also interested in deploying RL algorithms in real-world applications such as robotics and large language models.
Please feel free to e-mail me if you are interested in collaborating: lvjf20[AT]mails.tsinghua.edu.cn
Publications
Preprints
- A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning.
Shengjie Sun, Runze Liu, Jiafei Lyu, Jingwen Yang, Liangpeng Zhang, Xiu Li.
- SUMO: Search-Based Uncertainty Estimation for Model-Based Offline Reinforcement Learning.
Zhongjian Qiao, Jiafei Lyu, Kechen Jiao, Qi Liu, Xiu Li.
- World Models with Hints of Large Language Models for Goal Achieving.
Zeyuan Liu, Ziyu Huan, Xiyao Wang, Jiafei Lyu, Jian Tao, Xiu Li, Furong Huang, Huazhe Xu.
- Bias-reduced Multi-step Hindsight Experience Replay for Efficient Multi-goal Reinforcement Learning.
Rui Yang, Jiafei Lyu, Yu Yang, Jiangpeng Yan, Feng Luo, Dijun Luo, Lanqing Li, Xiu Li.
Conference Papers
- ODRL: A Benchmark for Off-Dynamics Reinforcement Learning.
Jiafei Lyu, Kang Xu, Jiacheng Xu, Mengbei Yan, Jingwen Yang, Zongzhang Zhang, Chenjia Bai, Zongqing Lu, Xiu Li.
Advances in Neural Information Processing Systems (NeurIPS) (Datasets and Benchmarks Track), 2024.
- Mind the Model, Not the Agent: The Primacy Bias in Model-based RL.
Zhongjian Qiao, Jiafei Lyu, Xiu Li.
European Conference on Artificial Intelligence (ECAI), 2024.
- Cross-Domain Policy Adaptation by Capturing Representation Mismatch.
Jiafei Lyu, Chenjia Bai, Jingwen Yang, Xiu Li, Zongqing Lu.
International Conference on Machine Learning (ICML), 2024.
- Exploration and Anti-Exploration with Distributional Random Network Distillation.
Kai Yang, Jian Tao, Jiafei Lyu, Xiu Li.
International Conference on Machine Learning (ICML), 2024.
- PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation.
Runze Liu, Yali Du, Fengshuo Bai, Jiafei Lyu, Xiu Li.
International Conference on Machine Learning (ICML), 2024.
- Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model.
Kai Yang, Jian Tao, Jiafei Lyu, Chunjiang Ge, Jiaxin Chen, Qimai Li, Weihan Shen, Xiaolong Zhu, Xiu Li.
IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 2024.
- SEABO: A Simple Search-Based Method for Offline Imitation Learning.
Jiafei Lyu, Xiaoteng Ma, Le Wan, Runze Liu, Xiu Li, Zongqing Lu.
International Conference on Learning Representations (ICLR), 2024.
- Towards Understanding How to Reduce Generalization Gap in Visual Reinforcement Learning.
Jiafei Lyu, Le Wan, Xiu Li, Zongqing Lu.
International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2024. (Extended Abstract)
- Normalization Enhances Generalization in Visual Reinforcement Learning.
Lu Li*, Jiafei Lyu*, Guozheng Ma, Zilin Wang, Zhenjie Yang, Xiu Li, Zhiheng Li.
International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2024. (Oral) Generalization in Planning Workshop at NeurIPS, 2023.
- Uncertainty-driven Trajectory Truncation for Data Augmentation in Offline Reinforcement Learning.
Junjie Zhang*, Jiafei Lyu*, Xiaoteng Ma, Jiangpeng Yan, Jun Yang, Le Wan, Xiu Li.
European Conference on Artificial Intelligence (ECAI), 2023. (Oral)
- PRAG: Periodic Regularized Action Gradient for Efficient Continuous Control.
Xihui Li, Zhongjian Qiao, Aicheng Gong, Jiafei Lyu, Chenghui Yu, Jiangpeng Yan, Xiu Li.
Pacific Rim International Conference on Artificial Intelligence (PRICAI), 2022.
- Mildly Conservative Q-learning for Offline Reinforcement Learning.
Jiafei Lyu*, Xiaoteng Ma*, Xiu Li, Zongqing Lu.
Advances in Neural Information Processing Systems (NeurIPS), 2022. (Spotlight)
- Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination.
Jiafei Lyu, Xiu Li, Zongqing Lu.
Advances in Neural Information Processing Systems (NeurIPS), 2022. (Spotlight)
- Efficient Continuous Control with Double Actors and Regularized Critics.
Jiafei Lyu*, Xiaoteng Ma*, Jiangpeng Yan, Xiu Li.
AAAI Conference on Artificial Intelligence (AAAI), 2022. (Oral)
Journal Papers
- Enhancing Visual Reinforcement Learning with State-Action Representation.
Mengbei Yan*, Jiafei Lyu*, Xiu Li.
Knowledge-Based Systems, 2024. (IF=7.1)
- Understanding What Affects Generalization Gap in Visual Reinforcement Learning: Theory and Empirical Evidence.
Jiafei Lyu, Le Wan, Xiu Li, Zongqing Lu.
Journal of Artificial Intelligence Research, 2024. (IF=4.5)
- A Two-stage Reinforcement Learning-based Approach for Multi-entity Task Allocation.
Aicheng Gong, Kai Yang, Jiafei Lyu, Xiu Li.
Engineering Applications of Artificial Intelligence, 2024. (IF=7.5)
- Off-Policy RL Algorithms Can be Sample-Efficient for Continuous Control via Sample Multiple Reuse.
Jiafei Lyu, Le Wan, Zongqing Lu, Xiu Li.
Information Sciences, 2024. (IF=8.1)
- Value Activation for Bias Alleviation: Generalized-activated Deep Double Deterministic Policy Gradients.
Jiafei Lyu*, Yu Yang*, Jiangpeng Yan, Xiu Li.
Neurocomputing, 2023. (IF=6.1)
Workshop Papers
- Zero-shot Preference Learning for Offline RL via Optimal Transport.
Runze Liu, Yali Du, Fengshuo Bai, Jiafei Lyu, Xiu Li.
Optimal Transport and Machine Learning Workshop at NeurIPS, 2023.
- State Advantage Weighting for Offline RL.
Jiafei Lyu, Aicheng Gong, Le Wan, Zongqing Lu, Xiu Li.
ICLR 2023 tiny paper, 3rd Offline RL Workshop: Offline RL as a “Launchpad” at NeurIPS, 2022.
Collaborators
- Xiu Li: Professor, Tsinghua Shenzhen International Graduate School, Tsinghua University
- Zongqing Lu: BOYA Associate Professor, School of Computer Science, Peking University
- Xiaoteng Ma: Postdoctoral researcher, Department of Automation, Tsinghua University
- Jiangpeng Yan: Top minds, Huawei (Ph.D. alumni of Department of Automation, Tsinghua University)
- Rui Yang: Ph.D. student, University of Illinois Urbana-Champaign
Honors
- 2018.10 Academic Excellence Award of Tsinghua University
- 2021.10 Outstanding Scholarship of Tsinghua University
- 2022.10 Outstanding Scholarship of Tsinghua University
- 2022.08 Top 10% Reviewer for ICML 2022
- 2023.07 Recognition Award of 2022 Tencent Rhino-Bird Research Elite Program
- 2023.10 Outstanding Scholarship of Tsinghua University
- 2024.06 Best Reviewer Award of ICML 2024
- 2024.10 National Scholarship of Tsinghua University
Education
- 2020 - present, Ph.D., Tsinghua Shenzhen International Graduate School, Tsinghua University
- 2017 - 2020, Minor in Statistics, Center for Statistical Science, Tsinghua University
- 2016 - 2020, Bachelor, Department of Engineering Physics, Tsinghua University
Internships
Tencent IEG, Game AI Research Center (2023.10 - present)
Researching offline RL and transfer RL
Tencent IEG, Game AI Research Center (2022.06 - 2023.10)
Researched offline RL, sample-efficient online RL, and offline-to-online RL
Pengcheng Lab (2021.10 - 2022.04)
Researched offline RL
Teaching
- Machine Learning by Prof. Xuegong Zhang, Autumn 2020. (Teaching Assistant)
- Frontier of AI Technology and Industrial Application by Prof. Xiu Li, Autumn 2021. (Teaching Assistant)
Services
- Conference Reviewer: ICML (2022, 2023, 2024), NeurIPS (2022, 2023, 2024), AAAI (2022, 2023, 2024, 2025), ECAI (2023), ICLR (2024, 2025), RLC (2024), AAMAS (2025), AISTATS (2025)
- Journal Reviewer: TMLR, TAI, CAAI, TNNLS, RA-L