PAIR Lab: PKU Alignment and Interaction Research Lab
Off-Agent Trust Region Policy Optimization
Leveraging the experiences of other agents offers a powerful mechanism to enhance policy optimization in multi-agent reinforcement …
Ruiqing Chen, Xiaoyuan Zhang, Yali Du, Yifan Zhong, Zheng Tian, Fanglei Sun, Yaodong Yang
ProgressGym: Alignment with a Millennium of Moral Progress
Frontier AI systems, including large language models (LLMs), hold increasing influence over the epistemology of human users. Such …
Tianyi Qiu, Yang Zhang, Xuchuan Huang, Jasmine Xinze Li, Jiaming Ji, Yaodong Yang
Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning
Despite the recent successes of multi-agent reinforcement learning (MARL) algorithms, efficiently adapting to co-players in …
Yizhe Huang, Anji Liu, Fanqi Kong, Yaodong Yang, Song-Chun Zhu, Xue Feng
In-Context Editing: Learning Knowledge from Self-Induced Distributions
In scenarios where language models must incorporate new information efficiently without extensive retraining, traditional fine-tuning …
Siyuan Qi, Bangcheng Yang, Kailin Jiang, Xiaobo Wang, Jiaqi Li, Yifan Zhong, Yaodong Yang, Zilong Zheng
Language models resist alignment: Evidence from data compression
Large language models (LLMs) may exhibit unintended or undesirable behaviors. Recent works have concentrated on aligning LLMs to …
Jiaming Ji, Kaile Wang, Tianyi Qiu, Boyuan Chen, Jiayi Zhou, Changye Li, Hantao Lou, Juntao Dai, Yunhuai Liu, Yaodong Yang
Remember the Past for Better Future: Memory-Augmented Offline RL
As a foundation of human intelligence, memory has been found to be critical for human attention and decision making. However, it is …
Yue Zhang, Yaodong Yang, Zhenbo Lu, Wengang Zhou, Houqiang Li
AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents
Traditional approaches in physics-based motion generation centered around imitation learning and reward shaping often struggle to adapt …
Jieming Cui, Tengyu Liu, Nian Liu, Yaodong Yang, Yixin Zhu, Siyuan Huang
End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations
Neuro-symbolic reinforcement learning (NS-RL) has emerged as a promising paradigm for explainable decision-making, characterized by the …
Lirui Luo, Guoxi Zhang, Hongming Xu, Yaodong Yang, Cong Fang, Qing Li
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
Offline-to-online Reinforcement Learning (O2O RL) aims to improve the performance of offline pretrained policy using only a few online …
Yinmin Zhang, Jie Liu, Chuming Li, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang
Grasp Multiple Objects with One Hand
The intricate kinematics of the human hand enable simultaneous grasping and manipulation of multiple objects, essential for tasks, such …
Yuyang Li, Bo Liu, Yiran Geng, Puhao Li, Yaodong Yang, Yixin Zhu, Tengyu Liu, Siyuan Huang