PAIR Lab: PKU Alignment and Interaction Research Lab
Reinforcement Learning
ReDMan: Reliable Dexterous Manipulation with Safe Reinforcement Learning
Dexterous hand manipulation is a crucial ability for robots in various applications. However, ensuring safety and reliability during …
Yiran Geng, Jiaming Ji, Yuanpei Chen, Haoran Geng, Fangwei Zhong, Yaodong Yang
Remember the Past for Better Future: Memory-Augmented Offline RL
As a foundation of human intelligence, memory has been found to be critical for human attention and decision making. However, it is …
Yue Zhang, Yaodong Yang, Zhenbo Lu, Wengang Zhou, Houqiang Li
Grasp Multiple Objects with One Hand
The intricate kinematics of the human hand enable simultaneous grasping and manipulation of multiple objects, essential for tasks, such …
Yuyang Li, Bo Liu, Yiran Geng, Puhao Li, Yaodong Yang, Yixin Zhu, Tengyu Liu, Siyuan Huang
CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents
The generalization of decision-making agents encompasses two fundamental elements: learning from past experiences and reasoning in …
Siyuan Qi, Shuo Chen, Yexin Li, Xiangyu Kong, Junqi Wang, Bangcheng Yang, Pring Wong, Yifan Zhong, Xiaoyuan Zhang, Zhaowei Zhan, Nian Liu, Wei Wang, Yaodong Yang, Song-Chun Zhu
GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models
This paper introduces a distributed, GPU-centric experience replay system, GEAR, designed to perform scalable reinforcement learning …
Hanjing Wang, Man-Kit Sit, Congjie He, Ying Wen, Weinan Zhang, Jun Wang, Yaodong Yang, Luo Mai
A Deep Reinforcement Learning-driven Vine Copula Method for Correlation Structure Analysis of Mortgage
Controlling risk is the key to playing a core role in financial services and effectively serving the high-quality development of the …
Qinghao Wang, Yanling Peng, Yijie Peng, Yaodong Yang
Learning to Shape Rewards using a Game of Two Partners
Reward shaping (RS) is a powerful method in reinforcement learning (RL) for overcoming the problem of sparse or uninformative rewards. …
David Mguni, Taher Jafferjee, Jianhong Wang, Nicolas Perez Nieves, Tianpei Yang, Matthew Taylor, Wenbin Song, Feifei Tong, Hui Chen, Jiangcheng Zhu, Jun Wang, Yaodong Yang
Quality-Similar Diversity via Population Based Reinforcement Learning
Diversity is a growing research topic in Reinforcement Learning (RL). Previous research on diversity has mainly focused on promoting …
Shuang Wu, Jian Yao, Haobo Fu, Ye Tian, Chao Qian, Yaodong Yang, Qiang Fu, Yang Wei
Solving Inventory Management Problems through Deep Reinforcement Learning
Inventory management (e.g. lost sales) is a central problem in supply chain management. Lost sales inventory systems with lead times …
Qinghao Wang, Yijie Peng, Yaodong Yang
MSRL: Distributed Reinforcement Learning with Dataflow Fragments
Reinforcement learning (RL) trains many agents, which is resource-intensive and must scale to large GPU clusters. Different RL training …
Huanzhou Zhu, Bo Zhao, Gang Chen, Weifeng Chen, Yijie Chen, Liang Shi, Yaodong Yang, Peter Pietzuch, Lei Chen