PAIR Lab: PKU Alignment and Interaction Research Lab
Can Large Language Models Independently Complete Tasks? A Dynamic Evaluation Framework for Multi-turn Task Planning and Completion
Large language models (LLMs) are increasingly relied upon for multi-turn dialogue to conduct complex tasks. However, existing …
Jun Gao, Junlin Cui, Huijia Wu, Liuyu Xian, Han Zhao, Xiangang Li, Meng Fang, Yaodong Yang, Zhaofeng He
ReDMan: Reliable Dexterous Manipulation with Safe Reinforcement Learning
Dexterous hand manipulation is a crucial ability for robots in various applications. However, ensuring safety and reliability during …
Yiran Geng, Jiaming Ji, Yuanpei Chen, Haoran Geng, Fangwei Zhong, Yaodong Yang
Med-Aligner Empowers LLM Medical Applications for complex medical scenarios
Large language models (LLMs) show great promise in medical applications, but challenges like limited high-quality data, closed-source …
Xiangbin Meng, Jiaming Ji, Xiangyu Yan, Juntao Dai, Boyuan Chen, Guan Wang, Hua Xu, Jingjia Wang, Xuliang Wang, Da Liu, Mingqi Zheng, Rongzhou Wu, Chuanjie Wu, Yuwei Wu, Wenyao Wan, Zhen Song, Yaodong Yang
Large Language Models in Medicine: Applications, Challenges, and Future Directions
In recent years, large language models (LLMs) represented by GPT-4 have developed rapidly and performed well in various natural …
Erlan Yu, Xuehong Chu, Wanwan Zhang, Xiangbin Meng, Yaodong Yang, Xunming Ji, Chuanjie Wu
TIMAR: Transition-Informed Representation for Sample-Efficient Multi-agent Reinforcement Learning
In MARL (Multi-Agent Reinforcement Learning), the trial-and-error learning paradigm based on multiple agents requires massive …
Mingxiao Feng, Yaodong Yang, Wengang Zhou, Houqiang Li
Discrete Information Acquisition in Financial Markets
We study investors’ information acquisition strategies under arbitrary and discrete sets of information precision and derive conditions …
Jingrui Pan, Shancun Liu, Qiang Zhang, Yaodong Yang
Transforming the Synthesis of Carbon Nanotubes with Machine Learning Models and Automation
Carbon-based nanomaterials (CBNs) hold immense promise in electronics, energy, and mechanics. However, their practical applications …
Yue Li, Shurui Wang, Zhou Lv, Zhaoji Wang, Yunbiao Zhao, Ying Xie, Yang Xu, Yaodong Yang, et al.
DSR: Reinforcement Learning with Dynamical Skill Refinement
Skill-based reinforcement learning (Skill-based RL) is an efficient paradigm for solving sparse-reward tasks by extracting skills from …
Dongxiang Chen, Yaodong Yang, Ying Wen
Adaptive Pessimism via Target Q-Value for Offline Reinforcement Learning
Offline reinforcement learning (RL) methods learn from datasets without further environment interaction, facing errors due to …
Jie Liu, Yinmin Zhang, Chuming Li, Yaodong Yang, Yu Liu, Wanli Ouyang
Multi-Agent Deep Reinforcement Learning for Multi-Echelon Inventory Management
We apply heterogeneous-agent proximal policy optimization (HAPPO), a multi-agent deep reinforcement learning (MADRL) algorithm, to the …
Xiaotian Liu, Ming Hu, Yijie Peng, Yaodong Yang