A Unified Diversity Measure for Multiagent Reinforcement Learning

Promoting behavioural diversity is of critical importance in multi-agent reinforcement learning, since it helps the agent population …

Zongkai Liu, Chao Yu, Yaodong Yang, Peng Sun, Zifan Wu, Yuan Li

Constrained Update Projection Approach to Safe Policy Optimization

Safe reinforcement learning (RL) studies problems where an intelligent agent has to not only maximize reward but also avoid exploring …

Long Yang, Jiaming Ji, Juntao Dai, Linrui Zhang, Binbin Zhou, Pengfei Li, Yaodong Yang, Gang Pan

MATE: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control

We introduce the Multi-Agent Tracking Environment (MATE), a novel multi-agent environment that simulates the target coverage control …

Xuehai Pan, Mickel Liu, Fangwei Zhong, Yaodong Yang, Song-Chun Zhu, Yizhou Wang

Meta-Reward-Net: Implicitly Differentiable Reward Learning for Preference-based Reinforcement Learning

Setting up a well-designed reward function has been challenging for many reinforcement learning applications. Preference-based …

Runze Liu, Fengshuo Bai, Yali Du, Yaodong Yang

Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning

Achieving human-level dexterity is an important open problem in robotics. However, tasks of dexterous hand manipulation even at the …

Yuanpei Chen, Tianhao Wu, Shengjie Wang, Xidong Feng, Jiechuang Jiang, Stephen Marcus McAleer, Hao Dong, Zongqing Lu, Song-Chun Zhu, Yaodong Yang

End-to-End Affordance Learning for Robotic Manipulation

Learning to manipulate 3D objects in an interactive environment has been a challenging problem in Reinforcement Learning (RL). In …

Yiran Geng, Boshi An, Haoran Geng, Yuanpei Chen, Yaodong Yang, Hao Dong

Debias the Black-Box: A Fair Ranking Framework via Knowledge Distillation

Deep neural networks can capture the intricate interaction history information between queries and documents, because of their many …

Zhitao Zhu, Shijing Si, Jianzong Wang, Yaodong Yang, Jing Xiao

Multi-Agent Reinforcement Learning is a Sequence Modeling Problem

Large sequence models (SM) such as the GPT series and BERT have displayed outstanding performance and generalization capabilities in …

Muning Wen, Jakub Grudzien Kuba, Runji Lin, Weinan Zhang, Ying Wen, Jun Wang, Yaodong Yang

On the Convergence of Fictitious Play: A Decomposition Approach

Fictitious play (FP) is one of the most fundamental game-theoretical learning frameworks for computing Nash equilibrium in n-player …

Yurong Chen, Xiaotie Deng, Chenchen Li, David Mguni, Jun Wang, Xiang Yan, Yaodong Yang

Neural Auto-Curricula in Two-Player Zero-Sum Games

When solving two-player zero-sum games, multi-agent reinforcement learning (MARL) algorithms often create populations of agents where, …

Xidong Feng, Oliver Slumbers, Ziyu Wan, Bo Liu, Stephen McAleer, Ying Wen, Jun Wang, Yaodong Yang