ProAgent: Building Proactive Cooperative Agents with Large Language Models

Building agents with adaptive behavior in cooperative tasks stands as a paramount goal in the realm of multi-agent systems. Current …

Ceyao Zhang, Kaijie Yang, Siyi Hu, Zihao Wang, Guanghe Li, Yihang Sun, Cheng Zhang, Zhaowei Zhang, Anji Liu, Song-Chun Zhu, Xiaojun Chang, Junge Zhang, Feng Yin, Yitao Liang, Yaodong Yang

STAS: Spatial-Temporal Return Decomposition for Multi-agent Reinforcement Learning

Centralized Training with Decentralized Execution (CTDE) has been proven to be an effective paradigm in cooperative multi-agent …

Sirui Chen, Zhaowei Zhang, Yaodong Yang, Yali Du

Maximum Entropy Heterogeneous-Agent Reinforcement Learning

Multi-agent reinforcement learning (MARL) has been shown effective for cooperative games in recent years. However, existing …

Jiarong Liu, Yifan Zhong, Siyi Hu, Haobo Fu, Qiang Fu, Xiaojun Chang, Yaodong Yang

BeaverTails: A Human-Preference Dataset for LLM Harmlessness Alignment

In this paper, we introduce the BeaverTails dataset, aimed at fostering research on safety alignment in large language models (LLMs). …

Jiaming Ji, Mickel Liu, Juntao Dai, Xuehai Pan, Chi Zhang, Ce Bian, Boyuan Chen, Ruiyang Sun, Yizhou Wang, Yaodong Yang

Safety Gymnasium: A Unified Safe Reinforcement Learning Benchmark

Artificial intelligence (AI) systems possess significant potential to drive societal progress. However, their deployment often faces …

Jiaming Ji, Borong Zhang, Jiayi Zhou, Xuehai Pan, Weidong Huang, Ruiyang Sun, Yiran Geng, Yifan Zhong, Juntao Dai, Yaodong Yang

UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-Aware Curriculum and Iterative Generalist-Specialist Learning

We propose a novel, object-agnostic method for learning a universal policy for dexterous object grasping from realistic point cloud …

Weikang Wan, Haoran Geng, Yun Liu, Zikang Shan, Yaodong Yang, Li Yi, He Wang

GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models

This paper introduces a distributed, GPU-centric experience replay system, GEAR, designed to perform scalable reinforcement learning …

Hanjing Wang, Man-Kit Sit, Congjie He, Ying Wen, Weinan Zhang, Jun Wang, Yaodong Yang, Luo Mai

A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems

In order for agents in multi-agent systems (MAS) to be safe, they need to take into account the risks posed by the actions of other …

Oliver Slumbers, David Henry Mguni, Stephen Marcus McAleer, Stefano B. Blumberg, Jun Wang, Yaodong Yang

Regret-Minimizing Double Oracle for Extensive-Form Games

By incorporating regret minimization, double oracle methods have demonstrated rapid convergence to Nash Equilibrium (NE) in normal-form …

Xiaohang Tang, Le Cong Dinh, Stephen Marcus McAleer, Yaodong Yang