PAIR Lab: PKU Alignment and Interaction Research Lab
Nash Equilibrium
Magnetic Preference Optimization: Achieving Last-iterate Convergence for Language Model Alignment
Self-play methods have demonstrated remarkable success in enhancing model capabilities across various domains. In the context of …
Mingzhi Wang, Chengdong Ma, Qizhi Chen, Linjian Meng, Yang Han, Jiancong Xiao, Zhaowei Zhan, Jing Huo, Weijie J Su, Yaodong Yang
Is Nash Equilibrium Approximator Learnable?
In this paper, we investigate the learnability of the function approximator that approximates Nash equilibrium (NE) for games generated …
Zhijian Duan, Wenhan Huang, Dinghuai Zhang, Yali Du, Jun Wang, Yaodong Yang, Xiaotie Deng