[2003.13590] Suphx: Mastering Mahjong with Deep Reinforcement Learning

Artificial Intelligence (AI) has achieved great success in many domains, and game AI is widely regarded as its beachhead since the dawn of AI. In recent years, studies on game AI have gradually evolved from relatively simple environments (e.g., perfect-information games such as Go, chess, shogi or two-player imperfect-information games such as heads-up Texas hold'em) to more complex ones (e.g., multi-player imperfect-information games such as multi-player Texas hold'em and StarCraft II). Mahjong is a popular multi-player imperfect-information game worldwide but very challenging for AI research due to its complex playing/scoring rules and rich hidden information. We design an AI for Mahjong, named Suphx, based on deep reinforcement learning with some newly introduced techniques including global reward prediction, oracle guiding, and run-time policy adaptation. Suphx has demonstrated stronger performance than most top human players in terms of stable rank and is rated above 99.99% of al

6 mentions: @sotetsuk, @Hi_king, @MikiBear_, @Miles_Brundage, @douchi_kamiya, @MakotoHagiwar10
Date: 2020/03/31 02:20

Referring Tweets

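@sotetsuk The arXiv paper for "Suphx", the Mahjong AI I spent a full two years helping to develop, is out. I worked on implementing and tuning the distributed reinforcement learning part. Honestly, it was tough 😢 Also, as the only Japanese member, I was in charge of the domain knowledge for Japanese-style Mahjong 🀄 t.co/zJ2LzGcf3Z
@Hi_king The paper on Suphx, Microsoft Research's Mahjong AI. The reinforcement learning tricks are: * initialize the policy with supervised learning on human game logs (SL) * evaluate each round not by that round's score but by the change in the "predicted score after the final round" (RL-1) * start from a state where all the other players' tiles are visible, gradually drop that information, and end at imperfect-information Mahjong (RL-2) t.co/ac3tkTYigD t.co/cmwQJNVpMQ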
@Miles_Brundage "Suphx: Mastering Mahjong with Deep Reinforcement Learning," Li et al.: t.co/89oDQqeBB7 "Suphx has demonstrated stronger performance than most top human players in terms of stable rank + is rated above 99.99% of .. officially ranked human players in the Tenhou platform."
@MikiBear_ AlphaSaki Suphx: Mastering Mahjong with Deep Reinforcement Learning t.co/FgImxySQep t.co/cmRhU7iAri
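
The two reinforcement learning tricks summarized in @Hi_king's tweet (RL-1 and RL-2) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the linear decay schedule, and the example scores are all assumptions made for illustration. `round_rewards` turns a sequence of predicted final-game scores into per-round rewards, and `mask_oracle_features` randomly hides perfect-information features with a keep probability that shrinks over training.

```python
import random


def round_rewards(predicted_final_scores):
    """RL-1 sketch: reward for round k is the change in the predicted final score.

    predicted_final_scores[0] is a (hypothetical) predictor's estimate before
    round 1; predicted_final_scores[k] is its estimate after round k.
    """
    return [after - before
            for before, after in zip(predicted_final_scores,
                                     predicted_final_scores[1:])]


def oracle_keep_prob(step, total_steps):
    """RL-2 sketch: keep probability decays from 1 to 0 (linear decay is assumed)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return max(0.0, 1.0 - step / total_steps)


def mask_oracle_features(oracle_features, keep_prob, rng):
    """Keep each hidden-information feature with probability keep_prob, else zero it."""
    return [x if rng.random() < keep_prob else 0.0 for x in oracle_features]


if __name__ == "__main__":
    # Made-up predictions of the final score, before round 1 and after rounds 1-3.
    print(round_rewards([25000, 27000, 26000, 30000]))  # [2000, -1000, 4000]

    rng = random.Random(0)
    opponents_tiles = [1.0, 1.0, 1.0, 1.0]  # toy stand-in for other players' tiles
    for step in (0, 5, 10):
        p = oracle_keep_prob(step, total_steps=10)
        print(step, p, mask_oracle_features(opponents_tiles, p, rng))
```

With a keep probability of 1 the agent sees every opponent tile, and with 0 it sees none, which corresponds to the ordinary imperfect-information game described in the tweet.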
