Please use this identifier to cite or link to this item: http://hdl.handle.net/11133/3491

Title: Go Algorithm Based on Deep Learning and Playout
Other Titles: 深層学習とプレイアウトに基づく囲碁アルゴリズム
シンソウ ガクシュウ ト プレイ アウト ニ モトズク イゴ アルゴリズム
Authors: 伊藤, 雅
伊藤, 有人
ITOH, Masaru
ITO, Arito
Issue Date: 31-Mar-2019
Publisher: Aichi Institute of Technology (愛知工業大学)
Abstract: This paper describes a Go algorithm based on deep learning and playout. The algorithm runs in a resource-constrained environment consisting of one CPU and one GPU. The best next move is obtained with a Value-Monte-Carlo tree search method, which is a best-first search method. The proposed method omits the tree policy used by AlphaGo. Instead, when expanding a leaf node, it adds as children the 20 candidate moves with the highest probability, in synchronization with the SL policy network. The win/loss function based on AlphaGo's rollout policy is replaced by playout, as commonly used in ordinary Monte-Carlo tree search. Each node is evaluated not by the ordinary UCB1 value but by the action value advocated by AlphaGo. Numerical experiments confirmed the statistical significance of the proposed method and identified both the best mixing parameter value and the node expansion threshold.
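
The abstract gives no formulas, so the following is only a minimal sketch of how the described node evaluation and expansion might look. It assumes the action value takes the AlphaGo form, a blend of the mean value-network evaluation and the mean playout result weighted by the mixing parameter, plus a prior-weighted exploration bonus. The names `Node`, `action_value`, `exploration_bonus`, `select_child`, `expand`, and `should_expand`, and the constant `c_puct`, are hypothetical and not taken from the paper. Only the figure of 20 candidates comes from the abstract. Switching the value perspective between the two players is deliberately left out.

```python
import math
from dataclasses import dataclass, field

TOP_K = 20  # candidates added on expansion (figure taken from the abstract)


@dataclass
class Node:
    prior: float               # SL policy probability of the move leading here
    visits: int = 0            # simulations that passed through this node
    value_sum: float = 0.0     # accumulated value-network evaluations
    playout_sum: float = 0.0   # accumulated playout results (1 = win, 0 = loss)
    children: dict = field(default_factory=dict)


def action_value(node, lam):
    """Blend mean value-network output and mean playout result.

    lam is the mixing parameter: 0 uses only the value network,
    1 uses only playouts. (Assumed AlphaGo-style form.)
    """
    if node.visits == 0:
        return 0.0
    mean_value = node.value_sum / node.visits
    mean_playout = node.playout_sum / node.visits
    return (1.0 - lam) * mean_value + lam * mean_playout


def exploration_bonus(parent, child, c_puct):
    """Prior-weighted bonus that shrinks as the child is visited more."""
    return c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)


def select_child(parent, lam, c_puct):
    """Return the (move, child) pair with the highest action value plus bonus."""
    return max(
        parent.children.items(),
        key=lambda kv: action_value(kv[1], lam)
        + exploration_bonus(parent, kv[1], c_puct),
    )


def should_expand(node, threshold):
    """A leaf is expanded once its visit count exceeds the threshold."""
    return not node.children and node.visits > threshold


def expand(node, policy_probs, top_k=TOP_K):
    """Attach the top_k most probable moves as children of the node.

    policy_probs maps each legal move to its SL policy probability.
    """
    ranked = sorted(policy_probs.items(), key=lambda kv: kv[1], reverse=True)
    for move, prob in ranked[:top_k]:
        node.children[move] = Node(prior=prob)
```

In a full search loop, `select_child` would be applied repeatedly from the root down to a leaf, `should_expand` would gate the call to `expand`, and the playout result and value-network output would then be added to `playout_sum` and `value_sum` along the visited path.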
URI: http://hdl.handle.net/11133/3491
Appears in Collections: Aichi Institute of Technology Research Reports (愛知工業大学研究報告), No. 54

Files in This Item:

File: 紀要54号(p110-p117).pdf, Size: 491.29 kB, Format: Adobe PDF


 
