Thesis Details
Playing Gomoku with Neural Networks
This thesis explores the usage of AlphaZero algorithm for the game of Gomoku. AlphaZero is a reinforcement learning algorithm, which does not require any existing datasets and is able to improve only by using self-play. It uses a tree search for policy improvement, which is subsequently used for training. This approach was able to defeat the previous state of the art methods. Generating training data of high quality requires a lot of computationally expensive iterations, which makes them algorithm slow to train. Experiments show that the strength of the play is growing with each subsequent iteration, this might indicate that it still has room for improvement with more training and that it has not reached its full potential.
neural networks, Monte Carlo tree search, AlphaZero, backpropagation, reinforcement learning
Bidlo Michal, doc. Ing., Ph.D. (DCSY FIT BUT), člen
Čadík Martin, doc. Ing., Ph.D. (DCGM FIT BUT), člen
Křivka Zbyněk, Ing., Ph.D. (DIFS FIT BUT), člen
Rogalewicz Adam, doc. Mgr., Ph.D. (DITS FIT BUT), člen
@bachelorsthesis{FITBT21764, author = "Michal Sl\'{a}vka", type = "Bachelor's thesis", title = "Playing Gomoku with Neural Networks", school = "Brno University of Technology, Faculty of Information Technology", year = 2019, location = "Brno, CZ", language = "english", url = "https://www.fit.vut.cz/study/thesis/21764/" }