At the game developer conference CEDEC 2024, Tatsuya Sugimura of Motoyawata Asahi Law Office and Yaneurao of Yaneu Design gave a session titled "Past, Present, and Future of Shogi AI". This is a report on that session, which covered the past and future of shogi AI, a field that has grown rapidly and now surpasses even professional shogi players.
Mr. Sugimura, a lawyer, is the developer of the shogi AI "Suisho", which is also used by seven-title holder Sota Fujii and has won numerous shogi AI world tournaments. As a developer, he may be better known as "Tayayan".
Mr. Yaneurao is the developer of "Yaneuraou", which has become the de facto standard for open source shogi AI. Many recent shogi AIs, including Suisho, use Yaneuraou in some form.
Mr. Tatsuya Sugimura (left) and Mr. Yaneurao (right)
The session progressed with a video featuring Masaki Wakaru, a character from the Yaneuraou official channel.
Birth and evolution of shogi AI
First, the path from the birth of shogi AI to its victory over professional shogi players was introduced.
The world's first shogi AI is said to have been developed in 1974 by Takenobu Takizawa, who was a graduate student at the time and is currently vice president of the Computer Shogi Association and professor emeritus at Waseda University.
Early shogi AI combined a simple "evaluation function" with a search algorithm, mainly the "Minimax method".
An evaluation function converts a shogi position (the pieces on the board, the pieces in hand, and the side to move) into a number called an "evaluation value". In the simplest case, each of the player's own pieces counts as +1 and each of the opponent's pieces as -1. From there, modifications are made, such as giving extra value to major pieces (rooks and bishops) and to pieces that are working well on the board.
The Minimax method takes the evaluation values output by the evaluation function and examines each line of play up to several moves ahead. However, because brute force is inefficient, "αβ search", which narrows down the positions to be evaluated, was born as an improvement on the Minimax method. Various "pruning" ideas that can be used together with αβ search were then introduced.
One example of pruning is the "killer move": when a move that checkmates the opponent's king is found, that move is given priority when evaluating nearby positions.
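To make the relationship between the evaluation function, the Minimax method, and αβ search concrete, here is a minimal sketch on a toy game tree. It is not taken from any real shogi engine: the piece values in `PIECE_VALUE`, the nested-list tree in `TREE`, and all function names are assumptions chosen for illustration.

```python
from math import inf

# Hypothetical piece values for a material-only evaluation
# (illustrative numbers, not taken from any real engine).
PIECE_VALUE = {"pawn": 1, "silver": 1, "gold": 1, "bishop": 2, "rook": 2}


def evaluate(own_pieces, opponent_pieces):
    """Material balance: own pieces count as plus, opponent pieces as minus."""
    own = sum(PIECE_VALUE[p] for p in own_pieces)
    opp = sum(PIECE_VALUE[p] for p in opponent_pieces)
    return own - opp


def minimax(node, maximizing=True):
    """Plain Minimax over a nested-list tree.

    A leaf is a number (an evaluation value); an inner node is a list of children.
    """
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, alpha=-inf, beta=inf, maximizing=True, stats=None):
    """Minimax with alpha-beta pruning; counts visited leaves in stats["leaves"]."""
    if not isinstance(node, list):
        if stats is not None:
            stats["leaves"] += 1
        return node
    if maximizing:
        best = -inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, stats))
            alpha = max(alpha, best)
            if alpha >= beta:  # the opponent would never allow this line
                break
        return best
    best = inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, stats))
        beta = min(beta, best)
        if alpha >= beta:  # we already have a better option elsewhere
            break
    return best


# Toy tree: three candidate moves, each with two replies (leaf values are made up).
TREE = [[3, 5], [2, 9], [0, 1]]

if __name__ == "__main__":
    stats = {"leaves": 0}
    print("minimax value:", minimax(TREE))
    print("alpha-beta value:", alphabeta(TREE, stats=stats))
    print("leaves visited by alpha-beta:", stats["leaves"], "of 6")
```

Both searches return the same value, but αβ search skips leaves that cannot change the result, which is the sense in which it "narrows down" the positions to be evaluated.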
For shogi AI, the evaluation function and the search algorithm are like the two wheels of a cart: both are needed.
In early shogi AI, the parameters of the evaluation function were tuned by hand. In addition to the types of pieces and how well they were working on the board, developers set factors such as the solidity of the king's defense (golds and silvers nearby, escape routes available) and then tested the result in games against a reference program. Because the process required repeated fine-tuning, some people said that "creating an evaluation function is a craftsman's skill."
However, as shogi AI continued to evolve, evaluation functions became too complex for humans to manage. That was when "Bonanza", developed by Kunihito Hoki, appeared. Bonanza uses the "Bonanza Method", which "searches for parameters such that the move chosen using the evaluation function in each position matches the move actually played by a strong player," making it possible to tune parameters automatically from human game records.
The Bonanza Method is based on "optimal control theory", but it was later discovered that the parameters can also be tuned using "stochastic gradient descent", which is often used in machine learning.
Shogi AI continued to evolve, and in an official match in 2013, "ponanza" defeated professional shogi player Shinichi Sato, 4-dan. ponanza also combined αβ search with evaluation function parameters tuned by machine learning.
In addition, the average number of legal moves in shogi (moves that can be made without violating the rules) is about 80, so a brute-force search has to evaluate 80 positions to look one move ahead and 80 x 80 = 6,400 positions to look two moves ahead. However, Bonanza's average branching factor (the number of moves actually evaluated per position) was reduced to around 3 in the early game and around 5 in the late game. The average branching factor of the latest AI is about 2, and it is said to be able to read up to 30 moves ahead.
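The arithmetic behind these branching-factor figures can be checked with a short sketch. The function names and the node budget of 2^30 positions are assumptions for illustration; the budget is simply the number of leaves a branching factor of 2 produces at depth 30, not a measured figure from any engine.

```python
def positions_at_depth(branching_factor, depth):
    """Number of leaf positions when every position has the same number of moves."""
    return branching_factor ** depth


def reachable_depth(branching_factor, node_budget):
    """Largest depth whose leaf count still fits within node_budget."""
    depth = 0
    while branching_factor ** (depth + 1) <= node_budget:
        depth += 1
    return depth


if __name__ == "__main__":
    print("brute force, 2 moves ahead:", positions_at_depth(80, 2))

    # Hypothetical budget: the leaves of a depth-30 search with branching factor 2.
    budget = positions_at_depth(2, 30)
    for b in (80, 5, 3, 2):
        print(f"branching factor {b}: depth {reachable_depth(b, budget)}")
```

With the same budget, a branching factor of 80 reaches only a handful of moves, while a branching factor of 2 reaches 30, which is why reducing the branching factor matters so much for search depth.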
Current Shogi AI
Here, 10 events that have had a big impact from 2013, when shogi AI surpassed humans, to the present are highlighted.
●Reinforcement learning
Bonanza learned from the game records of professional players, but only about 30,000 of them existed, far fewer than the number of parameters Bonanza had.
In addition, once shogi AI became stronger than professional players, there was less point in using professionals' game records, so developers began training shogi AI on game records generated by having shogi AIs play games.
●Shogi AI tournaments
In addition to the World Computer Shogi Championship, held every year since 1990, events such as the Shogi Den-O Tournament (2013-2017) and the World Shogi AI Denryu Tournament (from 2021) have been held. The large prize money was a great motivation for developers.
●Yaneuraou goes open source
Yaneuraou was published on GitHub in 2015 as open source. While many shogi AIs had a structure in which the evaluation function and the search part were integrated, Yaneuraou was highly modular, so either the evaluation function or the search part could be swapped out. It came to be used by many developers.
●Evolution of Stockfish
"Stockfish" is an open source chess AI with a large developer community, where even a single small improvement is said to be tested tens of thousands of times. Although the game is different, much of its search part can be applied to shogi, and the evolution of Stockfish has driven the evolution of shogi AI as well.
●NNUE evaluation function
"NNUE" is an evaluation function that can perform fast incremental (difference) calculations using only the CPU, and it was introduced in 2018. It has been the mainstream ever since, replacing the "three-piece relation" evaluation function used in Bonanza.
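The "difference calculation" idea can be illustrated with a small sketch: the first layer's output is a sum of weight rows for the features that are active in the position, and a move changes only a few features, so the sum can be updated instead of recomputed. This is a simplified illustration, not the actual NNUE architecture; the sizes `NUM_FEATURES` and `HIDDEN`, the random integer weights, and the function names are all assumptions.

```python
import random

NUM_FEATURES = 64  # hypothetical; real networks use far more input features
HIDDEN = 8         # hypothetical width of the first hidden layer

# Random integer weights stand in for trained parameters.
_rng = random.Random(0)
WEIGHTS = [[_rng.randint(-8, 8) for _ in range(HIDDEN)] for _ in range(NUM_FEATURES)]
BIAS = [_rng.randint(-8, 8) for _ in range(HIDDEN)]


def full_accumulator(active_features):
    """Recompute the first-layer sums from scratch for a set of active features."""
    acc = list(BIAS)
    for f in active_features:
        for i in range(HIDDEN):
            acc[i] += WEIGHTS[f][i]
    return acc


def update_accumulator(acc, removed, added):
    """Update the sums after a move by touching only the features that changed."""
    new_acc = list(acc)
    for f in removed:
        for i in range(HIDDEN):
            new_acc[i] -= WEIGHTS[f][i]
    for f in added:
        for i in range(HIDDEN):
            new_acc[i] += WEIGHTS[f][i]
    return new_acc


if __name__ == "__main__":
    before = {1, 5, 9}          # features active before the move (made up)
    acc = full_accumulator(before)
    # A move turns feature 5 off and feature 20 on.
    incremental = update_accumulator(acc, removed=[5], added=[20])
    print(incremental == full_accumulator({1, 9, 20}))
```

The incremental update costs work proportional to the number of changed features rather than to the number of active features, which is what makes this style of evaluation practical on a CPU during search.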
●AlphaZero
"AlphaZero" is an AI for Go, shogi, and chess developed by Google DeepMind in 2017. It makes full use of deep learning and is also characterized by its adoption of "Monte Carlo tree search" instead of the previously mainstream αβ search.
Shogi AIs built on the AlphaZero paper include "dlshogi", "AobaZero", and "Fukauraou", and both AlphaZero-type and conventional (αβ search) engines are active in current shogi AI tournaments.
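As a rough illustration of how an AlphaZero-style Monte Carlo tree search chooses which move to explore next, the sketch below implements a simplified PUCT-style selection rule: each child is scored by its average value plus an exploration bonus that grows with the network's prior probability and shrinks with the child's visit count. The dictionary layout, the constant `c_puct=1.5`, and the sample numbers are assumptions for illustration, not values from dlshogi or any other engine.

```python
from math import sqrt


def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Average value plus an exploration bonus weighted by the prior."""
    return q + c_puct * prior * sqrt(parent_visits) / (1 + child_visits)


def select_child(children, c_puct=1.5):
    """Pick the child with the highest PUCT score.

    Each child is a dict with keys "move", "q" (mean value),
    "prior" (policy probability), and "visits".
    """
    parent_visits = sum(c["visits"] for c in children)
    return max(
        children,
        key=lambda c: puct_score(c["q"], c["prior"], parent_visits, c["visits"], c_puct),
    )


if __name__ == "__main__":
    # Made-up statistics: move A looks good so far, move B is unexplored
    # but strongly suggested by the policy prior.
    children = [
        {"move": "A", "q": 0.5, "prior": 0.2, "visits": 10},
        {"move": "B", "q": 0.0, "prior": 0.7, "visits": 0},
    ]
    print(select_child(children)["move"])
```

Unlike αβ search, which prunes lines using evaluation bounds, this kind of search spends its visits where the combination of observed results and network guidance looks most promising.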
●nnue-pytorch
"nnue-pytorch" enables GPU-based machine learning for NNUE, significantly reducing the time required for training. This became a driving force behind developer Hisayori Noda's victory at the World Computer Shogi Championship held in May 2024.
●How to create strong shogi software
"How to Create Strong Shogi Software" is a book by Tadao Yamaoka, the developer of dlshogi. It covers the development of deep learning-based shogi AI, and it is said that the knowledge in the book alone is enough to create a shogi AI stronger than a professional shogi player.
●Publishing high-quality teacher data
Mr. Yamaoka, the author of "How to Create Strong Shogi Software", and Mr. Sugimura (Tayayan), one of the speakers at this session, have released their teacher data (training data). In reinforcement learning for shogi AI, creating the training data costs more than the training itself, so this release seems to have sharply lowered the barrier to entry.
Mr. Sugimura explained his reason: "If I am the only one who has the data, I may not be able to make full use of it. In that case, it is better to have other people use it and tell me, 'I used your data.'"
●Rise of SNS
Around 2013, many shogi AI developers were university researchers, and many did not use SNS. Since Yaneuraou became open source, more newcomers have entered the field and a generational shift has progressed, and developers now interact a lot on X and Discord.
History of shogi AI's growing strength as seen through ratings
The "Elo rating" is used to express the strength of shogi AI. It is an index originally devised to express chess strength, and it has a mathematical basis.
According to Shogi Club 24, the official online shogi site of the Japan Shogi Federation, which is also used by professional players, the human limit is around 3000 to 3300, and a first-dan amateur is around 1000. By comparison, Bonanza's rating in 2005 was 2360.
In 2009, when Bonanza played Akira Watanabe, then holder of the Ryuo title, Watanabe won after getting through a position in which he thought he might lose. Bonanza's rating at that time was 2815, which suggests that Watanabe was able to win because his strength was close to the human limit.
In 2013, "Gikou" was rated 3713, more than 400 points above the human limit of 3300. A difference of 400 reportedly means that the stronger side wins with a probability of over 90%. And "Would you like to become a CSA member?", the winner of the 2024 World Computer Shogi Championship, is rated 4914, far beyond human level.
The important point is that these ratings assume a typical laptop PC thinking for about 5 seconds. Mr. Sugimura said that with something like a supercomputer, it would not be surprising if the rating reached around 7,000.
Shogi AI has evolved to this point and is used by a wide range of players, both professional and amateur. It is reportedly often used to analyze a game you played and identify which moves were bad, or to analyze a position expected to arise in a game and work out the best move for it.
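The claim that a 400-point gap corresponds to a win probability of over 90% follows from the standard Elo expected-score formula, E = 1 / (1 + 10^(-d/400)), where d is the rating difference. The sketch below evaluates it for the ratings quoted in this section; the function name is an assumption, and the formula treats the expected score as a win probability, ignoring draws.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


if __name__ == "__main__":
    # A gap of exactly 400 points gives 10/11, a little over 90%.
    print(f"400-point gap: {expected_score(3700, 3300):.4f}")
    # Ratings quoted in the session: 3713 and 4914 against a human limit of 3300.
    print(f"3713 vs 3300:  {expected_score(3713, 3300):.4f}")
    print(f"4914 vs 3300:  {expected_score(4914, 3300):.6f}")
```

Because the scale is logarithmic, each additional 400 points multiplies the odds by ten, so a gap of more than 1,600 points leaves the lower-rated side with almost no expected score.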
The future of shogi AI
As for the future, the speakers talked about how to develop the world's strongest shogi AI.
Current shogi AI can be roughly divided into the conventional "NNUE type", which uses αβ search, and the "DL type", which uses full-scale deep learning.
Since the source code of both Yaneuraou and dlshogi, the representative engines of each type, has been released, there is reportedly a good chance that the world's strongest shogi AI could be created by improving just one aspect of them.
So what can be improved from here? The following five areas were listed.
●Improved evaluation function
Because the current NNUE type runs its calculations on the CPU, there is a trade-off between the accuracy of the evaluation function and the number of positions that can be searched, which makes tuning extremely difficult. However, GPU computation is said to be a poor fit for αβ search.
On the other hand, ResNet, the network used as the evaluation function in many DL-type engines, is known to become stronger when the attention mechanism of the transformer used in language models such as ChatGPT is introduced, so knowledge from the machine learning field can reportedly be put to use.
●Adjustment of teacher data
NNUE-type shogi AI searches more than 100 million positions per second on a tournament-spec machine, but the accuracy of its position evaluation is not very high, so it is said to be relatively stronger in the endgame than in the opening. Therefore, when training NNUE-type shogi AI, it tends to work better to concentrate the training on the opening.
On the other hand, there is also the view that, because the opening up to about the 32nd move often follows joseki (established best lines based on past research), training on that phase can be omitted without problems.
Also, since the ranging rook is not considered an effective strategy in current tournaments, omitting it is reportedly another option.
●Automatic generation of joseki
Because there are limits to editing joseki by hand, top teams are trying to generate them automatically. However, producing highly accurate joseki moves requires the shogi AI to think for a long time, so this is not very efficient either.
It seems that people familiar with graph theory and game tree search may be able to generate a large number of joseki lines.
●Improvement of the search part
The search part of the NNUE type is based on that of the chess AI Stockfish. In the same way, bringing in search ideas that have succeeded in other AIs may reportedly make shogi AI stronger.
●Securing computing resources
Simply put, computing resources means computers. In recent years, it has become increasingly difficult for individuals to secure the computers needed to create teacher data, and the need for sponsorship from major companies has grown. It seems that if you can create a large amount of training data, you could become the world's strongest.
Shogi AI has evolved since Yaneuraou was made open source, with developers contributing their ideas. Even now, there reportedly seems to be a chance to become the world's strongest by working on just one of the five improvements introduced here, rather than all of them.
That concludes the content of the session, but at the end there was a question from the audience: "Shogi has no element of luck, so I think a guaranteed winning strategy must exist. Will we ever reach it?"
Mr. Sugimura answered that shogi is a "two-player, zero-sum, finite, deterministic, perfect-information game", so a guaranteed win or a guaranteed draw does exist, but there are so many possibilities that reaching it would be difficult, and that even if a winning strategy were determined, there would be no way to store it as data.