The father of AlphaGo explains how the "Go God" was forged

2022-10-03 04:04:13

Electronic Enthusiasts, 8 a.m. edition: Go has been played for nearly 3,000 years, but humans may have been underestimating one thing all along: the central area of the board, represented by the fifth line.

That is the key insight Demis Hassabis, the founder of DeepMind and the "father of AlphaGo", revealed while sharing the story behind AlphaGo.

Since the match in Seoul in March last year, AlphaGo has moved beyond the established thinking and set patterns of human players, and its impact on the Go world has been unprecedented. In the words of Demis Hassabis, "Just as people use the Hubble telescope to explore new regions of space, AlphaGo is the Hubble telescope of the Go world."

On May 24th, DeepMind founder Demis Hassabis and AlphaGo team leader David Silver explained how AlphaGo was developed and what it means.

"AlphaGo has demonstrated creativity, and within a certain field it can even imitate human intuition," Demis Hassabis told a First Financial reporter. In the future, he said, we will see the tremendous power of humans and computers working together, and human intelligence will be further amplified through artificial intelligence. "Strong artificial intelligence is the ultimate tool for humans to research and explore the universe."

What makes Go so hard?

Historically, the first classic game a computer mastered was tic-tac-toe, in a Ph.D. student's research project in 1952. The program Chinook then conquered checkers in 1994, and three years later IBM's Deep Blue supercomputer defeated world champion Garry Kasparov at chess.

Go, by contrast, looks simple but is unimaginably complex. It has about 10^170 possible positions, more than the roughly 10^80 atoms in the entire universe, so there is no way to exhaust every possible outcome of a game.
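The scale of that number can be sanity-checked with a little arithmetic. The sketch below assumes a standard 19x19 board, where each of the 361 points is empty, black, or white, so 3^361 is a crude upper bound on board configurations. It counts illegal positions too, which is why it comes out somewhat larger than the figure quoted above. The variable names are illustrative only.

```python
# Crude upper bound on Go board configurations (assumes a 19x19 board).
POINTS = 19 * 19                 # 361 intersections
STATES_PER_POINT = 3             # empty, black, or white

upper_bound = STATES_PER_POINT ** POINTS
digits = len(str(upper_bound))   # number of decimal digits in the bound

ATOMS_IN_UNIVERSE = 10 ** 80     # commonly quoted rough estimate

print(f"3^361 has {digits} decimal digits")
print("Larger than 10^80:", upper_bound > ATOMS_IN_UNIVERSE)
```

Even this loose bound shows why brute-force enumeration, which is feasible for tic-tac-toe, is out of the question for Go.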

In Demis Hassabis's view, what makes Go even harder is that, unlike chess, it is played less by calculation than by intuition. "There is no hierarchy of pieces in Go; all the stones are the same. Go is a game of construction, so you have to work out the future. As you play, you must predict what is coming from within the position on the board. A single small stone can shake the whole situation: pull one thread and everything moves. A brilliant move in Go is like a revelation," Hassabis explained.

Fan Hui, the first human professional player to play against AlphaGo, told reporters, "I once thought I would never in my lifetime see a computer defeat a professional Go player. I didn't expect it to happen so quickly."

For the AlphaGo team, that meant finding a smarter way to unlock the puzzle of Go.

The key to the AlphaGo system is to shrink Go's huge search space down to a manageable size.

In response to the enormous complexity of Go, AlphaGo uses a novel machine learning technique that combines the advantages of supervised learning and reinforcement learning.

Specifically, the first step is to train a policy network that takes the position on the board as input and produces a probability distribution over all legal moves. Then a value network is trained on self-play games to predict the outcome of a position, on a scale from -1 (certain victory for the opponent) to 1 (certain victory for AlphaGo).

Both networks are very powerful in their own right, and AlphaGo combines them in a probability-based Monte Carlo Tree Search (MCTS), which is where its real advantage lies. Finally, each new version of AlphaGo generates a large number of self-play games that provide training data for the next generation, and the cycle repeats.
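As a rough illustration of those two outputs (not DeepMind's actual networks), the sketch below turns hypothetical raw move scores into a probability distribution over legal moves, and squashes a raw evaluation into the range -1 to 1. The use of softmax and tanh, the function names, and the sample scores are all assumptions made for this example.

```python
import math

def policy_distribution(raw_scores, legal_moves):
    """Softmax over legal moves only (an assumed output layer).

    raw_scores maps move name -> unnormalised score.
    """
    peak = max(raw_scores[m] for m in legal_moves)  # for numerical stability
    exps = {m: math.exp(raw_scores[m] - peak) for m in legal_moves}
    total = sum(exps.values())
    return {m: e / total for m, e in exps.items()}

def position_value(raw_evaluation):
    """Squash a raw evaluation into (-1, 1): -1 opponent wins, 1 we win."""
    return math.tanh(raw_evaluation)

# Hypothetical scores; "A1" is treated as illegal and therefore excluded.
scores = {"D4": 2.0, "Q16": 1.0, "K10": 0.5, "A1": 9.0}
probs = policy_distribution(scores, ["D4", "Q16", "K10"])
for move, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(move, round(p, 3))
print("value:", round(position_value(0.4), 3))
```

The point of the split is that one output says where to look, while the other says how good a position is without playing the game out to the end.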

How AlphaGo decides where to play

Given the current board position, AlphaGo uses the policy network to explore which points have both high potential value and high probability, and then settles on the best place to play.

When the allotted search time runs out, the point the system examined most often during its simulations becomes AlphaGo's final choice. Having first explored and narrowed down the best candidates, AlphaGo's search in effect adds an approximation of human intuition to its raw computing power.

Demis Hassabis said that AlphaGo is not merely imitating the way human players play, but is constantly innovating.

Take move 37 of the second game against Lee Sedol, for example: it was the move that astonished Demis most in the entire match.
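The decision rule described above can be sketched in a few lines. This is a deliberately simplified, one-move-deep search rather than a full tree search, and it is not AlphaGo's implementation: the PUCT-style scoring formula, the `c_puct` constant, the function name, and the sample priors and values are all assumptions for illustration. Each simulation picks the move with the best mix of average value and prior probability, and the final choice is the most visited move.

```python
import math

def puct_search(priors, value_of, simulations=200, c_puct=1.0):
    """One-ply search sketch.

    priors: move -> prior probability (the policy network's role).
    value_of: function(move) -> estimate in [-1, 1] (the value network's role).
    """
    visits = {m: 0 for m in priors}
    total_value = {m: 0.0 for m in priors}

    def score(move, n_total):
        # Average value so far, zero for unvisited moves.
        q = total_value[move] / visits[move] if visits[move] else 0.0
        # Exploration bonus: large for high-prior, rarely visited moves.
        # The +1 under the square root lets priors guide the very first pick.
        u = c_puct * priors[move] * math.sqrt(n_total + 1) / (1 + visits[move])
        return q + u

    for _ in range(simulations):
        n_total = sum(visits.values())
        move = max(priors, key=lambda m: score(m, n_total))
        visits[move] += 1
        total_value[move] += value_of(move)

    # Final choice: the move examined most often during the search.
    best = max(visits, key=visits.get)
    return best, visits

# Hypothetical inputs: "a" looks likeliest at first, but "b" evaluates best.
priors = {"a": 0.6, "b": 0.3, "c": 0.1}
values = {"a": -0.2, "b": 0.8, "c": 0.1}
best, visits = puct_search(priors, values.get)
print(best, visits)
```

In this toy run the search starts with the high-prior move but shifts most of its visits to the move that evaluates better, which is the interplay between probability and value that the article describes.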

Demis explained that there are two crucial dividing lines in Go: the third and the fourth lines, counted here from the right edge of the board. If you play a stone on the third line, it means you intend to take the territory to the right of that line. If you play on the fourth line, it means you plan to advance toward the middle of the board. Potentially you will go on to occupy other parts of the board, and that gain may be comparable to the territory you would have taken on the third line.

For the past 3,000 years, therefore, it has been widely believed that the third line and the fourth line are equally important. But with move 37, AlphaGo placed its stone on the fifth line and pushed into the center of the board. "This may mean that over the past few thousand years, people have underestimated the importance of the central region of the board."

It is worth mentioning that, compared with the AlphaGo that played Lee Sedol last year, today's AlphaGo is stronger, according to DeepMind scientist David Silver. He said: "The AlphaGo that played Lee Sedol ran on 50 TPUs in the cloud, searching 50 moves at 10,000 positions per second, while AlphaGo Master, which defeated Ke Jie on May 23, ran on a single TPU. AlphaGo became its own teacher. It learned from its own searches and has stronger policy and value networks."

On Weibo on May 24th, Ke Jie also sighed over the evaluation report released by the AlphaGo team: "I was playing against a terrifying opponent."

"How big is this gap? Put simply, Go is a game where the players take turns making one move each, and here the opponent lets you make three moves in a row... It is like a showdown between martial arts masters in which you are allowed to strike three times with your blade first..." Ke Jie said.

What else can AlphaGo do besides playing Go?

Beyond Go, Demis Hassabis told reporters, the efficient algorithm behind AlphaGo is a general-purpose one that can be extended to other problems, bringing artificial intelligence to a variety of fields such as materials design and new drug development. There are also everyday applications in areas such as healthcare, smartphones, and education.

However, he also admitted to First Financial that the technologies behind AlphaGo, including image processing and big data analysis, are still at an early stage of exploration in other fields, and have so far been applied only in certain areas in the course of the AlphaGo research. In the future, though, he said, they will certainly advance related technologies in many fields.
