AlphaGo is a computer program that plays the board game Go. It was developed by Google DeepMind, a subsidiary of Alphabet Inc., in London.
In October 2015, AlphaGo became the first computer Go program to beat a human professional Go player without handicap on a full-sized 19×19 board. In March 2016, it beat Lee Sedol in a five-game match, the first time a computer Go program had beaten a 9-dan professional without handicap. Although AlphaGo lost to Lee Sedol in the fourth game, Lee resigned in the final game, giving a final score of 4 games to 1 in favor of AlphaGo. In recognition of the victory, AlphaGo was awarded an honorary 9-dan title by the Korea Baduk Association. The lead-up to the match and the challenge match with Lee Sedol itself were documented in a film also titled AlphaGo, directed by Greg Kohs. The victory was chosen by Science as one of the Breakthrough of the Year runners-up on December 22, 2016.
At the 2017 Future of Go Summit, AlphaGo beat Ke Jie, the world No. 1 ranked player at the time, in a three-game match. After this, AlphaGo was awarded professional 9-dan by the Chinese Weiqi Association. After the match between AlphaGo and Ke Jie, AlphaGo retired while DeepMind continued AI research in other areas.
AlphaGo uses a Monte Carlo tree search algorithm to find its moves based on knowledge previously "learned" by machine learning, specifically by artificial neural networks (a deep learning method) through extensive training, both from human and computer play.
History
Go is considered much more difficult for computers to win than other games such as chess, because its much larger branching factor makes it prohibitively difficult to use traditional AI methods such as alpha-beta pruning, tree traversal and heuristic search.
Almost two decades after IBM's computer Deep Blue beat world chess champion Garry Kasparov in their 1997 match, the strongest Go programs using artificial intelligence techniques had only reached about amateur 5-dan level, and still could not beat a professional Go player without a handicap. In 2012, the software program Zen, running on a four-PC cluster, beat Masaki Takemiya (9p) twice, at five-stone and four-stone handicaps. In 2013, Crazy Stone beat Yoshio Ishida (9p) at a four-stone handicap.
According to David Silver of DeepMind, the AlphaGo research project was formed around 2014 to test how well a neural network using deep learning could compete at Go. AlphaGo represents a significant improvement over previous Go programs. In 500 games against other available Go programs, including Crazy Stone and Zen, AlphaGo running on a single computer won all but one. In a similar matchup, AlphaGo running on multiple computers won all 500 games played against other Go programs, and 77% of games played against AlphaGo running on a single computer. The distributed version in October 2015 used 1,202 CPUs and 176 GPUs.
Match against Fan Hui
In October 2015, the distributed version of AlphaGo defeated the European Go champion Fan Hui, a 2-dan professional (out of a possible 9 dan), five games to zero. This was the first time a computer Go program had beaten a professional human player on a full-sized board without a handicap. The announcement of the news was delayed until January 27, 2016 to coincide with the publication of a paper in the journal Nature describing the algorithms used.
Match against Lee Sedol
AlphaGo played South Korean professional Go player Lee Sedol, ranked 9-dan and one of the best players at Go, in a five-game match held at the Four Seasons Hotel in Seoul, South Korea on 9, 10, 12, 13 and 15 March 2016; the games were streamed live on video. Aja Huang, a DeepMind team member and amateur 6-dan Go player, placed the stones on the Go board for AlphaGo, which ran on Google's cloud computing platform with its servers located in the United States. The match used Chinese rules with a 7.5-point komi, and each side had two hours of thinking time plus three 60-second byoyomi periods. The version of AlphaGo that played against Lee used a similar amount of computing power to that used in the Fan Hui match. The Economist reported that it used 1,920 CPUs and 280 GPUs.
At the time of play, Lee Sedol had the second-highest number of international Go championship victories in the world. Although there is no single official ranking method in international Go, some sources ranked Lee Sedol as the fourth-best player in the world at the time. AlphaGo was not specifically trained to face Lee.
The first three games were won by AlphaGo following resignations by Lee. However, Lee beat AlphaGo in the fourth game, winning by resignation at move 180. AlphaGo then went on to achieve a fourth win, winning the fifth game by resignation.
The prize was US$1 million. Because AlphaGo won four out of five games and thus the series, the prize was to be donated to charities, including UNICEF. Lee Sedol received $150,000 for participating in all five games and an additional $20,000 for his win.
In June 2016, at a presentation held at a university in the Netherlands, Aja Huang, a member of the DeepMind team, revealed that the team had rectified the problem that occurred during the fourth game of the match between AlphaGo and Lee, and that after move 78 (dubbed the "divine move" by many professionals) the program would now play accurately and maintain Black's advantage. Before move 78, AlphaGo had been leading throughout the game. Lee's move was credited not as one that won the game outright but as one that caused the program's computing power to be diverted and confused. Huang explained that AlphaGo's policy network, which finds the most accurate move order and continuation, did not precisely guide AlphaGo to the correct continuation after move 78, because its value network had not rated Lee's 78th move as among the most likely; when the move was played, AlphaGo therefore could not make the right adjustment to the logical continuation.
Sixty online games
On December 29, 2016, a new account on the Tygem server named "Magister" (shown as "Magist" on the server's Chinese version), registered from South Korea, began to play games with professional players. It changed its account name to "Master" on December 30, then moved to the FoxGo server on January 1, 2017. On January 4, DeepMind confirmed that both "Magister" and "Master" were played by an updated version of AlphaGo. As of January 5, 2017, AlphaGo's online record was 60 wins and 0 losses, including three victories over Go's top-ranked player, Ke Jie, who had been quietly told in advance that Master was a version of AlphaGo. After losing to Master, Gu Li offered a bounty of 100,000 yuan (US$14,400) to the first human player who could defeat Master.

Master played at a pace of 10 games per day. Many quickly suspected it of being an AI player because it took little or no rest between games. Its opponents included many world champions such as Ke Jie, Park Jeong-hwan, Yuta Iyama, Tuo Jiaxi, Mi Yuting, Shi Yue, Chen Yaoye, Li Qincheng, Gu Li, Chang Hao, Tang Weixing, Fan Tingyu, Zhou Ruiyang, Jiang Weijie, Chou Chun-hsun, Kim Ji-seok, Kang Dong-yun, Park Yeong-hun, and Won Seong-jin; and national champions or world championship runners-up such as Lian Xiao, Tan Xiao, Meng Tailing, Dang Yifei, Huang Yunsong, Yang Dingxin, Gu Zihao, Shin Jinseo, Cho Han-seung and An Sungjoon. All 60 games except one were fast-paced games with three 20-second or 30-second byo-yomi periods. Master offered to extend the byo-yomi to one minute when playing Nie Weiping, in consideration of his age. After winning its 59th game, Master revealed in the chat room that it was controlled by Dr. Aja Huang of the DeepMind team, then changed its nationality to the United Kingdom. After these games were completed, Google DeepMind co-founder Demis Hassabis said in a tweet, "we're looking forward to playing some official, full-length games later [2017] in collaboration with Go organizations and experts."
Go experts were impressed by AlphaGo's performance and its non-human style of play; Ke Jie stated that "After humanity spent thousands of years improving our tactics, computers tell us that humans are completely wrong... I would go as far as to say not a single human has touched the edge of the truth of Go."
Future of Go Summit
At the Future of Go Summit held in Wuzhen in May 2017, AlphaGo played three games against Ke Jie, the world No. 1 ranked player, as well as two games with several top Chinese professionals: one pair Go game and one game against a collaborating team of five human players.
Google DeepMind offered a $1.5 million winner's prize for the three-game match between Ke Jie and AlphaGo, while the losing side received $300,000. AlphaGo won all three games against Ke Jie. AlphaGo was awarded professional 9-dan by the Chinese Weiqi Association.
After winning its three-game match against Ke Jie, the world's top-rated Go player, AlphaGo retired. DeepMind also disbanded the team that had worked on the game in order to focus on AI research in other areas. After the Summit, DeepMind published 50 full-length AlphaGo vs. AlphaGo games as a gift to the Go community.
AlphaGo Zero and AlphaZero
The AlphaGo team published an article in the journal Nature on October 19, 2017, introducing AlphaGo Zero, a version trained without human data and stronger than any previous human-champion-defeating version. By playing games against itself, AlphaGo Zero surpassed the strength of AlphaGo Lee in three days, winning 100 games to 0, reached the level of AlphaGo Master in 21 days, and exceeded all earlier versions in 40 days.
In a paper released on arXiv on December 5, 2017, DeepMind claimed that it had generalized AlphaGo Zero's approach into a single AlphaZero algorithm, which achieved a superhuman level of play in chess, shogi, and Go within 24 hours by defeating the world-champion programs Stockfish, Elmo, and a 3-day version of AlphaGo Zero, respectively.
Teaching tool
On December 11, 2017, DeepMind released the AlphaGo teaching tool on its website to analyze the winning rates of different Go openings as calculated by AlphaGo Master. The teaching tool collects 6,000 Go openings from 230,000 human games, each analyzed with 10,000,000 simulations by AlphaGo Master. Many of the openings include human move suggestions.
Versions
An early version of AlphaGo was tested on hardware with varying numbers of CPUs and GPUs, running in asynchronous or distributed mode. Two seconds of thinking time was given for each move, and an Elo rating was computed for each configuration. In games with more time per move, higher ratings are achieved.
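For readers unfamiliar with Elo ratings, the following minimal sketch shows the standard logistic Elo relation between a rating gap and an expected win rate. The function names are hypothetical and the code is only an illustration; it is not DeepMind's evaluation procedure. As an example, it converts the 77% win rate of the distributed version over the single-machine version, reported in the History section, into an approximate rating gap.

```python
import math


def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def rating_gap(win_rate: float) -> float:
    """Elo gap implied by a win rate strictly between 0 and 1 (inverse of expected_score)."""
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


# 77% is the win rate quoted in the article; everything else here is illustrative.
gap = rating_gap(0.77)
print(f"Implied Elo gap: {gap:.0f}")
```

Under this model a 77% win rate corresponds to a gap of roughly 210 rating points, which gives a sense of scale for differences between hardware configurations.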
In May 2016, Google unveiled its own proprietary "tensor processing unit" hardware, which it stated had already been deployed in multiple internal projects at Google, including the AlphaGo match against Lee Sedol.
At the Future of Go Summit in May 2017, DeepMind disclosed that the version of AlphaGo used at the Summit was AlphaGo Master, and revealed that it had measured the strength of different versions of the software. AlphaGo Lee, the version used against Lee, could give AlphaGo Fan, the version used in AlphaGo vs. Fan Hui, a three-stone handicap, and AlphaGo Master was a further three stones stronger.
Algorithm
As of 2016, AlphaGo's algorithm uses a combination of machine learning and tree search techniques, combined with extensive training, both from human and computer play. It uses Monte Carlo tree search, guided by a "value network" and a "policy network", both implemented using deep neural network technology. A limited amount of game-specific feature detection pre-processing (for example, to highlight whether a move matches a nakade pattern) is applied to the input before it is sent to the neural networks.
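To make the idea of a policy-guided tree search more concrete, here is a minimal sketch of one common selection rule for Monte Carlo tree search with move priors: each candidate move is scored by its average backed-up value plus an exploration bonus that grows with the policy prior and shrinks as the move is visited. The names `Edge`, `select_move`, `backup` and the constant `c_puct` are illustrative assumptions, and the priors and values are made up; the exact formula and constants used by AlphaGo are not given in this article.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass
class Edge:
    """Search statistics for one candidate move from a position."""
    prior: float               # policy-network probability (assumed to be given)
    visits: int = 0            # number of simulations that passed through this move
    total_value: float = 0.0   # sum of evaluations backed up through this move

    def mean_value(self) -> float:
        return self.total_value / self.visits if self.visits else 0.0


def select_move(edges: Dict[str, Edge], c_puct: float = 1.0) -> str:
    """Pick the move with the best mean value plus prior-weighted exploration bonus."""
    parent_visits = sum(edge.visits for edge in edges.values())

    def score(edge: Edge) -> float:
        exploration = c_puct * edge.prior * math.sqrt(parent_visits + 1) / (1 + edge.visits)
        return edge.mean_value() + exploration

    return max(edges, key=lambda move: score(edges[move]))


def backup(edge: Edge, value: float) -> None:
    """Record the outcome of one simulation (value in [-1, 1], e.g. from a value network)."""
    edge.visits += 1
    edge.total_value += value


# Hypothetical two-move position: the prior favours "a", but one bad evaluation
# of "a" shifts the search towards "b".
edges = {"a": Edge(prior=0.7), "b": Edge(prior=0.3)}
first = select_move(edges)
backup(edges[first], -1.0)
second = select_move(edges)
print(first, second)
```

In this toy run the search first tries "a" because of its higher prior, then switches to "b" after "a" is evaluated as losing. This is the sense in which the policy network proposes moves and the value estimates correct them.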
The system's neural networks were initially bootstrapped from human gameplay expertise. AlphaGo was first trained to mimic human play by attempting to match the moves of expert players from recorded historical games, using a database of around 30 million moves. Once it had reached a certain degree of proficiency, it was trained further by being set to play large numbers of games against other instances of itself, using reinforcement learning to improve its play. To avoid "disrespectfully" wasting its opponent's time, the program is specifically programmed to resign if its assessment of its win probability falls below a certain threshold; for the match against Lee, the resignation threshold was set to 20%.
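The resignation rule described above amounts to a simple threshold check. The sketch below is an illustration only: the function name and signature are hypothetical, and the only value taken from the article is the 20% threshold reported for the match against Lee.

```python
RESIGN_THRESHOLD = 0.20  # threshold quoted in the article for the Lee Sedol match


def should_resign(win_probability: float, threshold: float = RESIGN_THRESHOLD) -> bool:
    """Return True when the estimated win probability has fallen below the threshold."""
    if not 0.0 <= win_probability <= 1.0:
        raise ValueError("win_probability must be between 0 and 1")
    return win_probability < threshold


print(should_resign(0.15))  # estimate below the threshold
print(should_resign(0.55))  # estimate comfortably above the threshold
```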
Play style
Toby Manning, the match referee for AlphaGo vs. Fan Hui, has described the program's style as "conservative". AlphaGo's playing style strongly favors a greater probability of winning by fewer points over a lesser probability of winning by more points. Its strategy of maximizing its probability of winning is distinct from what human players tend to do, which is to maximize territorial gains, and this explains some of its odd-looking moves. It makes many opening moves that have never or seldom been made by humans, while avoiding many second-line opening moves that human players like to make. It likes to use shoulder hits, especially if the opponent is overconcentrated.
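The difference between maximizing win probability and maximizing margin can be shown with a toy comparison. The candidate moves and their numbers below are entirely hypothetical and serve only to show how the two objectives can disagree; they are not taken from any AlphaGo game.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    move: str
    win_probability: float   # estimated chance of winning after this move
    expected_margin: float   # estimated final point margin after this move


def pick_by_win_probability(candidates: List[Candidate]) -> str:
    """Objective the article attributes to AlphaGo: maximize the chance of winning."""
    return max(candidates, key=lambda c: c.win_probability).move


def pick_by_margin(candidates: List[Candidate]) -> str:
    """Territory-oriented objective: maximize the expected point margin."""
    return max(candidates, key=lambda c: c.expected_margin).move


# Hypothetical position: a safe move that wins narrowly versus a risky move
# that wins big when it works.
candidates = [
    Candidate("solid", win_probability=0.90, expected_margin=1.5),
    Candidate("greedy", win_probability=0.70, expected_margin=12.0),
]
print(pick_by_win_probability(candidates), pick_by_margin(candidates))
```

A player optimizing win probability picks the solid move even though it gives up points, which is consistent with the "conservative" style described above.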
Responses to 2016 victory
AI community
AlphaGo's March 2016 victory was a major milestone in artificial intelligence research. Go had previously been regarded as a hard problem in machine learning that was expected to be out of reach for the technology of the time. Most experts thought a Go program as powerful as AlphaGo was at least five years away; some experts thought that it would take at least another decade before computers would beat Go champions. Most observers at the beginning of the 2016 match expected Lee to beat AlphaGo.
With games such as checkers (which has been "solved" by the Chinook draughts player team), chess, and now Go won by computers, victories at popular board games can no longer serve as major milestones for artificial intelligence in the way that they used to. Deep Blue's Murray Campbell called AlphaGo's victory "the end of an era... board games are more or less done and it's time to move on."
When compared with Deep Blue or with Watson, AlphaGo's underlying algorithms are potentially more general-purpose, and may be evidence that the scientific community is making progress towards artificial general intelligence. Some commentators believe AlphaGo's victory makes for a good opportunity for society to start discussing preparations for the possible future impact of machines with general-purpose intelligence. (As noted by entrepreneur Guy Suter, AlphaGo itself only knows how to play Go and does not possess general-purpose intelligence: "[It] couldn't just wake up one morning and decide it wants to learn how to use firearms".) In March 2016, AI researcher Stuart Russell stated that "AI methods are progressing much faster than expected, (which) makes the question of the long-term outcome more urgent," adding that "in order to ensure that increasingly powerful AI systems remain completely under human control... there is a lot of work to do." Some scholars, such as Stephen Hawking, warned (in May 2015, before the matches) that some future self-improving AI could gain actual general intelligence, leading to an unexpected AI takeover; other scholars disagree. AI expert Jean-Gabriel Ganascia believes that "Things like 'common sense'... may never be reproducible", and says "I don't see why we would speak about fears. On the contrary, this raises hopes in many domains such as health and space exploration." Computer scientist Richard Sutton said, "I don't think people should be scared... but I do think people should be paying attention."
In China, AlphaGo was a "Sputnik moment" that helped convince the Chinese government to prioritize and dramatically increase funding for artificial intelligence.
In 2017, the DeepMind AlphaGo team received the inaugural IJCAI Marvin Minsky Medal for Outstanding Achievements in AI. "AlphaGo is a wonderful achievement, and a perfect example of what the Minsky Medal was initiated to recognise", said Professor Michael Wooldridge, Chair of the IJCAI Awards Committee. "What particularly impressed IJCAI was that AlphaGo achieves what it does through a brilliant combination of classic AI techniques as well as the state-of-the-art machine learning techniques that DeepMind is so closely associated with. It's a breathtaking demonstration of contemporary AI, and we are delighted to be able to recognise it with this award."
Go community
Go is a popular game in China, Japan and Korea, and the 2016 matches were watched by perhaps a hundred million people worldwide. Many top Go players characterized AlphaGo's unorthodox plays as seemingly questionable moves that initially befuddled onlookers but made sense in hindsight: "All but the very best Go players craft their style by imitating top players. AlphaGo seems to have totally original moves it creates itself." AlphaGo appeared to have unexpectedly become much stronger, even when compared with its October 2015 match, in which a computer had beaten a Go professional for the first time without the advantage of a handicap. The day after Lee's first defeat, Jeong Ahram, the lead Go correspondent for one of South Korea's biggest daily newspapers, said "Last night was very gloomy... Many people drank alcohol." The Korea Baduk Association, the organization that oversees Go professionals in South Korea, awarded AlphaGo an honorary 9-dan title for exhibiting creative skills and pushing forward the game's progress.
China's Ke Jie, an 18-year-old generally recognized as the world's best Go player at the time, initially claimed that he would be able to beat AlphaGo, but declined to play against it for fear that it would "copy my style". As the matches progressed, Ke Jie went back and forth, stating that "it is highly likely that I (could) lose" after analyzing the first three games, but regaining confidence after AlphaGo displayed flaws in the fourth game.
Toby Manning, the referee of AlphaGo's match against Fan Hui, and Hajin Lee, secretary general of the International Go Federation, both reason that in the future, Go players will get help from computers to learn what they have done wrong in games and to improve their skills.
After game two, Lee said he felt "speechless": "From the very beginning of the match, I could never manage an upper hand for one single move. It was AlphaGo's total victory." Lee apologized for his losses, stating after game three that "I misjudged the capabilities of AlphaGo and felt powerless." He emphasized that the defeat was "Lee Se-dol's defeat" and "not a defeat of mankind". Lee said his eventual loss to a machine was "inevitable" but stated that "robots will never understand the beauty of the game the same way that we humans do." Lee called his game four victory a "priceless win that I (would) not exchange for anything."
Similar systems
Facebook has also been working on its own Go-playing system, darkforest, likewise based on combining machine learning and Monte Carlo tree search. Although a strong player against other computer Go programs, as of early 2016 it had not yet defeated a professional human player. Darkforest has lost to CrazyStone and Zen and is estimated to be of similar strength to them.
DeepZenGo, a system developed with support from the video-sharing website Dwango and the University of Tokyo, lost 2-1 in November 2016 to Go master Cho Chikun, who holds the record for the largest number of Go title wins in Japan.
A 2018 paper in Nature cited AlphaGo's approach as the basis for a new means of computing potential pharmaceutical drug molecules.
Example game
AlphaGo Master (white) v. Tang Weixing (December 31, 2016); AlphaGo won by resignation. White 36 was widely praised.
Impact on Go
The documentary film AlphaGo raised hopes that Lee Sedol and Fan Hui would benefit from their experience of playing AlphaGo, but as of May 2018 their ratings had changed little; Lee Sedol was ranked 11th in the world, and Fan Hui 545th. However, the Go community as a whole may have improved in how it plays the game.
See also
- Chinook (draughts player), draughts-playing program
- List of artificial intelligence
- Go and mathematics
- TD-Gammon, backgammon neural network
External links
- Official website
- AlphaGo wiki in the Sensei Library, including links to AlphaGo games
- AlphaGo page, with archives and games
Source of the article : Wikipedia