I don't think it can be hugely random, though; both games 2 and 4 had very similar openings, the first 10 or 11 moves if I recall correctly. AlphaGo only changed her responses in game 4 after Lee Sedol changed one of his. Maybe the pattern matching just naturally led to it?
Obviously moves that AlphaGo really likes are much more likely to come out of the tree search, but it's not guaranteed. If you had AlphaGo play the first move of a very large number of Go games (billions? hundreds of billions?), it would probably play those moves in most of them, but not all of them. Start enough games and you would eventually find at least one game for every possible opening move. And since the Go board is a symmetric square, even in 19x19 Go there are effectively only 55 distinct possibilities for the first move: the 361 intersections collapse into 55 classes under the board's eight rotations and reflections.
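If anyone wants to check that count, here's a quick Python sketch (mine, nothing official) that canonicalises each intersection under the eight board symmetries and counts the resulting classes:

# Count the distinct first moves on a 19x19 Go board, treating positions
# that are rotations or reflections of each other as the same move.

def images(x, y, n=19):
    # All eight dihedral symmetries of the point (x, y) on an n x n board.
    m = n - 1
    return [
        (x, y), (y, m - x), (m - x, m - y), (m - y, x),   # rotations
        (m - x, y), (x, m - y), (y, x), (m - y, m - x),   # reflections
    ]

canonical = {min(images(x, y)) for x in range(19) for y in range(19)}
print(len(canonical))  # -> 55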
Well, the random rollouts only contribute 50% of its leaf evaluations; the other 50% comes from its value network, which gives the same result every time. Moreover, the relative contribution of the randomness diminishes as more and more simulations are run.
As such, I guess it's possible that biases in the value network could win out over the random component and cause AlphaGo to behave pretty much deterministically in some situations. However, as Scott suggests, it's probably more likely that for most "good" moves in any given situation there is at least some slight chance AlphaGo will play them.
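As a toy illustration of why the randomness washes out (my own sketch, with made-up stand-ins for the two evaluators, not DeepMind's code): the paper's leaf value is V(s) = (1 - lambda) * v_theta(s) + lambda * z with lambda = 0.5, and averaging over n simulations shrinks the rollout noise like 1/sqrt(n), so the fixed value-network term increasingly dominates.

import random

LAMBDA = 0.5  # the 50/50 mixing weight mentioned above

def value_network(position):
    # Stand-in for the value network: deterministic, same score every call.
    return 0.3

def rollout(position):
    # Stand-in for one random playout: a noisy +1 (win) / -1 (loss) outcome.
    return random.choice([1.0, -1.0])

def evaluate_leaf(position):
    # AlphaGo-style leaf evaluation: half deterministic, half random.
    return (1 - LAMBDA) * value_network(position) + LAMBDA * rollout(position)

def node_value(position, n_sims):
    # Averaging many simulations: rollout noise shrinks like 1/sqrt(n_sims).
    return sum(evaluate_leaf(position) for _ in range(n_sims)) / n_sims

print(node_value("empty board", 1))      # noisy
print(node_value("empty board", 10000))  # close to 0.15 on every run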
An AI wrote a novel that made it past the first round of a literary competition.
I object a little to this one: it didn't so much write the novel as assemble pre-made blocks within certain parameters to get a result. A nitpicky detail, sure, but I'd call it co-written at best. Maybe edited; not sure.
Reverse nitpick: this is just writing, with a different atom size.
Many real-world applications can be described as large-scale games of imperfect information. To deal with these challenging domains, prior work has focused on computing Nash equilibria in a handcrafted abstraction of the domain. In this paper we introduce the first scalable end-to-end approach to learning approximate Nash equilibria without any prior knowledge. Our method combines fictitious self-play with deep reinforcement learning. When applied to Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium, whereas common reinforcement learning methods diverged. In Limit Texas Holdem, a poker game of real-world scale, NFSP learnt a competitive strategy that approached the performance of human experts and state-of-the-art methods.
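For anyone curious how the two learners fit together, here's a rough Python skeleton of the control flow as I read it from the paper: each agent mixes an epsilon-greedy best response (reinforcement learning) with its own average policy (supervised learning), choosing between them with a small "anticipatory" probability eta. The eta = 0.1 value and the two memories are from the paper; everything else, including the dict-based placeholder "networks" and all names, is an illustrative stand-in, not the authors' code.

import random

ETA = 0.1            # anticipatory parameter: how often to play the best response
ACTIONS = [0, 1, 2]  # toy action set, e.g. fold / call / raise

class NFSPAgent:
    # Skeleton of one NFSP agent: a reinforcement-learned best response
    # plus a supervised average policy.

    def __init__(self):
        self.q = {}          # stand-in for the action-value (DQN) network
        self.rl_memory = []  # circular buffer of transitions for Q-learning
        self.sl_memory = []  # buffer of (state, action) pairs; a real
                             # implementation uses reservoir sampling here

    def act(self, state, epsilon=0.1):
        if random.random() < ETA:
            # Best-response mode: epsilon-greedy over current Q-values.
            if random.random() < epsilon:
                action = random.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: self.q.get((state, a), 0.0))
            # Only best-response play becomes a target for the average policy.
            self.sl_memory.append((state, action))
        else:
            # Average-policy mode: imitate our own past best responses.
            past = [a for (s, a) in self.sl_memory if s == state]
            action = random.choice(past) if past else random.choice(ACTIONS)
        return action

    def observe(self, state, action, reward, next_state):
        # Store the transition; a full agent would also run a DQN update here
        # and a cross-entropy update of the average policy from sl_memory.
        self.rl_memory.append((state, action, reward, next_state))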
AI learns from Twitter.
AI is a racist Gamergater Trump supporter now.
http://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist
http://www.digitaltrends.com/cool-tech/japanese-ai-writes-novel-passes-first-round-nationanl-literary-prize/
http://arxiv.org/abs/1603.01121