This forum is in permanent archive mode. Our new active community can be found here.

AlphaGo

13 Comments

  • edited March 2016
    Andrew said:

    What's holding us back is that all these very smart people don't really, truly understand deep, multilayer convolution networks. Sure, we are really good at constructing the scaffolding for this machine that can perform very specific tasks given enough (and the right) training data. However, we don't know anything about the underlying structure of this machine once it's been trained. It's essentially a black box.

    Correct. That's a major reason why I'm not particularly fond of deep convolutional neural networks; I tend to prefer things like explicit Bayesian models where the model is directly accessible and not hidden away in a black box. That said, I'm currently applying to work at DeepMind, so it's not like I'm unwilling to work with them.

    Overall, I find it a little disappointing that AlphaGo is all it took to beat the human world champion at Go. I would have hoped that the task would have required significant theoretical advancements in both deep learning and search---all AlphaGo does is present a novel way of combining the two.
    Post edited by lackofcheese on
  • *fistbump for Bayesian bro*
  • Andrew said:

    *fistbump for Bayesian bro*

  • edited March 2016
    Apreche said:

    Here's what's even more interesting. Does AlphaGo "think" when it's not its turn? Or does it just idle until a move is made by the opponent? If the human opponent takes longer on their turn, and AlphaGo is thinking during that time, that is really bad for the human. It can explore an extraordinary number of branches of play in the time it takes a human to explore one. If it thinks during its opponent's turn, the way to beat it might be to play very fast and loose, and to play with a very short clock. 5 second turns or something.

    Yes, according to the Nature paper AlphaGo does indeed think during the opponent's turn.

    However, as is often the case with search algorithms, searching during the opponent's turn is significantly less efficient than searching after it, because you have to split your search effort between different possibilities for the opponent's next move.
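    As a toy illustration (all numbers made up, nothing from the paper), here's why pondering pays off less than searching on your own clock:

```python
# Toy arithmetic with made-up numbers: pondering during the opponent's
# turn splits search effort across their plausible replies, so only
# about 1/k of it lands on the branch that actually gets played.
nodes_per_second = 100_000   # assumed search speed
ponder_seconds = 60          # time spent thinking on the opponent's turn
plausible_replies = 5        # opponent moves we have to hedge across

total_nodes = nodes_per_second * ponder_seconds
useful_nodes = total_nodes // plausible_replies  # expected useful share
print(total_nodes, useful_nodes)  # 6000000 1200000
```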
    Apreche said:

    The other question is whether or not AlphaGo ever takes a shortcut. Does it always fully and completely calculate its best move, or does it ever estimate a best move from available choices due to time constraints? If it ever does the latter, then allowing it to process during the human's turn will allow it to make even better decisions.

    In fact, AlphaGo *never* fully and completely calculates its best move (except perhaps at the very end of the game); every single move AlphaGo makes is an estimate limited by time constraints.
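    To make "estimate limited by time constraints" concrete, here's a minimal anytime-selection sketch in Python; the move list and evaluation function are hypothetical stand-ins, not AlphaGo's actual search:

```python
import random
import time

def anytime_best_move(moves, evaluate, time_budget_s=1.0):
    """Refine noisy move-value estimates until the clock runs out, then
    return the current best guess. There is no 'complete' answer;
    stopping later just gives a better estimate."""
    totals = {m: evaluate(m) for m in moves}  # one initial sample each
    visits = {m: 1 for m in moves}
    deadline = time.monotonic() + time_budget_s
    while time.monotonic() < deadline:
        m = random.choice(moves)              # pick a move to re-sample
        totals[m] += evaluate(m)
        visits[m] += 1
    return max(moves, key=lambda m: totals[m] / visits[m])
```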
    Post edited by lackofcheese on
  • Rym said:

    I suspect that AlphaGo's early game is the key. It's already doing things that appear to be more effective there than what humans have uncovered in over a thousand years.

    zehaeva said:

    Early game is the key; if you don't set up good formations at the start, you'll find yourself behind in ways that the endgame simply cannot make up for. This matters even more at the pro level: in their endgame the path is clear, and it's fairly impossible to make mistakes. The only really open points in the endgame are during ko fights, and that's more because the point estimation of a ko can be a bit fuzzy.

    I haven't looked at the details of these games yet, but this would make sense to me.

    Incidentally, it's notable that the very early game is a major contrast between AlphaGo and computer chess engines, which mostly use preset "opening books" for their first few moves. AlphaGo, on the other hand, doesn't treat the opening any differently from the rest of the game.
  • Incidentally, it's notable that the very early game is a major contrast between AlphaGo and computer chess engines, which mostly use preset "opening books" for their first few moves. AlphaGo, on the other hand, doesn't treat the opening any differently from the rest of the game.

    Some of the biggest surprises we've seen from AlphaGo have been in the opening and early midgame: she's played moves that are simply thought of as fairly insane amateur mistakes (I'm thinking specifically of the tenuki in game 2, around move 11), and yet they worked brilliantly! I seriously think that just these few games will change a bit of current opening theory in go.
  • I agree with your concerns about "black box" AI. But what I'm utterly fascinated by here is that a "black box" is playing a game that even experts believe requires native human intuition. We're right on the edge of beating a "Go Turing Test" as it were.

    But at the same time, while Go is complex for its massive play space, its rules are extremely simple. The parameters of interaction are extremely limited. With that constraint, this sort of approach is much more effective.

    I imagine a similar system trying to learn a game with a wide interaction space (e.g., if timing mattered, if piece placement were not restricted to discrete grid locations, if there were many types of placement in addition to locations, etc...) would flail forever, never being "data-efficient" enough to do better than chance or human.
  • AlphaGo wasn't taught any rules, nor given any guidance about Go at the start. For a more complex game, it wouldn't be too difficult to code some of the hard rules from the start.

    With enough data about previous games played between players, I'm sure an AI would be able to work out optimal timing just as it can learn about optimal placement.
  • I think we're pretty far away from a computer that can play Netrunner.
  • Five to ten years away ;)
  • You want to know when the mainstream will freak the fuck out about game-winning AIs?

    When AIs beat humans at no limit Texas Holdem.


    I suspect an AI could beat humans given time just from their own play data. But it would be extra cool if one of the data inputs was a camera looking at all the other players.
  • Andrew said:

    What's holding us back is that all these very smart people don't really, truly understand deep, multilayer convolution networks. Sure, we are really good at constructing the scaffolding for this machine that can perform very specific tasks given enough (and the right) training data. However, we don't know anything about the underlying structure of this machine once it's been trained. It's essentially a black box.

    From the outside looking in, this seems like the Polish Hand Magic problem. It's just math, right? A highly complex system, sure, but still just math.

    Or what am I misunderestimating here?
  • There is math that is easy to solve, or at least predictable to solve. There is other math that is actually very, provably difficult to solve.

    The combinatorial math of Go is just math. Go ahead: just solve it. It's just math. Either black wins, white wins, or it's a guaranteed draw. Just one big old equation to solve.
  • Sure, but I'm not talking about computational intractability. Why do people say neural nets are a black box? What don't we fully grok about them?
  • Starfox said:

    Sure, but I'm not talking about computational intractability. Why do people say neural nets are a black box? What don't we fully grok about them?

    How to beat them at Go, apparently.
  • Starfox said:

    Sure, but I'm not talking about computational intractability. Why do people say neural nets are a black box? What don't we fully grok about them?

    Normally when you write an algorithm for a game, you write code that actually makes decisions relative to the game. For Counter-Strike you might write code that looks like this:

    Check line of sight.
    If there are bad guys:
        shoot them in the head.
    If at objective:
        perform objective.
    Else:
        move towards objective.


    When you use machine learning, the logic is in the data. The data is so enormous that humans can't understand what it's doing. We understand how the machine learning framework works, because that's the code they wrote. They wrote very little code about the actual game of Go.

    Look at this game of Go.
    Adjust your enormous mathematical model according to what you see in that game.
    Repeat a hojillion times.

    Ok, here is a game of Go that isn't finished yet. Combine this with your training data, and output the next move that has the highest chance of winning.


    Nobody coded the logic. The logic is in the model created by all that training data. It's so enormous a human brain can't hold it.
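    To make "the logic is in the data" concrete, here's a made-up toy in Python (nothing like AlphaGo's real networks): no game rule is ever written down, yet after training the three numbers in `weights` are what make the decisions.

```python
import random

random.seed(0)

def score(weights, features):
    """A learned scoring function: just a weighted sum, no game logic."""
    return sum(w * f for w, f in zip(weights, features))

# Made-up training data: (move features, did this move win?)
data = [([1, 0, 1], 1), ([0, 1, 0], 0), ([1, 1, 1], 1), ([0, 0, 1], 0)]

weights = [0.0, 0.0, 0.0]
for _ in range(1000):                       # "adjust the model, repeat"
    feats, label = random.choice(data)
    pred = 1 if score(weights, feats) > 0.5 else 0
    for i, f in enumerate(feats):           # perceptron-style update
        weights[i] += 0.01 * (label - pred) * f

# The "logic" is now just these three floats; nobody wrote an if/else
# for it, and reading them doesn't tell you *why* a move scores well.
```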
  • What I'm looking forward to is when they hook a machine learning system up to tumblr and instruct it to create the most follow-able blog.

    I think that may have already happened.
  • What I'm looking forward to is when they hook a machine learning system up to tumblr and instruct it to create the most follow-able blog.

    I think that may have already happened.

    That's easy. Even easier to do the most-retweeted/liked tweet.
  • To continue a conversation here, rather than in the podcast thread:

    Of course, if it keeps improving and playing itself, it will get better, so in the future no human player will ever beat it.

    What I'd love to see is a fork of AlphaGo, so THIS WEEK'S AlphaGo can remain unmodified for years to come. That way other players will get a chance to match wits with AlphaGo at the same level as Lee Sedol. If AlphaGo Fork can only learn from playing human players from now on, it will become a great baseline for other players both human and machine.
  • To continue a conversation here, rather than in the podcast thread:

    Of course, if it keeps improving and playing itself, it will get better, so in the future no human player will ever beat it.

    What I'd love to see is a fork of AlphaGo, so THIS WEEK'S AlphaGo can remain unmodified for years to come. That way other players will get a chance to match wits with AlphaGo at the same level as Lee Sedol. If AlphaGo Fork can only learn from playing human players from now on, it will become a great baseline for other players both human and machine.

    Certainly the AlphaGo developers have their source code in version control. So getting the source code from that exact version will be quite easy.

    The problem is the data. People usually don't version control their data. AlphaGo's data is probably quite large, and difficult to distribute unless you want some spinning disks mailed to you. It also takes quite a lot of computers to run it, and I don't foresee even top-level Go players spending thousands to spin up an entire cloud infrastructure to run this thing.

    Maybe way down the line computers will be more powerful, and AlphaGo will be something you can run on your phone. By then you'll be able to set it at a difficulty level effectively equal to what was used last week.
  • Yeah, I get that it won't happen, but I'd like it if it did. Dialing down the difficulty level just isn't the same as playing against the same machine, with the data and learning all frozen at this historic point of time.
  • Of course, if it keeps improving and playing itself, it will get better, so in the future no human player will ever beat it.

    This isn't exactly true. The way AlphaGo learns means that it's easy for it to forget or unlearn strategies and moves as it plays more games. Essentially it weights previously high-scoring moves less as it trains on a certain data type or game style. It's totally possible for AlphaGo to play for another whole year and actually perform worse against humans.

  • edited March 2016

    Yeah, I get that it won't happen, but I'd like it if it did. Dialing down the difficulty level just isn't the same as playing against the same machine, with the data and learning all frozen at this historic point of time.

    No, there really isn't a good reason why it shouldn't happen.
    Apreche said:

    The problem is the data. People usually don't version control their data. AlphaGo's data is probably quite large, and difficult to distribute unless you want some spinning disks mailed to you. It also takes quite a lot of computers to run it, and I don't foresee even top-level Go players spending thousands to spin up an entire cloud infrastructure to run this thing.

    Actually, the underlying neural networks are really not that large; judging by the Nature paper it's 13 or 14 layers with only a couple of thousand weights to store per layer. You could easily fit all that into a couple of megabytes or so of storage.

    As for the way it's run, most of it is many parallel GPUs with copies of the value and/or policy network, and then many parallel CPUs to run MCTS rollouts. Yes, you need that parallelism to run it at its level of play against Lee Sedol, but you could always just run it on a less powerful hardware setup for a longer period of time to match that level.
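    A quick back-of-envelope check of that storage claim, using the rough figures above (not AlphaGo's actual published parameter counts):

```python
# Rough storage estimate from the figures quoted above; the real
# networks' parameter counts may well differ.
layers = 13
weights_per_layer = 2_000   # "a couple of thousand" per layer
bytes_per_weight = 4        # one 32-bit float each

total_bytes = layers * weights_per_layer * bytes_per_weight
megabytes = total_bytes / 1_000_000
print(megabytes)  # about 0.1 MB, comfortably within "a couple of megabytes"
```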
    Post edited by lackofcheese on
  • If you were playing the exact version that Lee Sedol beat in game 4, would AlphaGo make the exact same moves as long as the player continued to copy Lee Sedol's moves?

    I'd assume it would but this is new stuff.
  • Hopefully not! It wouldn't make a very good AI if you could just memorize ~80 moves for a guaranteed win.

    I'd assume it's (pseudo)random.
  • The watt for watt argument has been floating around the go community ever since DeepMind first published their results back in January. I've heard it mostly from people who want to maintain a sense of superiority over computers, moving the goal posts in a sense.

    I know this goes towards the "is this even a fair fight" question, but I'm not entirely convinced it's a valid one. If it is then players in the NBA should be mandated to all be the same height and body weight so the teams are "equal". Then again, we do try to maintain weight differences in boxing.
  • The watt for watt argument I think is no good as an ego shield, for sure. Even the analogy he uses about an F1 car. It's like yeah, cars are less efficient than bicycles. And as much as we hate cars now, the relatively inefficient combustion engine changed the world with its power. And the jet engine changed it again. And the rocket changed it again. Extreme and raw power that is under control is how we get shit done. We just haven't been able to control that power in such a way as to play Go well, until now.

    That being said, it does say something about the human brain and the biological computer. It is ridiculously more efficient than transistor-based computing. We gotta keep working on that, because we are burning up a lot of energy on these data centers.
  • Starfox said:

    Hopefully not! It wouldn't make a very good AI if you could just memorize ~80 moves for a guaranteed win.

    I'd assume it's (pseudo)random.

    Well, I believe the only (pseudo)random component in AlphaGo is the random move selection in the rollout policy for Monte Carlo Tree Search. As such, it is indeed likely that it would play differently every time it's run.
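    For illustration, that randomized step looks roughly like this generic sketch (the game-interface functions are hypothetical stand-ins, and AlphaGo's real rollout policy samples from a trained network rather than uniformly):

```python
import random

def rollout(state, legal_moves, apply_move, is_terminal, result):
    """Play (pseudo)randomly sampled moves to the end of the game and
    return the outcome -- the randomized part of an MCTS rollout."""
    while not is_terminal(state):
        move = random.choice(legal_moves(state))  # the (pseudo)random step
        state = apply_move(state, move)
    return result(state)
```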

  • I don't think it can be hugely random, though; both games 2 and 4 had very, very similar openings, the first 10 or 11 moves if I recall correctly. AlphaGo only changed her responses in game 4 after Lee Sedol changed one of his. Maybe the pattern matching just naturally led to it?