Hacker News

> AlphaGo isn’t a pure neural net at all — it’s a hybrid, melding deep reinforcement learning with one of the foundational techniques of classical AI — tree-search

Most board game computer players use some sort of tree search followed by evaluation at the leaves of the tree. What we discovered in the 70s is that you don't need to have human-level evaluation to win at chess; it is enough to count material and piece activity, plus some heuristics (pawn structure, king safety...); computers more than compensate for this weakness with their superhuman tree exploration.
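To make the classical recipe concrete, here is a minimal sketch of negamax-style alpha-beta search with a crude material count at the leaves. The board interface (pieces, legal_moves, apply, is_terminal) and the piece values are illustrative assumptions, not taken from any real engine:

```python
# Minimal sketch (not a real engine): tree search with a crude
# material evaluation at the leaves. The board interface used here
# (pieces, legal_moves, apply, is_terminal) is a hypothetical stand-in.

PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def evaluate(board):
    """Count material from the side-to-move's perspective."""
    score = 0
    for piece, mine in board.pieces():  # (symbol, belongs-to-mover) pairs
        value = PIECE_VALUES.get(piece.upper(), 0)
        score += value if mine else -value
    return score

def alphabeta(board, depth, alpha=float("-inf"), beta=float("inf")):
    """Negamax alpha-beta: superhuman exploration, shallow evaluation."""
    if depth == 0 or board.is_terminal():
        return evaluate(board)
    for move in board.legal_moves():
        score = -alphabeta(board.apply(move), depth - 1, -beta, -alpha)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # prune: the opponent will never allow this line
    return alpha
```

The point of the sketch is the division of labor: the evaluation is laughably shallow, and the search depth does the rest.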

This approach never worked so well for Go because evaluation was a mystery: which group is weak or strong? how much territory will their power yield? These are questions that professionals answer intuitively according to their experience. With so many parts of the board that depend on each other, we don't know how to solve the equation.

It looks like AlphaGo is the first one to get this evaluation right. At the end of the game, its groups are still alive and they control more territory. So Go evaluation is yet another task that used to be reserved for human experts and that computers now master. The fact that this is mixed with classical tree search does not make it less impressive.



I agree that the main strength of AlphaGo seems to be evaluation, using supervised learning + reinforcement learning.

What I found interesting about AlphaGo's final algorithm is that there are so many different methods being used at once:

0. there's the Monte Carlo tree search. while this is definitely a "classic" tree search, this particular tree search algorithm is a fairly recent development, and relies heavily on statistics, which is perhaps somewhat less classical
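For readers unfamiliar with it, one iteration of UCT-style MCTS can be sketched roughly as below. The state interface is a hypothetical assumption, rollout results are scored from a single fixed perspective for brevity, and real Go programs add many refinements on top of this skeleton:

```python
import math
import random

class Node:
    """One node of the search tree, with win/visit statistics."""
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}  # move -> Node
        self.visits, self.wins = 0, 0.0

def ucb1(child, parent_visits, c=1.4):
    """Balance exploitation (win rate) against exploration."""
    if child.visits == 0:
        return float("inf")
    return (child.wins / child.visits
            + c * math.sqrt(math.log(parent_visits) / child.visits))

def mcts_iteration(root):
    # 1. Selection: walk down by UCB1 until a node with untried moves.
    node = root
    while node.children and not node.state.is_terminal():
        if len(node.children) < len(node.state.legal_moves()):
            break  # this node still has unexpanded moves
        node = max(node.children.values(),
                   key=lambda ch: ucb1(ch, node.visits))
    # 2. Expansion: add one previously untried child.
    if not node.state.is_terminal():
        untried = [m for m in node.state.legal_moves()
                   if m not in node.children]
        if untried:
            move = random.choice(untried)
            node.children[move] = Node(node.state.apply(move), parent=node)
            node = node.children[move]
    # 3. Simulation: play random moves to the end of the game.
    state = node.state
    while not state.is_terminal():
        state = state.apply(random.choice(state.legal_moves()))
    result = state.winner()  # e.g. 1.0 for a win, 0.0 for a loss
    # 4. Backpropagation: update statistics along the path to the root.
    while node is not None:
        node.visits += 1
        node.wins += result
        node = node.parent
```

The "statistics" I mean are steps 1 and 3: move choice is driven by accumulated win rates over random playouts rather than by a hand-written evaluation.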

1. the policy function approximation they use in the final algorithm, aka the policy network, is based on supervised learning + a deep network model. but it is NOT the other policy network in the paper that was further tuned using reinforcement learning - that one made the overall system perform worse!
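As a sketch of how such a policy network is typically consumed by the search (an illustrative assumption, not the paper's exact code): the network's raw per-move scores are turned into a probability distribution over the legal moves, which then biases which branches get explored.

```python
import math

def move_priors(logits, legal_moves):
    """Softmax over the network's per-move scores, restricted to the
    legal moves. `logits` stands in for the trained policy network's
    output and is a hypothetical placeholder here."""
    scores = [logits[m] for m in legal_moves]
    mx = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - mx) for s in scores]
    total = sum(exps)
    return {m: e / total for m, e in zip(legal_moves, exps)}
```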

2. the value function approximation they use in the final algorithm isn't just a network. it's a linear combination of a network and a rollout approximation using a much weaker, faster, simpler evaluation function trained on different features. they find the system performs best when each is given an equal weight.
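In formula form this mixed evaluation is V(s) = (1 − λ)·v(s) + λ·z, with λ = 0.5 giving the equal weighting mentioned above. A trivial sketch, where value_net and fast_rollout are stand-ins for the real components:

```python
def mixed_leaf_value(state, value_net, fast_rollout, lam=0.5):
    """Blend the value network's prediction with a fast rollout
    outcome; lam=0.5 gives the equal weighting described above."""
    v = value_net(state)     # learned evaluation, e.g. in [-1, 1]
    z = fast_rollout(state)  # result of one fast simulated game
    return (1 - lam) * v + lam * z
```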

3. from what I understand, the value network is trained (at huge computational cost, particularly in generating the data set required) to give similar accuracy to the value function one could define by using the reinforcement-learning policy network. the value network gives similar valuations but runs 1500x faster. in some sense this isn't terribly algorithmically interesting - it is just an implementation detail to give faster results at game-time at the cost of a ridiculous amount of offline computation.


Computer Go had actually advanced a long way by using Monte Carlo tree search in particular. The pre-AlphaGo programs that AlphaGo defeated were much stronger than computer Go programs from before the era of Monte Carlo tree search. Strong computer chess was not achieved instantly by applying generic "tree search" either; it required quite a bit of tweaking of the various algorithms that were applied.


The author of this post (Gary Marcus) is a huge proponent of hybrid systems, in fact he is using that technique for his current stealth startup: https://www.technologyreview.com/s/544606/can-this-man-make-...


Yep.

Not to be harsh, but Marcus has been critical of Neural Nets for a while now. His claims that there are issues around their provability are well made.

But.. there is a way to make people listen to you. It's called results. Deep Learning is getting them, in an increasing number of diverse fields.


> But.. there is a way to make people listen to you.

Hype, right?

Choosing problems for their theatrical effect, rather than utility. Writing articles and research papers as if they're marketing pamphlets. Claiming that incremental improvements are paradigm shifts. Treating arbitrary achievements as if they were commonly agreed upon milestones all along.

All of this is happening right now. Being skeptical in such an environment is the only right thing to do.

> Deep Learning is getting them, in an increasing number of diverse fields.

If you look for practical applications that give tangible benefits to people outside of academia, the achievements of applied deep learning so far aren't nearly as impressive as you make them out to be. This is despite insane levels of hype, huge investments in research, and the amounts of computing power available.

Heck, if anything, the fact that AlphaGo needs to use a tree search to prop up its ANN components could be seen as a sign that ANNs have some serious practical limitations when it comes to "results". Which is kind of the point of the article.


No doubt there is plenty of hype. From where I sit though, a lot of it is justified (Not the general intelligence stuff of course).

> Choosing problems for their theatrical effect, rather than utility. Writing articles and research papers as if they're marketing pamphlets. Claiming that incremental improvements are paradigm shifts. Treating arbitrary achievements as if they were commonly agreed upon milestones all along.

I'm not sure what to say to this.

There are no "commonly agreed upon milestones". The closest things are the academic benchmarks/shared tasks that you seem to be critical of.

I guess the closest thing you'll find to a "commonly agreed upon milestone" is something like the Winograd schema[1]? Based on progress like "Teaching Machines to Read and Comprehend"[2] I wouldn't be betting against deep learning on that.

> If you look for practical applications that give tangible benefits to people outside of academia, the achievement of applied deep learning so far aren't nearly as impressive as you make them out to be.

Could you explain what you were expecting? Deep learning techniques aren't exactly widespread yet, and outside Google and a few other companies it takes time for things to migrate into products and have tangible benefits.

Nevertheless, Google Search, Pinterest, Facebook image tagging, Android Voice Search, etc.. these are all used by billions of people daily. I think it's hard to argue there aren't at least some practical applications.

[1] https://en.wikipedia.org/wiki/Winograd_Schema_Challenge

[2] http://arxiv.org/abs/1506.03340


Results are important, but don't worship short-term results at the expense of everything else.

Deep learning (and other machine learning techniques that are forced to call themselves deep learning to get attention) are getting great results right now, yes. It's important to follow these results, to use them, and to try to understand them.

But when this approach hits a local maximum, do you want AI Winter #3, or do you want there to be another approach that people have been working on?



