
It does not, at least not explicitly, which is a distinguishing feature of policy optimization algorithms. I don't think anything about that makes it worse for non-deterministic problems. Note that the policy can still be stochastic if desired (I'm not sure that's a good idea in general). A nice feature of black-box policy optimization is that it's almost trivial to apply to either stochastic or deterministic problems.
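
To make the "almost trivial" part concrete, here is a minimal sketch under made-up assumptions, not any particular published algorithm: a toy 1-D system x' = x + a + noise with reward -x'^2, a deterministic linear policy a = theta * x, and plain hill climbing over theta. The names `rollout`, `evaluate`, `hill_climb`, and `noise_std` are all hypothetical. The optimizer only ever sees returns, so switching between the deterministic and stochastic problem comes down to one noise parameter and how many episodes are averaged.

```python
import random

def rollout(theta, noise_std, rng, steps=20):
    """Total reward of one episode under the linear policy a = theta * x."""
    x, total = 1.0, 0.0
    for _ in range(steps):
        a = theta * x                           # deterministic policy
        x = x + a + rng.gauss(0.0, noise_std)   # toy dynamics (assumed)
        total += -x * x                         # reward: stay near zero
    return total

def evaluate(theta, noise_std, rng, episodes):
    """Average return over several episodes; one episode suffices without noise."""
    return sum(rollout(theta, noise_std, rng) for _ in range(episodes)) / episodes

def hill_climb(noise_std, episodes, iters=200, step=0.1, seed=0):
    """Black-box search over theta: perturb, compare returns, keep the better one."""
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(iters):
        cand = theta + rng.gauss(0.0, step)
        # Re-evaluate the incumbent each time so one lucky noisy score cannot stick.
        if evaluate(cand, noise_std, rng, episodes) > evaluate(theta, noise_std, rng, episodes):
            theta = cand
    return theta

if __name__ == "__main__":
    print("deterministic:", hill_climb(noise_std=0.0, episodes=1))
    print("stochastic:   ", hill_climb(noise_std=0.1, episodes=20))
```

In this toy setup the best theta is -1 in both cases, and the search should land near it either way. Nothing in `hill_climb` models the dynamics or a value function; it just compares sampled returns.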

