The method he's describing requires no downtime or stalled requests. Connections are drained from some pool in a load balance set if servers. The services is restarted with the new code. The hosts are given traffic again once the are initialized and healthy.
The advantages to this method include being completely platform agnostic, as well as giving you a window to verify successful update without worrying about production traffic.
Thanks for the clarification. The downsides to that approach are that you need multiple machines, and the duration of your deploys is much longer. Not to mention, you'd have to script a deploy process across multiple machines (which is not easy, in the way that "SIGHUP Gunicorn" is easy).
Personally I've found the "put new instances into a load balancer" method to make more sense for system changes (packages, kernels, OS versions) where deploying the change is inherently slow or expensive, but the method doesn't make sense for code deploys where deploy time is important.
The advantages to this method include being completely platform agnostic, as well as giving you a window to verify successful update without worrying about production traffic.