My go-to for embarrassingly parallel problems that are still single-computer scale, using the multiprocessing backend. Typical case: computing features across a bunch of inputs. Does the job in a simple and pain-free manner. I have a small wrapper for progress indicators using tqdm.
PS: The synchronous backend switch is great for getting better backtraces, like in a debugger.
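For anyone wanting to see the general pattern, here is a rough sketch using only the standard library's `concurrent.futures` rather than the library discussed here, so none of it is that library's actual API. The `parallel_map` helper and its `synchronous` flag are hypothetical names I made up for illustration, and tqdm is treated as optional.

```python
from concurrent.futures import ProcessPoolExecutor

try:
    from tqdm import tqdm  # optional progress bar
except ImportError:
    def tqdm(iterable, **kwargs):
        # Fallback when tqdm is not installed: no progress bar,
        # just hand the iterable back unchanged.
        return iterable


def parallel_map(func, inputs, workers=4, synchronous=False):
    """Apply func to every input (hypothetical helper, for illustration).

    With synchronous=True everything runs in the calling process.
    """
    inputs = list(inputs)
    if synchronous:
        # In-process execution: exceptions produce ordinary tracebacks
        # and debugger breakpoints inside func are hit.
        return [func(x) for x in tqdm(inputs, total=len(inputs))]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order; func must be picklable,
        # so it should be defined at module top level.
        return list(tqdm(pool.map(func, inputs), total=len(inputs)))
```

Typical use would be something like `parallel_map(compute_features, paths)` for the real run, then flipping to `synchronous=True` when a worker fails and you want a readable traceback. On platforms that spawn worker processes, the parallel call should sit under an `if __name__ == "__main__":` guard.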