The long multiplication algorithm has two nested inner loops, each reducing one of the inputs to digits (dividing and multiplying by fixed values), multiplying them with a lookup table, then performing add-with-carry against inner and outer accumulators.
The Russian Peasant Algorithm optimizes this for the binary case by noticing that the inner multiply is always by 0 or 1 so it's trivial and never carries, removing the inner loop entirely and allowing you to use only one accumulator.