Take, for example, an 8-bit (byte) store. Without BWX, the sequence would be something like:

    bic   a0, #3, t1
    and   a0, #3, t4
    ldl   t2, (t1)
    insbl a1, t4, t3
    mskbl t2, t4, t2
    bis   t2, t3, t2
    stl   t2, (t1)
    ret   zero, (ra)

But with BWX, it becomes:

    stb   a1, 0(a0)
    ret   zero, (ra)
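
To make the data flow of the non-BWX sequence easier to follow, here is a rough C model of the same read-modify-write. It is only a sketch under stated assumptions: memory is modelled as an array of 32-bit longwords with little-endian byte lanes, and the function name `store_byte_rmw` and its parameters are hypothetical, chosen for illustration rather than taken from any real codebase.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a byte store without BWX.
 * Assumption: `mem` is an array of 32-bit longwords and byte lanes are
 * numbered little-endian within each longword, as on Alpha. */
static void store_byte_rmw(uint32_t *mem, size_t addr, uint8_t value)
{
    size_t   word  = addr >> 2;                 /* bic:   aligned longword   */
    unsigned shift = (unsigned)(addr & 3) * 8;  /* and:   byte offset * 8    */
    uint32_t old   = mem[word];                 /* ldl:   load the longword  */
    uint32_t ins   = (uint32_t)value << shift;  /* insbl: byte moved to lane */
    uint32_t kept  = old & ~((uint32_t)0xFF << shift); /* mskbl: clear lane  */

    mem[word] = kept | ins;                     /* bis + stl: merge, store   */
}
```

The load, merge, and store are three separate steps, so the neighbouring bytes in the same longword are read and rewritten along with the target byte. The single `stb` avoids that round trip.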