I have a limited understanding of the topic, but would this allow running an LLM on a mobile phone in offline mode? If that's feasible, it would pave the way for lots of interesting applications, such as AI-assisted content moderation without having to phone home with confidential data.
Yes, this may improve that significantly. Even without it, you can already run LLMs on mobile phones; the question is just how big a model you can fit, how strongly it has to be quantized, and whether the few models that remain produce good enough results.
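To make the size/quantization trade-off concrete, here's a rough back-of-the-envelope sketch (the ~10% overhead factor is an assumption for KV cache and runtime buffers, not a measured figure):

```python
def model_size_gb(params_billions: float, bits_per_weight: int, overhead: float = 1.1) -> float:
    """Approximate RAM needed to hold the weights, with an assumed ~10%
    overhead for the KV cache, activations, and runtime buffers."""
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# A 7B-parameter model: ~15.4 GB at 16-bit, ~7.7 GB at 8-bit,
# but only ~3.9 GB at 4-bit, which fits on a phone with 8 GB of RAM.
for bits in (16, 8, 4):
    print(f"7B @ {bits}-bit: ~{model_size_gb(7, bits):.1f} GB")
```

That's why aggressive quantization (4-bit or lower) is what makes on-device inference plausible at all for mid-size models.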