The recipe example, or really any text-only LLM use case, seems like a poor way of highlighting "inference at the edge," since the extra few hundred ms of round trip won't matter.
The better use case is obviously a voice assistant at the edge: voice to text, text to search/GPT, then a voice-generated response. That's where milliseconds matter, but it's also a high-abuse angle no one wants to associate with just yet. My guess is they're going to cover this in another post, and if so they should build their own Perplexity-style online GPT. For now they just wanted to see what else people can think up by keeping the introduction boring.