
The recipe example, or really any LLM use case, seems like a very poor way of highlighting "inference at the edge", given that the extra few hundred ms of round trip won't matter.
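A back-of-envelope sketch of that point, with every number assumed purely for illustration: decode time for a recipe-length response dwarfs the round trip an edge deployment would save.

    # All numbers are assumptions for illustration, not measurements.
    rtt_saved_ms = 200      # assumed extra round trip to a far-away data center
    tokens = 300            # assumed length of a recipe-style response
    ms_per_token = 30       # assumed decode speed (~33 tokens/s)

    generation_ms = tokens * ms_per_token
    total_ms = generation_ms + rtt_saved_ms
    print(f"generation: {generation_ms} ms vs. round trip saved: {rtt_saved_ms} ms")
    print(f"moving to the edge cuts total latency by {rtt_saved_ms / total_ms:.1%}")

Under those assumptions, the edge saves roughly 2% of total latency, which is the commenter's point.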


The better use case is obviously a voice assistant at the edge: voice to text, to search/GPT, to a voice-generated response. That is where milliseconds matter, but it is also a high-abuse angle no one wants to be associated with just yet. My guess is they are going to do this in another post, and if so they should make their own Perplexity-style online GPT. For now they just wanted to see what else people can think up, by making the introduction of it boring.
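A minimal sketch of that pipeline. All three stages are stubs; transcribe, generate, and synthesize are hypothetical placeholders, not a real API. The point is that the stages are serialized, so any network hop between them is paid on every utterance:

    # Sketch only: each stage is a stub standing in for a real model/service.

    def transcribe(audio: bytes) -> str:
        # speech-to-text stage (a Whisper-class model in practice)
        return "what's the weather tomorrow"

    def generate(prompt: str) -> str:
        # search/GPT stage producing the response text
        return "Tomorrow looks sunny with a high of 22C."

    def synthesize(text: str) -> bytes:
        # text-to-speech stage
        return text.encode()

    def voice_assistant(audio_in: bytes) -> bytes:
        # Three serialized stages per utterance: colocating them at the
        # edge avoids paying a long round trip between every stage.
        return synthesize(generate(transcribe(audio_in)))

    print(voice_assistant(b"..."))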


There are three options for inference: 1) on-device inference, 2) inference "on the edge", 3) inference in a data center.

Given Fly is deployed in Equinix data centers just like everyone else, there fundamentally isn't much difference between #2 and #3.


This. I cannot think of a business case for running LLMs on the edge. Is this a Pets.com moment for the AI industry?



