
You can already do that on most desktop GPUs (even going as far back as previous-gen NVIDIA cards like the GTX 1050/1060/1070, for example).

You'll need a model that can work with tools, like Llama 3.2 (https://huggingface.co/meta-llama), serve it, hook up MCP servers, add a speech-to-text (STT) interface, and you're cooking. A rough sketch of the tool-calling part is below.
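A minimal sketch, assuming you already have an OpenAI-compatible local server (e.g. llama.cpp's server or vLLM) running on port 8000; the model name and the tool definition here are placeholders, not anything specific to the setup above:

    from openai import OpenAI

    # Point the client at a local OpenAI-compatible server (assumption:
    # llama.cpp / vLLM serving a Llama 3.2 instruct model on port 8000).
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

    # A hypothetical tool the model may choose to call.
    tools = [{
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "Return the current local time.",
            "parameters": {"type": "object", "properties": {}},
        },
    }]

    resp = client.chat.completions.create(
        model="meta-llama/Llama-3.2-3B-Instruct",  # placeholder model name
        messages=[{"role": "user", "content": "What time is it?"}],
        tools=tools,
    )

    # If the model decided to call the tool, the call shows up here.
    print(resp.choices[0].message.tool_calls)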



Even a bottom-of-the-barrel Intel N95 has audio acceleration features that help with speech-to-text, but the LLM inference part will still be far from efficient.
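For reference, the STT side really is light enough to run on CPU; a sketch using the openai-whisper package (the model size and audio filename are placeholder choices):

    import whisper

    # Small Whisper models transcribe acceptably on CPU; "base" is a
    # placeholder choice, as is the audio file name.
    model = whisper.load_model("base")
    result = model.transcribe("mic_capture.wav")
    print(result["text"])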

Plus, you need to keep the card in a "ready" state; you can't idle or fully suspend it between requests.
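If you serve through Ollama, for instance, its keep_alive parameter controls how long the model stays resident in VRAM; a sketch, assuming a default local Ollama install on port 11434:

    import requests

    # Preload the model and pin it in memory indefinitely (keep_alive=-1),
    # so the card stays "ready" instead of unloading between requests.
    requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3.2", "keep_alive": -1},
    )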



