Hacker News

That's why you can use the latest open coding models locally, which have reportedly reached the performance of Sonnet 4.5, so almost SOTA. And then you can think of tricks like the one I mentioned above: directly manipulating GPU RAM to clean up the context when needed, which isn't possible with cloud models unless the provider enables it.



