
Where have you seen this? Look at what vLLM or TensorRT will do on a 4090 at those batch sizes.

