Hacker News
nik736
3 months ago
on: Asus Ascent GX10
Which models will this be able to run at an acceptable token/s rate?
simlevesque
3 months ago
gpt-oss:120b
https://til.simonwillison.net/llms/codex-spark-gpt-oss
hamdingers
3 months ago
Am I missing it, or is there no information about performance? I'm looking for a tokens/sec figure.
aseipp
3 months ago
Right now I get 59 tok/sec on GPT-OSS 120B using Unsloth's dynamic 4-bit quants, via llama.cpp
https://news.ycombinator.com/item?id=45881049
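To put that number in perspective, here is a rough sketch of how long a response would take at a steady 59 tok/sec. It assumes the rate holds for the whole generation and ignores prompt processing time; the helper name `generation_seconds` and the response lengths are hypothetical, picked only for illustration.

```python
def generation_seconds(num_tokens: int, tokens_per_sec: float) -> float:
    """Seconds needed to emit num_tokens at a constant decode rate.

    Ignores prompt processing (prefill) and any slowdown at long context.
    """
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return num_tokens / tokens_per_sec


RATE = 59.0  # tok/sec figure quoted in the comment above

# Hypothetical response lengths, for illustration only.
for n in (100, 500, 2000):
    print(f"{n} tokens -> {generation_seconds(n, RATE):.1f} s")
```

Whether that counts as "acceptable" depends on the use case; long reasoning traces add up quickly even at this rate.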
simlevesque
3 months ago
He didn't give that info, but the transcript linked at the end shows how much time was spent on each query.