> What does that even mean?

It explicitly says "Results on AIME and GPQA are really strong". So I would assume it means the model gets a better score (by a statistically significant margin, presumably) on the AIME and GPQA benchmarks than 4o does.


