
But that's the thing. The only way to truly find out if it's reliable (>90%) is to check the data yourself.


This is why metrics and leaderboards like these are so important (but underreported): https://github.com/vectara/hallucination-leaderboard https://www.kaggle.com/facts-leaderboard

Google Gemini models seem to lead. Hopefully the metrics aren't being gamed.



