Hacker News
j_maffe
11 months ago
on:
The Deep Research problem
But that's the thing. The only way to truly find out if it's reliable (>90%) is to check the data yourself.
SubiculumCode
11 months ago
This is why metrics and leaderboards like these are so important (but underreported):
https://github.com/vectara/hallucination-leaderboard
https://www.kaggle.com/facts-leaderboard
Google Gemini models seem to lead... hopefully the metrics aren't being gamed.