Suggestion: run the identical prompt N times (2 identical calls to Gemini 3.0 Pr...
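
A minimal sketch of the repeated-run idea, assuming the N responses have already been collected as plain strings (the model calls themselves are left out, and `disagreeing_positions` is a hypothetical helper name). It compares whitespace-separated tokens position by position and reports where the runs disagree.

```python
from collections import Counter


def disagreeing_positions(outputs):
    """Return (index, Counter) for each token position where the runs differ.

    Assumption: the outputs align token-for-token when split on whitespace.
    Positions beyond the shortest output are ignored.
    """
    token_lists = [text.split() for text in outputs]
    shortest = min(len(tokens) for tokens in token_lists)
    flagged = []
    for i in range(shortest):
        votes = Counter(tokens[i] for tokens in token_lists)
        if len(votes) > 1:  # at least two runs disagree at this position
            flagged.append((i, votes))
    return flagged


# Made-up responses standing in for N identical calls.
runs = [
    "Invoice total: 27 USD",
    "Invoice total: 22 USD",
    "Invoice total: 27 USD",
]
for index, votes in disagreeing_positions(runs):
    print(index, votes.most_common())
```

Positional comparison is the crudest possible alignment; outputs of different lengths would likely need a proper diff before voting.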

matjet · 2026-02-13T11:22:21 1770981741

Look what they need to mimic a fraction of [the power of having the logit probabilities exposed so you can actually see where the model is uncertain]
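
For illustration, a rough sketch of what exposed log probabilities would allow, assuming the API returns per-token `(token, logprob)` pairs with natural-log values. The sample data is made up, and `flag_uncertain` is a hypothetical helper, not any vendor's API.

```python
import math


def flag_uncertain(token_logprobs, threshold=0.9):
    """Return (token, probability) pairs whose probability is below threshold."""
    flagged = []
    for token, logprob in token_logprobs:
        prob = math.exp(logprob)  # natural-log probability -> probability
        if prob < threshold:
            flagged.append((token, prob))
    return flagged


# Made-up values for illustration, not real model output.
sample = [("Total", -0.01), (":", -0.02), ("27", -0.69), ("USD", -0.05)]
for token, prob in flag_uncertain(sample):
    print(f"{token!r}: p={prob:.2f}")
```

The threshold of 0.9 is arbitrary; a single call gives a per-token signal that repeated sampling can only approximate.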

kfajdsl · 2026-02-13T22:37:45 1771022265

None of the LLM logprob outputs I've seen are very well calibrated, at least for transcription tasks. I'm guessing it's similar for OCR-type tasks.
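
One way to check a calibration claim like this is expected calibration error (ECE): bin the predictions by stated confidence and compare each bin's mean confidence with its actual accuracy. A minimal sketch, assuming you already have per-token confidences and ground-truth correctness flags (the function name and the equal-width binning are choices made here, not anything from the thread):

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for items in bins:
        if not items:
            continue  # empty bins contribute nothing
        mean_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        ece += (len(items) / total) * abs(mean_conf - accuracy)
    return ece


# Made-up data: the model claims 95% confidence but is right half the time.
print(expected_calibration_error([0.95, 0.95, 0.95, 0.95], [True, True, False, False]))
```

A well-calibrated model scores near 0; a large value means the logprobs overstate or understate how often the tokens are actually right.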

energy123 · 2026-02-14T10:11:00 1771063860

"I already decided in my private reasoning trace to resolve this ambiguity by emitting the string '27' instead of '22' right here, thus '27' has 100% probability"