
The "mere token prediction" comment is wrong, but I don't think any of the other comments really explained why. Next-token prediction is not what the AI does, but its training goal. It's like calling soccer a boring sport after only ever having seen the final scores. The important thing about LLMs is that they can internally represent many different complex ideas efficiently and coherently! This makes them an incredible starting point for further training. Nowadays no LLM you interact with is a pure next-token predictor anymore; they have all gone through various stages of RL so that they actually do what we want them to do. I really feel the magic looking at the "circuit" work by Anthropic. It shows that these models have some internal processing / thinking that is complex and clever.
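To make the "goal, not behavior" distinction concrete, here's a toy sketch of generation-time next-token sampling. The bigram table is a hypothetical stand-in for a trained model's learned distribution; a real LLM replaces the table lookup with a forward pass, but the autoregressive loop is the same shape:

```python
import random

# Hypothetical bigram "model": P(next | current). A real LLM's forward
# pass produces a distribution like this over its whole vocabulary.
BIGRAM = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.5, "ran": 0.5},
    "sat": {"down": 1.0},
    "ran": {"away": 1.0},
}

def generate(start: str, max_tokens: int = 4, seed: int = 0) -> list[str]:
    """Autoregressive loop: repeatedly sample the next token from the
    model's distribution conditioned on the last token emitted."""
    rng = random.Random(seed)
    tokens = [start]
    for _ in range(max_tokens):
        dist = BIGRAM.get(tokens[-1])
        if not dist:  # no known continuation: stop generating
            break
        words, probs = zip(*dist.items())
        tokens.append(rng.choices(words, weights=probs, k=1)[0])
    return tokens

print(" ".join(generate("the")))
```

The point of the analogy: the *objective* (predict the next token) says nothing about what internal representations the model must build to do it well, just as a final score says nothing about how the match was played.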


> that they can internally represent many different complex ideas efficiently and coherently

The Transformer circuits work[0] suggests that this representation is not coherent at all.

[0] https://transformer-circuits.pub


I guess that depends on what you count as coherent. A key finding is that the larger the network, the more coherent the representation becomes. One example: larger networks merge the same concept across different languages into a single internal representation (as humans do). The addition circuits are also fairly easy to interpret.


> merge the same concept

It's doing compression, which does not mean it's coherent.

> The addition circuits are also fairly easy to interpret.

The addition circuits make no sense whatsoever. The model is just very good at guessing, that's all.


I am curious: what would you count as coherent? I think it is absolutely insane that we can open up and understand what are essentially alien intelligences at all!



