
Nice find. I think we'll get there eventually though, as models can hold more state.


We are already there.

100% accuracy on addition of numbers up to 13 digits can be taught to GPT-3.5 as is.

https://arxiv.org/abs/2211.09066

And GPT-4 has little need for such teaching out of the box.
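
For a sense of what that teaching looks like: the paper's "algorithmic prompting" spells out every digit-wise step of the addition, carries included, in the in-context examples. A rough sketch of that flavour of prompt (the exact wording here is illustrative, not taken from the paper):

    # Illustrative sketch of an "algorithmic prompt" for addition, in the
    # spirit of arXiv:2211.09066. The paper's actual prompt wording differs;
    # this just shows the idea of spelling out every digit-wise step,
    # including carries, in the in-context examples.

    def addition_trace(a: int, b: int) -> str:
        """Render a digit-by-digit addition trace with explicit carries."""
        xs, ys = str(a)[::-1], str(b)[::-1]
        lines = [f"Problem: {a} + {b}"]
        carry, digits = 0, []
        for i in range(max(len(xs), len(ys))):
            da = int(xs[i]) if i < len(xs) else 0
            db = int(ys[i]) if i < len(ys) else 0
            s = da + db + carry
            digits.append(s % 10)
            lines.append(f"Step {i + 1}: {da} + {db} + carry {carry} = {s}, "
                         f"write {s % 10}, carry {s // 10}")
            carry = s // 10
        if carry:
            digits.append(carry)
        answer = int("".join(map(str, digits))[::-1])
        lines.append(f"Answer: {answer}")
        return "\n".join(lines)

    # A few fully worked traces become the in-context examples, followed
    # by the new problem whose trace the model must produce itself.
    prompt = "\n\n".join([
        addition_trace(182, 376),
        addition_trace(905, 97),
        "Problem: 4721890266353 + 9126817125348",
    ])
    print(prompt)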


> in this work, we identify and study four key stages for successfully teaching algorithmic reasoning to LLMs: (1) formulating algorithms as skills, (2) teaching multiple skills simultaneously (skill accumulation), (3) teaching how to combine skills (skill composition) and (4) teaching how to use skills as tools.

So it's not an emergent property of the LLM but four new capability trainings. No one is saying you can't teach these things to an agent, just that they are not emergent abilities of LLM training. By default an LLM can only match token proximity; all LLM training improves the proximity matching (clustering) of tokens, but it does not teach algorithmic reasoning. That needs to get bolted on as an add-on.
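
For illustration, this is roughly what "bolted on as an add-on" looks like in practice: the arithmetic is routed to an exact tool, and the LLM only handles the language part. `llm_complete` here is a hypothetical stand-in for whatever completion API you use:

    # Minimal sketch of a skill bolted on as a tool: exact arithmetic is
    # handled by code, not by token-proximity guessing in the model.
    import re

    def llm_complete(prompt: str) -> str:
        raise NotImplementedError("stand-in for a real LLM call")

    ADDITION = re.compile(r"(\d+)\s*\+\s*(\d+)")

    def answer(query: str) -> str:
        m = ADDITION.search(query)
        if m:
            # Skill as tool: route the sum to exact arithmetic.
            a, b = int(m.group(1)), int(m.group(2))
            return str(a + b)
        return llm_complete(query)

    print(answer("What is 4721890266353 + 9126817125348?"))  # 13848707391701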


No, it doesn't need to be bolted on. GPT-4 can add straight out of the box, no need for any education. Where the model hadn't implicitly figured out the algorithm of addition in 3.5, it has in 4.
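
That claim is easy to probe empirically: sample random n-digit pairs and score exact-match accuracy as the digit count grows. `ask_model` below is a hypothetical stand-in for a real model call:

    # Quick harness for checking "adds out of the box": exact-match
    # accuracy on random n-digit addition problems.
    import random

    def ask_model(prompt: str) -> str:
        raise NotImplementedError("stand-in for a real model call")

    def addition_accuracy(n_digits: int, trials: int = 50) -> float:
        correct = 0
        for _ in range(trials):
            a = random.randrange(10 ** (n_digits - 1), 10 ** n_digits)
            b = random.randrange(10 ** (n_digits - 1), 10 ** n_digits)
            reply = ask_model(f"Compute {a} + {b}. Reply with only the number.")
            correct += reply.strip() == str(a + b)
        return correct / trials

    for n in (5, 10, 13, 20):
        print(n, addition_accuracy(n))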


Maybe, but since we can't, by definition, know what is present in the model, we cannot classify any behavior as emergent as opposed to simply trained. Suppose you don't know anything about our school system and you observe that 12th-graders know calculus whereas 3rd-graders do not. By their definition of emergent, calculus is an emergent ability in 12th-graders because it was not present in 3rd-graders. Of course we know it is not an emergent ability but the result of an expanded mental model trained on a bigger corpus of knowledge.


So I guess they should be saying "trainable" instead of "emergent". Still a useful benchmark, of course.

To be truly emergent in your sense, it seems an LLM would have to make a new discovery, i.e., have scientist-level intelligence. That bar keeps moving up.


Not necessarily a new discovery, just a new behaviour it was not trained for. See https://en.wikipedia.org/wiki/Emergence. A classic example is the structure of a flock of starlings in flight or a school of fish: the flock and the school move with an emergent behaviour that is not observed in a single (or a few) starlings or fish.
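
A minimal boids-style simulation (after Reynolds, 1987) makes the point concrete: each agent follows only three local rules, yet the population moves as a coherent flock that is written nowhere in the code. A sketch:

    # Boids sketch: flocking emerges from three local rules --
    # separation, alignment, cohesion -- with no global "flock" logic.
    import numpy as np

    rng = np.random.default_rng(0)
    N = 100
    pos = rng.uniform(0, 100, (N, 2))   # positions on a 100x100 torus
    vel = rng.uniform(-1, 1, (N, 2))    # velocities

    def step(pos, vel, radius=10.0, max_speed=2.0):
        new_vel = vel.copy()
        for i in range(N):
            d = np.linalg.norm(pos - pos[i], axis=1)
            near = (d < radius) & (d > 0)
            if not near.any():
                continue
            cohesion = pos[near].mean(axis=0) - pos[i]      # steer toward neighbours
            alignment = vel[near].mean(axis=0) - vel[i]     # match their heading
            separation = (pos[i] - pos[near]).mean(axis=0)  # don't crowd them
            new_vel[i] += 0.01 * cohesion + 0.05 * alignment + 0.05 * separation
            speed = np.linalg.norm(new_vel[i])
            if speed > max_speed:
                new_vel[i] *= max_speed / speed
        return (pos + new_vel) % 100, new_vel  # wrap around the torus

    for _ in range(500):
        pos, vel = step(pos, vel)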

Something like this may well yet emerge if a new AI agent learns how to combine the properties of an LLM with an algorithmic approach, fact-checking, or a general reasoning engine. But for that we are still waiting for another breakthrough that combines these islands into one (without bolting them manually onto each other).



