mowkdizz's comments

mowkdizz · 2025-10-30T11:13:24 1761822804

I think I'm misunderstanding the abstract, but are they trying to say that given an LLM output, they can tell me what the input is? Or given an output AND the intermediate layer weights? If it is the first option, I could use "Only respond with 'OK'" as one input and "Please only respond with 'OK'" as another, which gives 2 inputs producing the same output.

ndr · 2025-10-30T11:31:22 1761823882

That's not what you get out of LLMs.

LLMs produce a distribution from which to sample the next token. Then there's a loop that samples the next token and feeds it back to the model until it samples an EndOfSequence token.
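To make that loop concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: `toy_model`, the `<EOS>` marker, and the probabilities are hypothetical stand-ins, not a real LLM or any library's API.

```python
import random

EOS = "<EOS>"  # hypothetical end-of-sequence marker


def toy_model(tokens):
    """Stand-in for an LLM: maps the tokens so far to a next-token distribution.

    Hypothetical: a real model computes this distribution from its weights.
    """
    if tokens and tokens[-1] == "OK":
        return {EOS: 1.0}
    return {"OK": 0.997, EOS: 0.003}


def generate(model, prompt_tokens, rng, max_new_tokens=20):
    """Sample a token, feed it back to the model, and stop at EOS."""
    tokens = list(prompt_tokens)
    generated = []
    for _ in range(max_new_tokens):
        dist = model(tokens)                      # distribution over next token
        choices, weights = zip(*dist.items())
        next_token = rng.choices(choices, weights=weights, k=1)[0]
        if next_token == EOS:
            break
        generated.append(next_token)
        tokens.append(next_token)                 # feed the sample back in
    return generated


print(generate(toy_model, ["Only", "respond", "with", "'OK'"], random.Random(0)))
```

The sampling step is the only source of randomness here; the model call itself is a deterministic function of the tokens so far.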

In your example the two distributions might be {"OK": 0.997, EOS: 0.003} vs {"OK": 0.998, EOS: 0.002} and what I think the authors claim is that they can invert that distribution to find which input caused it.
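A small Python sketch of why those two cases differ even though the visible text is the same. It only compares the two example distributions above; it does not implement the paper's inversion method, and the helper names are made up for illustration.

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions (dicts)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def most_likely(dist):
    """Token with the highest probability."""
    return max(dist, key=dist.get)


# The two example distributions from the comment above.
dist_a = {"OK": 0.997, "<EOS>": 0.003}
dist_b = {"OK": 0.998, "<EOS>": 0.002}

# Both prompts would almost always print the same token...
same_top_token = most_likely(dist_a) == most_likely(dist_b)

# ...but the distributions themselves are not identical.
distance = total_variation(dist_a, dist_b)

print(same_top_token, distance)
```

So "same sampled output" and "same distribution" are different statements; the claim under discussion concerns the numerical values, not the sampled text.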

I don't know how they go beyond one iteration, as they surely can't deterministically invert the sampling.

simiones · 2025-10-30T11:35:07 1761824107

Edit: reading the paper, I'm no longer sure about my statement below. The algorithm they introduce claims to do this: "We now show how this property can be used in practice to reconstruct the exact input prompt given hidden states at some layer [emp. mine]". It's not clear to me from the paper if this layer can also be the final output layer, or if it must be a hidden layer.

They claim that they can reverse the LLM (get the prompt from the LLM response) knowing only the output layer values; the intermediate layers remain hidden. So their claim is that indeed you shouldn't be able to do that (note that this claim applies to the numerical model outputs, not necessarily to the output a chat interface would show you, which goes through some randomization).

mowkdizz · 2025-10-25T23:07:05 1761433625

I wrote a similar program using Ruby metaprogramming, but instead, if a function is called that doesn't exist (say, in tests), it has the LLM fix it dynamically.

ipnon · 2025-10-25T23:16:33 1761434193

Don't leave us hanging!

mowkdizz · 2025-10-26T19:53:13 1761508393

Haha I will dig it up sometime, but it was a little prototype!

mowkdizz · on Aug 12, 2021

StackFinder makes it easy to find what you're looking for on Stack Overflow without having to go into your browser. The process is seamless. Type what you want to search for in the editor you're working in and hit:

CTRL+ENTER (CMD+ENTER on Mac)

or press the Search StackFinder popup (new)!

You will be instantly presented with a (hopefully) relevant Stack Overflow question and answer, where you have the option to paste code snippets directly into the editor, switch between questions, view the original source in your default browser, and more.

The motivation behind this development was to make it as seamless as possible to get solutions into your code. Instead of having 100 Chrome/Firefox tabs open with different Stack Overflow pages up, you can have one open in your code editor. This saves a little time on every search, but given how often we developers do this, it can add up to substantial time savings while keeping you in your coding zone.

Thanks for checking it out!