Hacker News

I've started to suspect that generating code is actually one of the easier things for a predictive text completion model to achieve.

Programming languages are a whole lot more structured and predictable than human language.

In JavaScript the only token that ever comes after "if " is "(" for example.



On the other hand, if you want to use an external library on line 80, you need to import it at the top.

I once asked it for a short code example of something, no longer than 15 lines, and it said "here's a code that's 12 lines long" and then added the code. Did it have the specific code "in mind" already? Or was 12 just a reasonable-sounding length, with the code then written to match that self-imposed constraint?


The latter option is closest, but neither is quite right. It would have "known" that the problem asked about, combined with a phrase imposing a 15-line limit, has associations with a length of 12 lines (perhaps most strongly 12, though depending on temperature it could have given other answers). From there it is constrained to the complete solutions that lead to 12 lines, out of the several partial solutions that already exist in the weights.


I loved your example. I think that may be an obvious advantage for LLMs: humans are poor at learning new languages after adolescence, but an LLM can continue to learn and build new connections. Studies show that multilingual people have an easier time making connections and producing new ideas. In the case of programming, we may build something that knows all programming languages and all design patterns, and can merge that knowledge to come up with better solutions than the ordinary programmer.


The more constraints there are (e.g., like your example), the better it should perform. So it disappoints me when Copilot, knowing what libraries are available in the IDE it's running in, hallucinates up a method call that doesn't exist.

Separately (and apologies for going on a tangent), where do you think we are in the Gartner cycle?

Around GPT-3 time I was expecting the trough of disillusionment to come, particularly once we saw the results of it being implemented everywhere, but it hasn't really arrived yet. I'm seeing too many examples of good usage (young folks using it for learning, ESL speakers asking for help and revisions, high-level programmers using it to save themselves additional keystrokes; the list is long).


> hallucinates up a method call that doesn't exist

I actually think it helps to reframe this. It hallucinates up a method call that predictively should exist.

If you're working with boto3, maybe that's not actually practical. But if it's a method within your codebase, it's actually a helpful suggestion! And if you prompt it with the declaration and signature of the new method, very often it will write the new helper method for you!


If you have a long iterative session, by the end it will have forgotten the helpful hallucinations from the beginning, so the phantom methods' names and details drift over time.

I wonder if it is better at some languages than others. I have been using it for Go for a week or two and it's OK but not awesome. I am also learning how to work with it, so I'll probably keep at it, but it is clearly a generative model, not a thinking being, that I am working with.


No idea about Go, but I was curious how GPT-4 would handle a request to generate C code, so I asked it to help me write a header-only C string processing library with convenience functions like starts_with(), ends_with(), contains(), etc. I told it every function must work only with String structs defined as:

struct String { char *text; long size; };

...or pointers to them. I then asked it to write tests for the functions it created. Everything... the functions and the tests... worked beautifully. I am not a professional programmer so I mainly use these LLMs for things other than code generation, but the little I've done has left me quite impressed! (Of course, not being a professional programmer no doubt makes me far easier to impress.)
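For anyone curious, here is roughly what such a header-only library might look like. This is my own sketch built around the struct above, not the model's actual output, and the function bodies are just one reasonable way to do it:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The commenter's struct, with the missing semicolon added. */
struct String { char *text; long size; };

/* Returns 1 if s begins with the bytes of prefix, else 0. */
static int starts_with(const struct String *s, const struct String *prefix) {
    if (prefix->size > s->size) return 0;
    return memcmp(s->text, prefix->text, (size_t)prefix->size) == 0;
}

/* Returns 1 if s ends with the bytes of suffix, else 0. */
static int ends_with(const struct String *s, const struct String *suffix) {
    if (suffix->size > s->size) return 0;
    return memcmp(s->text + (s->size - suffix->size), suffix->text,
                  (size_t)suffix->size) == 0;
}

/* Returns 1 if needle occurs anywhere in s, else 0.
   A naive O(n*m) scan; fine for a convenience library. */
static int contains(const struct String *s, const struct String *needle) {
    if (needle->size == 0) return 1;
    for (long i = 0; i + needle->size <= s->size; i++) {
        if (memcmp(s->text + i, needle->text, (size_t)needle->size) == 0)
            return 1;
    }
    return 0;
}
```

Working from explicit (pointer, length) pairs instead of NUL-terminated strings is a nice constraint to give the model, since it sidesteps a whole class of classic C string bugs.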


Interesting. I haven’t tried it with C. Hopefully the C training code is higher quality than that for other languages (because bad C kills). Do you have a GitHub repo with the output?


Hah, hadn't thought of this but kind of love that take!


Are you using it with static types at all? With TypeScript, I've found that it's quite good at producing the imperative logic, but can struggle with types once they reach a certain level of abstraction. It's interesting that even in the realm of "structured languages", it's a lot stronger at some kinds of inference than others.
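To make the contrast concrete, here's a toy illustration (my own made-up example, not from any real session) of the two levels I mean:

```typescript
// Imperative logic with straightforward generics: the kind of code
// completion models tend to get right.
function pluck<T, K extends keyof T>(items: T[], key: K): T[K][] {
  return items.map((item) => item[key]);
}

// A recursive mapped type: the level of type-level abstraction where,
// in my experience, suggestions start to break down.
type DeepReadonly<T> = {
  readonly [K in keyof T]: T[K] extends object ? DeepReadonly<T[K]> : T[K];
};

const users = [{ name: "Ada", age: 36 }, { name: "Lin", age: 29 }];
const names = pluck(users, "name"); // inferred as string[]
console.log(names);
```

The runtime behavior of both is trivial; the difficulty is purely in the type-level reasoning, which may be why models that do fine on the first kind stumble on the second.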


> In JavaScript the only token that ever comes after "if " is "(" for example.

I'm pretty sure whitespace (and comments) can come between `if` and `(` as well, so `(` isn't literally the only thing that can follow. I think overall your point is a pretty good one, though.
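For example, this is valid JavaScript, since the lexer skips whitespace and comments, so `(` is only the next *meaningful* token after `if`, not the next character:

```javascript
// Hypothetical example: a comment sitting between `if` and `(`.
function sign(x) {
  if /* the lexer skips this */ (x >= 0) {
    return "non-negative";
  }
  return "negative";
}

console.log(sign(3));  // "non-negative"
console.log(sign(-1)); // "negative"
```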


> I've started to suspect that generating code is actually one of the easier things for a predictive text completion model to achieve.

> Programming languages are a whole lot more structured and predictable than human language.

> In JavaScript the only token that ever comes after "if " is "(" for example.

But isn't that like saying it's easy to generate English text because all you need is a dictionary table from which you randomly pick words?

(BTW, keep up the blog posts, I really enjoy them!)


One thing to bear in mind is that GPT's training set for code is supposedly skewed very heavily towards Python.


This!



