I hear you. And maybe you're right. Maybe I'm deluding myself, but: when I look at my skilled colleagues who vibecode, I can't understand how this is sustainable. They're smart people, but they've clearly switched off. They can't answer non-trivial questions about the details of the stuff they (vibe-)delivered without asking the LLM that wrote it. Whoever uses the code downstream isn't gonna stand (or pay!) for this long-term! And the skills of the (vibe-)authors will rapidly disappear.
Maybe I'm just as naive as those who said that photographs lack the soul of paintings. But I'm not 100% convinced we're done for yet, if what you're actually selling is thinking, reasoning and understanding.
The difference from a still photograph is that code is a functional encoding of an intention. Code from an LLM could be flawless and still not encode the product's actual intention. I’ve seen that on many occasions.
Many people don’t understand what code is really about; they think they have a toy printer now and nobody has to use pencils anymore.
That’s not at all the same thing.
Code is intention, logic, and a specific use case all at once. With a non-deterministic system and vague prompting, the LLM will misinterpret intentions, because the model has to make decisions in order to move forward. The problem is the scale of it: we’re not talking about 1,000 LOC. In a month you can generate millions of LOC, in a year hundreds of millions.
Some will have to crash and burn their company before they realize that having no human at all in the loop is nonsense. Let them touch the fire and make up their minds, I guess.
> Code is intention, logic, and a specific use case all at once. With a non-deterministic system and vague prompting, the LLM will misinterpret intentions, because the model has to make decisions in order to move forward. The problem is the scale of it: we’re not talking about 1,000 LOC. In a month you can generate millions of LOC, in a year hundreds of millions.
People are also non-deterministic. When I delegate work to a team of five or six mid-level developers or, God forbid, outsourced developers, I’m going to have to check and review their work too.
It had been over a decade since my vision/responsibility could be carried out by just my own two hands and be done on time within 40 hours a week. Until LLMs.
People are indeed not deterministic. But they are accountable. In the legal sense, of course, but more importantly, in an interpersonal sense.
Perhaps outsourcing is a good analogy. But in that case I'd call it outsourcing without accountability. LLMs feel more like an infinite chain of outsourcing.
As a former tech lead and now staff consultant who leads cloud implementations + app dev, I am ultimately responsible for making sure that projects are done on time, on budget, and meet requirements. Neither my manager nor the customer would allow me to say it’s one of my team members’ fault that something wasn’t done correctly, any more than I could say “don’t blame me, blame Codex.”
I’ve said repeatedly over the past couple of days that if a web component was done by someone else, it might as well have been created by Claude. I haven’t done web development in a decade. If something isn’t right or I need modifications, I’m going to have to either Slack the web developer or type a message to Claude.
Of course people are non-deterministic. But usually we expect machines to be deterministic. That’s why we trust them blindly and don’t check their calculations. We review people’s work all the time, though.
Here, people will stop reviewing LLM-generated code because they treat the machine as a kind of source of truth, as in other areas. That’s my point: reviewing code takes time, and even more time when no human wrote it. It’s a dangerous path to stop reviewing out of trust in the machine now that the machine is, like humans, non-deterministic.
No one who has any knowledge or who has ever used an LLM expects determinism.
And there are no computer professionals who haven’t heard about hallucinations.
Reviewing whether the code meets requirements through manual and automated tests - and that’s all I cared about when I had a team of 8 under me - is the same regardless. I wasn’t checking whether John used a for loop or while loop in between my customer meetings and meetings with the CTO. I definitely wasn’t checking the SOQL (not a typo) of the Salesforce consultants we hired. I was testing inputs and outputs and UX.
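A minimal sketch of what checking inputs and outputs (rather than the for-loop-versus-while-loop question) can look like. The two `total_*` functions and the `CASES` table are hypothetical, made up purely for illustration; the point is that the same black-box check accepts either implementation.

```python
def total_with_for(prices):
    """Sum prices using a for loop."""
    total = 0
    for price in prices:
        total += price
    return total


def total_with_while(prices):
    """Sum prices using a while loop."""
    total = 0
    i = 0
    while i < len(prices):
        total += prices[i]
        i += 1
    return total


# Hypothetical requirements, expressed only as (input, expected output) pairs.
CASES = [([], 0), ([5], 5), ([1, 2, 3], 6)]


def meets_requirements(impl):
    """Return True if impl produces the expected output for every case."""
    return all(impl(list(inp)) == expected for inp, expected in CASES)
```

Both implementations pass `meets_requirements`, and the check never looks at how the loop was written.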
Having a team of 8 people producing code is manageable. Having an AI with 8 agents writing code all day long is not the same volume: it can generate more code in a day than one person can review in a week.
What you’re saying is that product teams will prompt a framework with what they want, and the framework will take care of spec analysis, development, reviews, and compliance with the spec. Product teams with QA will make sure the delivery is functionally correct.
No humans need to make sure of anything code related.
What we don’t know yet is whether AI will still produce solid code through the years. It’s all statistical analysis, and with millions of LOC, the refactoring needed, data migrations, etc., what will happen?
For context, I just started using coding agents, Codex CLI and Claude Code, in October. Once I saw that you were billed by usage, I decided I wasn’t going to spend my own money on it when it’s for a company.
Two things changed: Codex CLI now lets you use it with your $20 a month subscription, and I have never run into quota issues with it; and my employer signed up for the enterprise version of Claude, and we each have an $800 a month allowance.
My argument though is “why should I care about the code?” for the most part. If I were outsourcing a project or delegating it to a team lead, I would be asking high level architectural, security and scalability questions.
AI generated the code, AI maintains the code. I am concerned about abstractions and architecture.
You shouldn’t have to maintain or refactor “millions of lines of code”. If your code is well modularized with clean interfaces, making a change for $x7 may mean making a change for $x1…$x6, but you should still be working locally in one module at a time. You should do the same for the benefit of coders. Heck, my little 5-week project has three independently deployable repos in a root folder. My root AGENTS.md file just has a summary of how all three relate via a clean interface.
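A rough sketch of what "working locally in one module behind a clean interface" can mean in code. Every name here (`DocumentStore`, `InMemoryStore`, `ingest`) is hypothetical and not taken from the project described above.

```python
from typing import Protocol


class DocumentStore(Protocol):
    """The interface other modules depend on (hypothetical)."""

    def put(self, key: str, body: str) -> None: ...

    def get(self, key: str) -> str: ...


class InMemoryStore:
    """One implementation; it can be replaced without touching callers."""

    def __init__(self) -> None:
        self._docs = {}

    def put(self, key: str, body: str) -> None:
        self._docs[key] = body

    def get(self, key: str) -> str:
        return self._docs[key]


def ingest(store: DocumentStore, key: str, body: str) -> str:
    """Caller code: it only knows about put() and get()."""
    store.put(key, body.strip())
    return store.get(key)
```

Swapping `InMemoryStore` for another backend is a change inside one module; `ingest` and anything else written against `DocumentStore` stays as it is.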
In the project I am working on now, besides “does it meet the requirements”, I care about security, scalability, concurrency, user experience for the end user, user experience for the operations folks when they need to make config changes, and user experience for any developers who have to make changes long after I’m off this project. I haven’t looked at a single line of code besides the CloudFormation templates. But I can answer any architectural question about any of it. The architecture and abstractions were designed by me and dictated to the agents.
On this particular project, on the coding level, there is absolutely nothing that application code like this can do that could be insecure except hypothetically embed AWS credentials into the code. But it can’t do that either since it doesn’t have access to them [1].
In this case the security posture comes from the architecture: S3 block public access, well-scoped IAM roles, not running “in a VPC”. Those are things I check in the infrastructure as code and was very specific about.
The user experience has to come from design and checking manually.
I mentioned earlier that my first stab at it scaled poorly. This was caused by my design, and I suspected it would be beforehand. But building the first version was so fast because of AI tools that I felt no pain in going with my more architecturally complicated plan B and throwing the first version away. I wouldn’t have known that by looking at the code. The code was fine; it was the underlying AWS service. I could only know that by throwing 100K documents at it instead of 1,000.
I designed a concurrent locking mechanism that had a subtle flaw. When I threw the code into ChatGPT in thinking mode, it found the flaw immediately. I might have been better off just telling the coding agents “design a locking mechanism for $x” instead of detailing it.
Even maintainability was helped, because I knew I or anyone else who touched it was probably going to be using an LLM. From the get-go I threw the initial contract, the discovery session transcripts, the design diagrams, the review of the design diagrams, and my project plan and breakdown into ChatGPT and told it to render a detailed markdown file of everything. That was the beginning of my AGENTS.md file.
I asked both Codex and Claude to log everything I was doing and my decisions into separate markdown files.
Any new developer could come into my repo, fire up Claude, and it wouldn’t just know what was coded; it would have full context of the project from the initial contract through to the delivery.
[1] Code running on AWS never has to handle AWS credentials explicitly. The SDKs find them on their own by using the credentials of the IAM role attached to the EC2 instance, Lambda, Docker container, etc.
Even locally you should be getting temporary credentials that are assigned to environment variables, which the SDK retrieves automatically.
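If you never read the code, one cheap way to check the "no embedded credentials" claim is a pattern scan. This is only a sketch: `find_embedded_keys` is a hypothetical helper, and the pattern assumes access key IDs are 20 characters starting with `AKIA` or `ASIA`, which is the commonly documented shape. It is not a substitute for a real secret scanner.

```python
import re
from typing import List

# Assumption: an access key ID is "AKIA" or "ASIA" followed by
# 16 uppercase letters or digits (20 characters total).
ACCESS_KEY_PATTERN = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")


def find_embedded_keys(source: str) -> List[str]:
    """Return every substring of source that looks like an access key ID."""
    return ACCESS_KEY_PATTERN.findall(source)
```

Run over each source file, an empty result is consistent with the code relying on the role's credentials; a non-empty result deserves a human look.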
Okay - and the person ultimately leading the team is still responsible for it, whether you are delegating to more junior developers or to AI. You’re still reviewing someone else’s code based on your specs.
I have this nagging feeling I’m more and more skimming text, not just what the LLMs output, but all types of text. I’m afraid people will get too lazy to read when the LLM is almost always right. Maybe it’s a silly thought. I hope!
People will say "oh, it's the same as when the printing press came, people were afraid we'd get lazy from not copying text by hand", or any of a myriad of other innovations that made our lives easier. I think this time it's different though, because we're talking about offloading the very essence of humanity – thinking. Sure, getting too lazy to walk after cars became widespread was detrimental to our health, but if we get too lazy to think, what are we?
There are some YouTube videos on the topic, whether about high school pupils addicted to LLMs or adults losing skills, and not only devs. Society is starting to see strange effects.