Eh, that’s not at all how I do it. I like to design the architecture and spec and let them implement the code. That is a fun skill to exercise. Sometimes I give a little more leeway in letting them decide how to implement, but that can go off the rails.
imho “tell them what you want and let them come up with a solution” is a really naive way to use these tools nearly guaranteed to end up with slopware.
the more up-front design thought I've put in, the more accurate they usually are in delivering, to the point that I don't need to spend much time reviewing at all. And this is a step I would have had to do anyway if doing it by hand, so it feels natural. It results in far more correct code, more often, than I could have managed on my own, and allows multitasking several projects at once, which would have been impossible before.
Modern "skills" and Markdown formats of the day are no different than "save the kittens". All of these practices are promoted by influencers and adopted based on wishful thinking and anecdata.
Uh, this couldn't be more false. I've implemented these from scratch at my company and rolled them out org-wide, and I've yet to watch a YouTube video and don't follow any influencers. Mostly by just using the tools and reading documentation, as with any other technical tool.
Perhaps your blanket statement could be wrong, and I would encourage you to keep your mind a bit more open. The landscape here is not what it was six months ago. This is an undeniable fact that people are going to have to come to terms with pretty soon. I did not want to be in this spot; I was forced into it out of necessity, because the stuff does work.
To be fair, if you have never watched a YouTube video in your life then how can you say the OP was wrong about what influencers are peddling? Side note, have you ever seen that Onion article on the man that can't stop telling people he doesn't own a TV?
Great, so how do you know this stuff works? Did you evaluate it against other approaches? How do you know it's actually reliable?
The Vercel team had some interesting findings[1]:
> In 56% of eval cases, the skill was never invoked. The agent had access to the documentation but didn't use it.
Others had different findings for commonly accepted practices[2], some you may have adopted from reading documentation, which surely didn't come from influencers.
And yet others swear by magical Markdown documents[3].
So... who is the ultimate authority on what actually works, and who is just cargo culting the trendy practice of the week? And how is any of this different from what was being done a few years ago?
Sorry, but from your first comment, I don't particularly feel inclined to help you figure this out. I was just offering that I've already deployed these things at scale with success, using many of the configuration options documented in the OP here. This stuff isn't some mystical black box, although you seem to think it is.
I measure tooling success with a suite of small prompt tests performing repeatable tasks: I track the success rate over time, educate the broader team, and provide my own field-tested skills, which I've shared with similar success across the broader teams. We've seen a huge increase in velocity and a lower bug rate, both very easily measurable (and long-evaluated) stats, enough to put me in the position I am in, which was not a reluctant one. You're perfectly free to view my long history on this topic on this forum to see I am a complete skeptic and wouldn't be here unless I had to be.
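The kind of harness described (small repeatable prompt tasks, success rate tracked over runs) can be sketched minimally. This is an assumption-laden illustration, not the poster's actual setup: `run_agent` and the sample check are hypothetical stand-ins for whatever tool and pass/fail criteria you use.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class PromptTest:
    name: str
    prompt: str
    check: Callable[[str], bool]  # did the agent's output meet the bar?

def success_rate(tests, run_agent, trials=5):
    """Run each repeatable task several times; return pass fraction per test."""
    results = {}
    for t in tests:
        passes = sum(1 for _ in range(trials) if t.check(run_agent(t.prompt)))
        results[t.name] = passes / trials
    return results

# Example with a stubbed agent; a real harness would call your tool's API.
tests = [
    PromptTest("adds-two-ints",
               "Write a Python function add(a, b) that returns a + b.",
               lambda out: "def add" in out),
]
stub_agent = lambda prompt: "def add(a, b):\n    return a + b"
print(success_rate(tests, stub_agent, trials=3))
```

Logging these per-test fractions over time is what turns anecdote into a trend line.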
Everyone is figuring this out still. There is no authority; I am my own authority on what I have seen work and what hasn't. Feel free to make of that what you will. I just wanted to provide a counterpoint to your initial claim. I'm certainly not going to expose in fine detail what has worked for my org and what hasn't, for obvious reasons.
What? Non-techies are most at risk. There are a huge number of malicious skills. Not knowing or caring how to spot malicious behavior doesn't mean someone shouldn't be concerned about it, no matter how much they can't or don't want to do it.
I am an administrator of this stuff at my company, and it's an absolute effing nightmare devising policies that protect people from themselves. If I heard this come out of the mouth of someone under me, I'd tell them to leave the room before I have a stroke.
And this is stuff like: if so-and-so's machine is compromised, it could cost the company massive sums of money. For your personal use, fine, but hearing this cavalier attitude, like it doesn't matter, is horrifying, because it absolutely does matter in a lot of contexts.
I run a small local non-profit which is essentially a security hardening guide with some helper tooling that simplifies some concepts for non-techies (FDE, MFA, password managers, etc.).
LLMs have completely killed my motivation to continue running it. None of the standard practices apply anymore.
I am certainly not an expert but I agree a lot with your sentiment about the hubris - but the problem as presented in the article makes no sense to me.
If you see a value need for a receptionist, and you suspect that it is costing you thousands of dollars, wouldn't the normal response be, "I should think about hiring someone," rather than turning to an unproven, untested solution like this and leaving your business in the hands of however correct it happens to be? I just cannot understand this line of thinking at all: reaching for a tool that would probably do a worse job than a human would. Is it not wanting to hire? Not wanting to manage? Hype cycle? Where does this urge come from?
To take this further, if the focus really is the "luxury" part of the market, how do they expect this sort of response to go down well with customers?!
If someone is interested in paying luxury-sized fees, do they really want some cobbled-together chatbot? I say this as an advocate for (high-quality) chatbots for various practical needs, but it just seems like a misunderstanding of the customers (or maybe "luxury" is a bit of a loose term in the area this mechanic works in?).
These customers own expensive cars - or at least, cars that were expensive when they were new. The car might now be ten years old or more, and the owner bought it used. They want a prestige marque, but the customer does not have the money to buy a new prestige car. So they are looking to save on service.
All the time I see cars with expensive names - BMW, Mercedes Benz - broken down on the side of the road, while old Hondas and Toyotas keep cruising by. Those are the customers for this shop: they spent all their money buying an expensive used car, and now they can't afford to maintain it and fix looming problems; meanwhile the Toyota or Hyundai driver gets maintenance and maybe even takes it to the dealer for it.
A mechanic like this can't afford to hire someone to answer the phone. Such a person is expensive, and these customers want rock-bottom prices despite the car being expensive. So a chatbot is good enough and better than nothing.
The most trustworthy mechanic I used in England had an appointment book pretty much full for four months in advance. He didn't answer the phone, didn't have a computer, just a desk diary. If you wanted him to work on your car you turned up at his workshop and spoke to him. If you were willing to wait until he'd finished whatever thing he was doing he'd take a quick look at your car and suggest a course of action. And despite his full order book if something looked urgent enough and small enough he'd fit you in quite quickly.
He charged reasonable prices, but definitely not rock bottom. He had no need to compete with the bottom feeders because every customer acted as his public relations agent.
Business owners tend to resent having to rely on and pay their workers.
Many of them believe people should line up and volunteer/be forced to work at their companies for free, the fact that they have to pay them is an insult.
They need workers, but workers are not worthy of being needed by them, or paid, so they look for any out at all.
The word you’re looking for is greed. These systems are greed enablers. The narrative used to pump them plays on greed. And so on.
Hiring a person for the job is $3000 per month? Great, let's try to do this with $500 and a tangle of vibecoded toothpick bridges!
For a luxury service with generous margins this is a failure-prone mentality.
They'd still try to replace workers, even if their attempted route of automation cost them more than hiring employees would, because of their resentment toward them.
Aside from the cost? It's also managing an actual human being and making sure they have enough work. If the place gets 5-10 calls a day, then it's pointless to hire a receptionist who will do nothing for an hour and then have a two-minute chat. It used to be pointless to build software to do that, but since Claude Code it's cheap enough to make sense.
Receptionist-as-a-service has been a thing for, like... forever. You are never going to solve the problem of accurately estimating and quoting with AI or an answering service, so pay for someone to answer the phone and take down the details; have a mechanic or trained service rep review and estimate. Cheap code that doesn't solve the problem is not cheap.
Yes, of course. The bot can request information and the customer can provide it if they feel like it, and then someone qualified can call them back when they have their hands free.
But there's no bot, per se, needed at all. An answering machine from 1993 can do this same information-gathering job. :)
So update the device from 1993's new-fangled digital answering machine to 2009's Google Voice, and have it do the transcription from voicemail to text.
Someone will still have to call Bill back about his Honda (which is actually the Kia he bought for his daughter -- Bill is not a very technical guy these days[1] and he confuses such concepts regularly) in order to get any trading of money for services done.
It doesn't take an LLM to get there, and Bill would probably prefer to avoid being frustrated by the bot's insistent nature.
Look, you're kicking an open door.
I think LLMs applied like this are just a layer of complexity that is mostly replacing lower-level programming solutions that could do the same thing.
The transcription + callback loop is honestly underrated. Most of the value here is just capturing intent accurately ("Honda" vs "Kia" aside) so the mechanic can prioritize callbacks. A dumb voicemail-to-text pipeline handles that fine. The LLM layer adds complexity without solving the actual bottleneck, which is someone qualified picking up the phone.
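A "dumb" pipeline of this sort really is just a few lines. The sketch below is illustrative only: the `Voicemail` records and their transcripts are stand-ins for whatever speech-to-text service fills them in, and deliberately no prioritization logic exists, since the point is that a qualified human reads the queue and decides.

```python
from dataclasses import dataclass

@dataclass
class Voicemail:
    caller: str
    number: str
    transcript: str  # filled in by any speech-to-text service; stubbed here

def callback_list(voicemails):
    """Oldest-first callback queue. No ranking on purpose: the human
    reading the transcripts supplies the context a bot doesn't have."""
    return [f"{v.caller} ({v.number}): {v.transcript}" for v in voicemails]

inbox = [
    Voicemail("Bill", "555-0101", "Calling about my Honda... er, the Kia."),
    Voicemail("Wendy", "555-0102", "My car is making that noise again."),
]
for line in callback_list(inbox):
    print(line)
```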
But I'm not sure that a bot can be trusted to make good decisions about priority, either. Even if it makes good decisions based on context (which it can increasingly often do, but does not always do), it lacks the context that is necessary to form the basis of good decisions.
Suppose a message comes into the box with this form: "This is Wendy, can you call me? My car is making that noise again."
The bot might deprioritize that call because it lacks actionable contextual information. "My job as a bot is to get more jobs into the shop. This call does not have enough data to do that, so I'll shove it to the bottom of the list of callbacks, behind more-actionable jobs."
But the mechanic? The mechanic knows Wendy's Ford very well, and he also knows Wendy. She's been a good customer for over a decade. The mechanic also knows the noise, and that Wendy has 3 little kids and that she's vacationing 900 miles away on a road trip with those kids in that Ford. The context is all there inside of the mechanic's brain to combine and mean that this might be the highest-priority call he gets all week.
Wendy may not have actively relayed any urgency in her message, but the urgency is real and she needs to be called back right away. She needs answers about what to do (keep driving and look into it when she gets back? pull over immediately and get a tow to a decent local shop? maybe she even needs help finding such a shop?) pretty much immediately. Not because it means more business today, but because it means more business for years.
The mechanic can spot this from a list of transcripts in an instant and give her a ring back Right Now. The bot is NFG at this.
The addition of the bot only adds noise to the process, and that noise only works to Wendy's detriment. When the bot adds detrimental noise to Wendy's situation, it also adds detriment to the shop's longevity.
The presence of the bot -- even as a prioritizing sorting mechanism -- asymptotically shifts the state from an excellent shop that knows their customers very well to a bot-driven customer-averse hellscape.
(And no, the answer isn't to make the bot into an all-knowing oracle that actively gets fed all context. The documentation burden would be more expensive, time-wise (and thus money-wise) than hiring a competent human receptionist who answers the phone, handles the front door traffic, and absorbs context from their surroundings. A person who chatted with Wendy last Thursday right before she left for her trip is always going to be superior to a bot.)
If someone put on their website and voicemail that they were available for calls only from 8-10am (for example), or that they would return my call at that time, I'd make a point to call them then. It's reasonable that people are busy too.
Because the capital owning class in America commonly has an aversion to labor.
Labor is other humans and all their social hierarchy monkey brain bullshit activates in a way that a machine doesn’t. That’s why you’ll see companies spending equivalent or even slightly more money for a tool to do a job over a human being.
Walmart employs this number of workers only because it is subsidized by food stamps and other government assistance. The minute they were forced to actually pay for the labor they employ, they would fire a lot of people.
You are suggesting that if the government gives you a tax break, your boss would lower your salary? Why does your boss wait for the tax break or handout and doesn't just lower your salary now?
Also what's your counterfactual here? If Walmart fired their employees tomorrow and replaced them with robots, those ex-employees would magically no longer need food stamps nor government assistance? (Or more realistically: Walmart could pivot to the Aldi model of labour and replace many low intensity jobs with fewer higher intensity jobs. For the affected workers, the outcome is the same.)
If those ex-workers don't magically get off government assistance, if Walmart is out of the picture, in what sense is Walmart to blame for their poverty?
Conversely: if Walmart laying off these workers would magically improve their welfare, why do these workers wait for Walmart to lay them off?
> Walmart could pivot to the Aldi model of labour and replace many low intensity jobs with fewer higher intensity jobs.
Yes, this is the expected change.
> For the affected workers, the outcome is the same.
No? There are two classes of affected workers:
1. Workers who have been converted to full-time with benefits. These workers benefit from the change.
2. Workers who lose their jobs. These workers are worse off.
Your argument ignores class 1.
I don't think we'll get anywhere debating the relative merits of the tradeoff of those two groups, but I personally prefer the existence of class 1. At least with that class there are some winners.
There's practically no (1). It's a different class of workers, of people than who Walmart currently employs at low intensity and low pay.
People who prefer a higher intensity, higher paying job than the bottom rung at Walmart can already get that kind of job today. They don't need to wait for Walmart to fire everyone else.
Walmart has some of these jobs already, probably. But Aldi and other companies exist. The whole Jeff Bezos workout at Amazon warehouses falls into a similar category too: Amazon pays pretty well for the sector and requires no prior experience, but they expect you to stay on your feet throughout.
> Walmart employs this number of workers only because it is subsidized by food stamps
And then those food stamps are used at Walmart; it's a win-win for Walmart and Walmart. No other country gives their poor food stamps instead of money. I wonder why?
I'm projecting, but I think you're right. Not wanting to manage is probably a large driver. I can imagine that if you've dealt with messy humans before, that a robot receptionist that's not going to show up late, call out when hungover, need an advance for a family member's surgery and then quit, is quite attractive.
Until the robot breaks for reasons unknown and you have to pay for expensive engineering time to fix it. Surprise, since the engineer vibe coded the whole thing, he also has no idea how to fix it except to get the AI to try.
> If you see a value need for a receptionist, and you suspect that it is costing you thousands of dollars, wouldn't a normal response be, "I should think about hiring someone," [...]
If you only have thousands of dollars in savings from the move, hiring someone might be too expensive.
One jet landing every minute, coordinating the airspace for miles around the airport, along with coordinating non-landing traffic (helicopters, small craft), while making sure these (already heavily automated) flight systems don't get confused and kill several hundred people, sounds easy to you, along with keeping everything on time and on schedule?
You say it “…sounds like a simple problem,” and sure, if you think this is a computer problem, it sounds simple. But if all you’re getting back is indignant sputtering, that’s your cue to explain why it’s simple—explaining something simple shouldn't be hard. What do you actually know?
It takes all of two minutes of Wikipedia reading for me to understand why this isn’t simple; why it's actually extremely not simple! If you ignore the incumbency, the regulations, the training requirements, the retrofitting, the verification, the international coordination, and the existing unfathomably reliable systems built out of past tragedies, then sure, it’s "simple". But then, if you're ignoring those things, you’re not really solving the problem, are you?
> If you ignore the incumbency, the regulations, the training requirements, the retrofitting, the verification, the international coordination, and the existing unfathomably reliable systems built out of past tragedies, then sure, it’s "simple".
Those are excuses and encumbrances, not reasons. If they are so important, it leads to a question: what existing automated systems can we improve by adding similar constraints?
If these are just "excuses" and not "reasons," then explain how you have determined them as such.
I would like to say, "Because knowledgeable people have explained the difference to me." But again, this has come up before, and no explanations are ever provided. Only vague, reactionary hand-waving, assuring me that humans -- presumably not the same ones who just directed a fire truck and an aircraft onto the same active runway, but humans nevertheless -- are vital for safety in ATC, because for reasons such as and therefore.
There you are doing it in order to avoid engaging with the substance of what people are saying.
There is no substance in the replies. There never is. Only unanchored FUD.
Ok. You have shared that what some say are reasons, you say are excuses. Do you want to be told you are right, or do you want to propose a valid solution? If the latter requires the former, I maintain that this is not a simple problem.
I just want what I've been asking for: someone to explain to me why, in 2026, humans still need to be involved in the real-time aspects of ATC.
"Because it's always been done that way, and that's what the regulations say," will not be accepted, at least not by me.
(Really, my question is more like why humans will still be needed in the loop in 2036. If we started automating ATC today, that's probably how long it would take to cut over to the new system.)
> If you ignore the incumbency, the regulations, the training requirements, the retrofitting, the verification, the international coordination, and the existing unfathomably reliable systems built out of past tragedies, then sure, it’s "simple". But then, if you're ignoring those things, you’re not really solving the problem, are you?
You retorted.
> Those are excuses and encumbrances, not reasons.
I rebutted.
> Ok. You have shared that what some say are reasons, you say are excuses... I maintain that this is not a simple problem.
Which you ignored to make a new claim against a straw man.
> I just want what I've been asking for: someone to explain to me why, in 2026, humans still need to be involved in the real-time aspects of ATC.
That is what is not acceptable. You cannot simply abandon your original claim because it has been plainly pointed out that it is incorrect. You were not simply asking for someone to explain why humans need to be involved in real-time aspects of ATC. That is a wholly different question! You claimed this problem was simple, and it has been explained to you why it is not. Please reason about your argument more soundly.
On the heels of tragedy, you reasoned this could've been avoided simply. We are all ears. And yet, at no point did you demonstrate any understanding of the problem containing real world constraints, and instead demand that it be explained to you how the world works and how systems are implemented.
If you want to discuss an idealized system in a vacuum, then say as much; I would find that interesting. But do not demand to be given an explanation when you do not understand—and cannot accept—why things are the way they are.
Let me summarize it like this: you may very well have the best solution in the world, but if it doesn't include a strategy for how to share it (let alone implement it), then I maintain you do not understand the problem and therefore cannot claim it is simple.
> Let me summarize it like this: you may very well have the best solution in the world
I have no solution at all, for the 35th time.
This conversation is over; it's clear I'm not going to get what I asked for. If someone could answer my question, they would have by now, rather than throwing one smoke bomb after another.
Er, I sort of do think that's how it works? The ultimate rebuttal to "you can't do X" is to actually do X. Until you do that I think that ultimately the burden of proof falls on you. It can be very easy to imagine certain tasks and systems can be automated - especially when you aren't actively involved in those tasks and systems and are unfamiliar with their intricacies.
...insert specific example of currently intractable problem...
What makes the problem intractable? We can now do both voice recognition and synthesis at human levels, and any video game programmer from the 1980s can keep some objects from running into each other.
When an emergency is declared, keep the other objects in a holding pattern and give the affected object permission to land. Then roll the fire trucks. Preferably not routing both the trucks and another aircraft onto the same runway, as the humans apparently did here.
It’s not weird that you believe automated ATC is possible. The weird thing is that you insist it’s simple.
People’s lives hang in the balance of a system built of corner cases. And you trot out radiation treatment as your metaphor? As if we didn’t royally fuck that up and kill a bunch of people at first.
The 'simple' remark was in response to your wide-eyed implication that 1000 takeoffs and landings per day is somehow a challenge for modern computing systems.
You'll lose this argument sooner or later. I just hope it happens before several hundred people find out the hard way that humans no longer have any business in a control tower. With your attitude, Therac-25 would have been seen as grounds to shut down the entire field of radiotherapy.
Your “simple” springs from your assumption that the problem is easy and anyone who disagrees is dumb. This is also why you can’t hear any of the answers others have given you. You don’t want answers. You want to be “right”.
No one thinks that the difficulty with automatic ATC is that computers have trouble counting 1000 things.
One approach that has always served me well in life is when someone appears to say something that seems obviously not true (like that computers can't count to 1000), consider whether I actually have misunderstood them.
> What makes the problem intractable? We can now do both voice recognition and synthesis at human levels, and any video game programmer from the 1980s can keep some objects from running into each other.
Great point!
It must be that despite the reliability, obvious advantages, and accessibility to "any video game programmer from the 1980s", everyone else is just choosing not to do it.
Alternatively, these things are not as simple or as reliable as you, a person who has no familiarity with the problem, assumes them to be.
The only difference between an excuse and a reason is the designator's belief as to the validity of the reason provided. You have already said you do not have the expertise required to assess validity, yet here you are doing it in order to avoid engaging with the substance of what people are saying.
If these are just "excuses" and not "reasons," then explain how you have determined them as such.
> while making sure these (already heavily automated) flight systems don't get confused and kill several hundred people
Confusion is indeed a common side effect of a job done halfway.
> I'm really confused at the point you're trying to make - you declared yourself not an expert in this field, while loudly declaring it's so easy to automate.
Because we've already done harder things. 1000 takeoffs and landings per day equals a trillion machine cycles between events... on the phone in your pocket. It is an extraordinary claim, requiring extraordinary proof, to say that this task isn't suitable for automation.
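The cycle-count figure is easy to sanity-check with back-of-the-envelope arithmetic; the 3 GHz clock is an assumed phone-class figure, and the result lands within a few x of the quoted "trillion":

```python
# Rough check of "a trillion machine cycles between events" for 1000 events/day.
events_per_day = 1000
seconds_between = 24 * 60 * 60 / events_per_day  # 86.4 seconds between events
cpu_hz = 3e9                                     # assumed ~3 GHz core
cycles_between = seconds_between * cpu_hz        # ~2.6e11 cycles per event
print(f"{cycles_between:.1e}")
```

Of course, raw cycle budget says nothing about the hard parts (verification, failure modes, coordination), which is the other side of this argument.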
> Why don't you do it then? What am I missing?
I'm not qualified to do it, I didn't say I was, and in any event, I don't work for free. I'm asking for concrete reasons why it's not feasible. Spoiler: there are no reasons, only excuses.
The concrete reason your ideas won’t work is you don’t have any.
It's not my job to explain how to do it, it's your job to explain why it can't or shouldn't be done. The extraordinary claim is yours, not mine.
Remember how we installed traffic lights all over the roads and now car crashes never happen any more at intersections? Truly automation solves all problems.
Hard to respond to an argument of this quality, at least without getting flagged or worse.
> I'm really confused at the point you're trying to make - you declared yourself not an expert in this field, while loudly declaring it's so easy to automate. Why don't you do it then? What am I missing?
I know this was rhetorical but the obvious answer is a complete lack of any actual ideas. “Just automate it” is a common refrain from people who don’t know how to fix the actual issues with any domain.
> Remember how we installed traffic lights all over the roads and now car crashes never happen any more at intersections? Truly automation solves all problems.
> I'm asking for concrete reasons why it's not feasible. Spoiler: there are no reasons, only excuses.
It sounds like you're not asking anything at all.
Just to play it out a bit, are you imagining that a pilot would be reporting a mechanical failure upon descent into busy airspace to some type of like AI voice agent, who will then orchestrate other aircraft out of the way (and not into each other) while also coaching the crippled aircraft out of the sky?
Are you imagining some vast simplification that obviates the need for such capability? Because that doesn't seem simple at all to me.
To repeatedly declare something simple to fix, but then have no idea how to fix it, and indeed to declare oneself unqualified to fix it, is kind of an astounding level of hubris.
> I'm asking for concrete reasons why it's not feasible.
The concrete reason your ideas won’t work is you don’t have any.
Are you sure they are? Loading the Dieline home page, I get a ton of tracking blocks. Defector was better, but I still see adtech endpoints. I don't really know of any modern monetized site that isn't like this.
Yea - and what a foolproof product. Chase $HYPE, boost stock price, quietly deprecate $HYPE in favor of $NEXT_HYPE, stock doesn't correct, just goes up more