I have been using Copilot, Cursor, then CC for a little more than a year now. I have written code with teams using these tools and I am writing mostly for myself now. My observations have been the following:
1) These tools have obviously improved significantly over the past 12 months. They can churn out code that makes sense in the context of the codebase, meaning their output is more grounded in the codebase they are working on, as opposed to the codebases they were trained on.
2) On the surface they are pretty good at solving known problems. You are not going to make them write a well-optimized renderer or an RL algorithm, but they can write run-of-the-mill business logic better _and_ faster than I can--if you optimize for both speed of production and quality.
3) Out of the box, their personality is to solve the problem in front of them as quickly as possible and move on. This leads them to make suboptimal decisions (e.g. solving a deadlock by sleeping for 2 seconds, CC Opus 4.5 just last night). This personality can be altered with appropriate guidance. For example, a shortcut I use is to append "idiomatic" to my request--"come up with an idiomatic solution" or "is that the most idiomatic solution we can think of." Similarly, when writing or reviewing tests I use "intent of the function under test," which makes the model output a better solution or better code.
4) These models, esp. Opus 4.5 and GPT 5.2, are remarkable bug hunters. I can point at a symptom and they come away with the bug. I then ask them to explain to me why the bug happens, and I follow the code to see if it's true. I have not caught a wrong diagnosis yet. They can find deadlocks and starvations; you then have to guide them to a good fix (see #3).
5) Code quality is not sufficient to create product quality, but it is often necessary to sustain it. The sustainability window is shorter nowadays. Therefore, more than ever, the quality of the code matters. I can see Claude Code slowly degrading in quality every single day--and I use it every single day for many hours. As much as it pains me to say this, compared to Opencode, Amp, and Toad I can feel the "slop" in Claude Code. I would love to study the codebases of these tools over time to measure their quality--I know it's possible for all but Claude Code.
6) I used to worry that I don't have a good mental model of the software I build. Much like journaling, I think there is something to be said for how the process of writing/making actually gives you a very precise mental model. However, I have been trying to let that go and use the model as a tool to query and develop the mental model post facto. It's not the same, but I think it is going to be the new norm. We need tooling in this space.
7) Despite your own experiences with these tools it is imperative that they be in your toolbox. If you have abstained from them thus far, perhaps the best way to incorporate them is to start by using them to attend to your toil.
8) You can still handcraft code. There is too much fun, beauty, and pleasure in it to deny yourself doing it. Don't expect this to be your job. This is your passion.
I want to say that your comment has been the most real, aligned thing I've read in this post's comments. The articulation of what I've also seen and felt is perfect. Whoever else passes by: THIS is the truth. What dnw has written is the honest-to-god state of things, and it does not rob you of the passion of creating.
> Despite your own experiences with these tools it is imperative that they be in your toolbox.
Why is it imperative? Whenever I read comments like this I just think the author is cynically drumming up hype because of the looming AI bubble collapse.
Fair question. It is "imperative" for two reasons. The first: despite having rough edges now, I find these tools to be actually useful, so they are here to stay. The second: I think most developers will use them and make them part of their toolchain. So, if one wants to be at parity with their peers, then it stands to reason that they adopt these tools as well.
In terms of bubbles: bubbles are economic concepts and they will burst, but the underlying technology finds its market. There are plenty of good open source models and open source projects like OpenCode/Toad that support them. We can use those without contributing (too much) to the bubble.
There's a financial AI bubble for sure - that's pretty much a mainstream opinion nowadays. But that's an entirely different thing from AI itself bubble-collapsing.
If you truly believe AI is simply going to collapse and disappear, you are deep in some serious cope and are going to be unpleasantly surprised.
I have seen Claude disable its sandbox. Here is the most recent example from a couple of weeks ago while debugging Rust:
"The panic is due to sandbox restrictions, not code errors.
Let me try again with the sandbox disabled:"
I have since added a sandbox around my ~/dev/ folder using sandbox-exec in macOS. It is a pain to configure properly, but at least I know where the sandbox is controlled.
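In case it helps anyone, this is roughly the shape of it (a minimal sketch rather than my exact profile; the paths and the dev.sb name are illustrative, and real profiles usually need a few more write allowances for caches and temp files):

    # write a minimal profile: allow everything, deny writes outside ~/dev
    cat > ~/dev.sb <<'EOF'
    (version 1)
    (allow default)
    (deny file-write*)
    ; the last matching rule wins, so carve write access back in
    (allow file-write* (subpath "/Users/me/dev"))
    (allow file-write* (subpath "/private/tmp"))
    EOF

    # run the agent under the profile (sandbox-exec warns it is deprecated but still works)
    sandbox-exec -f ~/dev.sb claude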
That refers to the sandbox "escape hatch" [1]: running a command without a sandbox requires a separate approval, so you get another prompt even if that command has been pre-approved. Their system prompt [2] is too vague about what kinds of failures the sandbox can cause; in my experience the agent always jumps straight to disabling the sandbox if a command fails. Probably best to disable the escape hatch and deal with failures manually.
"I believe that specification is the future of programming" says Marc Brooker, an influential DE at Amazon as he makes the case for spec driven development in the blog post.
If you are on macOS it is not a bad idea to wrap your claude or other coding agents in sandbox-exec. All the agents already use sandbox-exec; however, they can disable the sandbox. Agents execute a lot of untrusted code in the form of MCP, skills, plugins, etc.
One can go crazy with it a bit, using zsh chpwd, so a sandbox is created upon entry into a project directory and disposed of upon exit. That way one doesn't have to _think_ about sandboxing something.
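One way to wire that up (a sketch; it assumes the ~/dev.sb profile mentioned above and only wraps the claude command rather than the whole shell):

    # ~/.zshrc -- sandbox coding agents automatically while inside ~/dev
    autoload -U add-zsh-hook

    _sandbox_agents() {
      if [[ "$PWD" == "$HOME/dev"* ]]; then
        alias claude="sandbox-exec -f $HOME/dev.sb claude"
      else
        unalias claude 2>/dev/null
      fi
    }
    add-zsh-hook chpwd _sandbox_agents
    _sandbox_agents   # also cover the directory the shell starts in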
Last week I accidentally exposed my OpenAI, Anthropic, and Gemini keys. They somehow ended up in Claude Code logs(!) Within seconds I got an email from Anthropic, and they had already disabled my keys. Neither OpenAI nor Google alerted me in any way. I was able to log in to OpenAI and delete all the keys quickly.
Took me a good 10-15 minutes to _just_ _find_ where Gemini/AI Studio/Vertex project keys _might_ be! I had to "import project" before I could find where the key is. Google knew the key was exposed, but the key seemed to be still active, with a "!" next to it!
With a lot of vibe coding happening, key hygiene becomes crucial on both issuer and user ends.
There's a lot of performative "security" in such companies. You need to employ the right people (you need a "CISO", ideally someone who's never actually used a terminal in their life), you need to pay money for the right vendors, adopt the right buzzwords and so on. The amounts of money being spent on performative security are insane, all done by people who can't even "hack" a base64-"encrypted" password.
All while there's no budget for those that actually develop and operate the software (so you get insecure software), those that nevertheless do their best are slowed down by all the security theater, and customer service is outsourced to third-world boiler rooms so exploiting vulnerabilities doesn't even matter when a $100 bribe will get you in.
It's "the emperor has no clothes" all the way down: because any root-cause analysis of a breach (including by regulators) will also be done by those without clothes, it "works" as far as the market and share price is concerned.
Source: been inside those "companies of public significance" or interacted with them as part of my work.
Equifax? Capital One? 23andMe? My basis for this is that you can leak everyone’s bank data and barely have it show up in your stock price chart, especially long term.
Stock price is an extremely narrow view of the total consequences of lax cybersecurity but that aside, the notion that security doesn’t matter because those companies got hacked is ridiculous. The reason there isn’t an Equifax every minute is because an enormous amount of effort and talent goes into ensuring that’s the case. If your attitude is we should vibe code our way past the need for security, you aren’t responsible enough to hold a single user’s data.
I feel as if security is a much bigger concern than it ever was.
The main issue seems to be that our artifacts are now so insanely complex that there are too many holes, and modern hackers are quite different from the old skiddies.
In some ways, it’s possible that AI could be a huge boon for security, but I’m worried, because its training data is brogrammer crap.
Security has become a big talking point, and industry vultures have zeroed in on that and will happily sell dubious solutions that claim to improve security. There is unbelievable money sloshing around in those circles, even now during the supposed tech downturn ("security" seems to be immune to this).
Actual security on the other hand has decreased. I think one of the worst things to happen to the industry is "zero trust", meaning now any exposed token or lapse in security is exploitable by the whole world instead of having to go through a first layer of VPN (no matter how weak it is, it's better than not having it).
> quite different from the old skiddies
Disagreed - if you look at the worst breaches ("Lapsus$", Equifax, etc.), it was always down to something stupid - a vendor being socially engineered into handing over the keys to the kingdom, a known-vulnerable version of a Java web framework, yet another NPM package being compromised that they immediately updated to because the expensive, enterprise-grade Dependabot knockoff told them to, and so on.
I'm sure APTs and actual hacking exist in the right circles, but they're not behind the majority of breaches. You don't need an APT to breach most companies.
>Took me a good 10-15 minutes to _just_ _find_ where Gemini/AI Studio/Vertex project keys _might_ be
I feel like, with all this granular key management across everything (dev, life), I might be more insecure, but god damn, I don't feel like I know what is going on.
How did they get leaked? Just someone getting into your personal Claude Code logs? I'm surprised that, if it was just that, Google would even be aware they're leaked.
Claude was looking up env vars during the coding session, and those ended up in the ~/.claude/projects/ logs. I wanted to make the [construction] logs public with the code. I didn't think that was a leak vector.
How would Google or OpenAI have alerted you? Anthropic could alert you because they scan for their own keys and detected one of them in the logs. If anything, it's bad that Anthropic only notified you about their key, and not the other keys that had leaked.
They all partner with GitHub to detect leaked credentials. In order to have API keys I need to have an account with each service with a valid email, so all three of them had the same information and channels available to reach me. It wouldn't have mattered how the keys got leaked; in the current setup Anthropic would have reached me first and deactivated my key.
Claude (or other LLMs, for that matter) wouldn't know they leaked the keys because I did, by trying to make the construction logs public. I just wasn't expecting the logs to have keys in them from my env vars.
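For anyone else publishing these session logs, a quick scan before pushing them out catches most of this (a sketch; the patterns are the commonly documented key prefixes for Anthropic, OpenAI, and Google, so adjust for whatever providers you use):

    # grep the session logs for strings that look like provider API keys
    grep -rEn 'sk-ant-[A-Za-z0-9_-]{20,}|sk-[A-Za-z0-9_-]{20,}|AIza[0-9A-Za-z_-]{35}' ~/.claude/projects/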
I was thinking this was going to happen, because just last night I got an email about them fixing how they collect sales taxes. Having been part of a couple of IPOs/acquisitions, I thought to myself: "Nobody cares about sales taxes until they need to IPO or sell."
I too am curious. My daily driver has been Claude Code CLI since April. I just started using Codex CLI and there are a lot of gaps--the most annoying being that permissions don't seem to stick. I am so used to plan mode in Claude Code CLI and really miss that in Codex.