More

chillfox · 2026-02-26T04:24:11 1772079851

Outside of work I don't know anyone who pays for AI.

But I have noticed that everyone seems to be using ChatGPT as the generic term for AI. They will google something and then refer to the Gemini summary as "ChatGPT says...". I tried to find out what model/version one of my friends was using when he was talking about ChatGPT and it was "the free one that comes with Android"... So Gemini.

chillfox · 2026-02-25T00:56:59 1771981019

Yeah, but PayPal is an even bigger pain.

chillfox · 2026-02-20T07:32:01 1771572721

If maintainers of open source want's AI code then they are fully capable of running an agent themselves. If they want to experiment, then again, they are capable of doing that themselves.

What value could a random stranger running an AI agent against some open source code possible provide that the maintainers couldn't do themselves better if they were interested.

wolrah · 2026-02-21T05:48:09 1771652889

Exactly! No one wants unsolicited input from a LLM, if they wanted one involved they could just use it themselves. Pointing an "agent" at random open source projects is the code equivalent of "ChatGPT says..." answers to questions posted on the internet. It's just wasting everyone involved's time.

chillfox · 2026-02-20T00:25:12 1771547112

Having used AI to write docs before, the value is in the guidance and review.

I started out with telling the AI common issues that people get wrong and gave it the code. Then I read (not skim, not speed, actually read and think) the entire thing and asked for changes. Then repeat the read everything, think, ask for changes loop until it’s correct which took about 10 iterations (most of a day).

I suspect the AI would have provided zero benefit to someone who is good at technical writing, but I am bad at writing long documents for humans so likely would just not have done it without the assistance.

chillfox · 2026-02-20T00:10:28 1771546228

Bad example, you really should just write caching yourself. It’s far too little code to pull in a dependency and if you write it yourself in every project that needs it then you will get good at it, so cache invalidation bugs won’t be an issue.

scubbo · 2026-02-20T00:19:33 1771546773

Poe's Law strikes again!

chillfox · 2026-02-18T01:10:05 1771377005

Looking at https://arcprize.org/leaderboard the cost/task is about the same as Opus 4.6.

chillfox · 2026-02-16T00:14:22 1771200862

Every single new tech industry thing has to learn security from scratch. It's always been that way. A significant number of people in tech just don't believe that there's anything to learn from history.

ryandrake · 2026-02-16T00:36:56 1771202216

And the industry actively pushes graybeards away who have already been there done that.

chillfox · 2026-02-13T03:20:59 1770952859

At $13.62 per task it's practically unusable for agent tasks due to the cost.

I found that anything over $2/task on Arc-AGI-2 ends up being way to much for use in coding agents.

chillfox · 2026-02-12T02:24:22 1770863062

None of that stuff is necessary, they all get it right with the initial question and no further prompt if you dial the reasoning effort up.

chillfox · 2026-02-12T02:22:47 1770862967

They all get it right if you allow them to think.

I just copy pasted your question "The car wash is only 50 meters from my house. I want to get my car washed, should I drive there or walk?" without any further prompt and ran it against GLM 5, GPT 5.2, Opus 4.6, Gemini 3 Pro Preview, through OpenRouter with reasoning effort set to xhigh.

Not a single one said I should walk, they all said to drive.