jpcompartir · 2026-03-24T10:59:05 1774349945

This looks like a Claude-generated SVG to me, is it not?

anhner · 2026-03-24T12:05:33 1774353933

It's 100% claude-generated html. I asked it to create some other cheat sheet for me and the template was identical.

Edit: https://news.ycombinator.com/item?id=47495528

jpcompartir · 2026-03-23T19:35:25 1774294525

There are better techniques for hyper-parameter optimisation, right? I fear I have missed something important: why has Autoresearch blown up so much?

The bottleneck in AI/ML/DL is always data (volume & quality) or compute.

Does/can Autoresearch help improve large-scale datasets? Is it more compute-efficient than humans?

bonoboTP · 2026-03-23T20:58:12 1774299492

There is a field of AutoML, with its own specialized academic literature and libraries that tried to achieve this type of thing but didn't work very well in practice.

Years ago there were big hopes for Bayesian hyperparameter optimization (predicting performance with Gaussian processes, the hyperopt library, etc.), but it often launched wasteful experiments because it really didn't have any idea what the parameters did. People mostly just do grid search and random search with a configuration set up by intuition and experience. Meanwhile, an LLM can see what each hyperparameter does, it can see what techniques and settings have worked in the literature, and it can do something approximating common sense about what has a big enough effect. It's surprisingly difficult to precisely define when a training curve has really flattened, for example.

So in theory there are many non-LLM approaches but they are not great. Maybe this is also not so great yet. But maybe it will be.
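
For reference, the random-search baseline mentioned above can be sketched in a few lines. This is only an illustration, not anything taken from Autoresearch: `toy_loss`, the `log_lr` and `dropout` names, and their ranges are hypothetical stand-ins for a real validation loss and search space.

```python
import random

def random_search(objective, space, n_trials, seed=0):
    """Sample each hyperparameter uniformly from its (low, high) range; keep the best."""
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    for _ in range(n_trials):
        config = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = objective(config)
        if score < best_score:  # lower is better
            best_config, best_score = config, score
    return best_config, best_score

def toy_loss(config):
    # Hypothetical stand-in for a validation loss: sensitive to log_lr, barely to dropout.
    return (config["log_lr"] + 3.0) ** 2 + 0.01 * (config["dropout"] - 0.1) ** 2

space = {"log_lr": (-6.0, 0.0), "dropout": (0.0, 0.5)}
best_config, best_score = random_search(toy_loss, space, n_trials=200)
```

The sampler spends as many draws on `dropout` as on `log_lr` even though only one of them matters here, which is the kind of waste the comment is pointing at.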

nextos · 2026-03-23T19:44:11 1774295051

AFAIK, it's a bit more than hyper-parameter tuning as it can also make non-parametric (structural) changes.

Non-parametric optimization is not a new idea. I guess the hype is partly because people hope it will be less brute force now.

gwerbin · 2026-03-23T19:57:45 1774295865

It's an LLM-powered evolutionary algorithm.

ainch · 2026-03-23T20:07:36 1774296456

I'd like to see a system like this take more inspiration from the ES literature, similar to AlphaEvolve. Let's see an archive of solutions, novelty scoring and some crossover rather than purely mutating the same file in a linear fashion.
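
A rough sketch of what an archive, novelty scoring and crossover could look like on a toy problem. Everything here is an assumption for illustration: candidates are plain float vectors, `fitness` is a made-up objective, and `mutate`/`crossover` are simple numeric operators where an LLM-driven system would presumably call a model to edit or merge source files. It is not how AlphaEvolve or Autoresearch is implemented.

```python
import random

def fitness(x):
    # Toy objective (assumption): higher is better, optimum at the origin.
    return -sum(v * v for v in x)

def distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

def novelty(x, archive, k=3):
    # Mean distance to the k nearest other solutions in the archive.
    nearest = sorted(distance(x, other) for other in archive if other is not x)[:k]
    return sum(nearest) / len(nearest) if nearest else 0.0

def crossover(a, b, rng):
    # Uniform crossover: each coordinate comes from one parent or the other.
    return [p if rng.random() < 0.5 else q for p, q in zip(a, b)]

def mutate(x, rng, sigma=0.3):
    return [v + rng.gauss(0.0, sigma) for v in x]

def evolve(dim=4, generations=20, pop_size=8, novelty_weight=0.1, seed=0):
    rng = random.Random(seed)
    # The archive keeps every candidate ever evaluated, not just the current best.
    archive = [[rng.uniform(-2.0, 2.0) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by fitness plus a novelty bonus, then breed from the top of the ranking.
        ranked = sorted(
            archive,
            key=lambda x: fitness(x) + novelty_weight * novelty(x, archive),
            reverse=True,
        )
        parents = ranked[:pop_size]
        children = []
        for _ in range(pop_size):
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        archive.extend(children)
    return max(archive, key=fitness), archive

best, archive = evolve()
```

The contrast with purely linear mutation is that parents are drawn from the whole archive, so an earlier branch can be picked up again later.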

nextos · 2026-03-23T20:50:55 1774299055

Exactly, that's the way forward.

There are lots of old ideas from evolutionary search worth revisiting given that LLMs can make smarter proposals.

UncleOxidant · 2026-03-23T21:13:51 1774300431

That was my impression. That includes evolutionary programming, which would normally happen at the AST level; with an LLM it can happen at the source level.
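
To make the AST-level half of that contrast concrete, here is a minimal sketch of a single mutation operator using Python's standard `ast` module. The `SwapAddForMult` class and the function `f` are hypothetical examples; a real evolutionary-programming system would choose mutations randomly from many operators.

```python
import ast

class SwapAddForMult(ast.NodeTransformer):
    """Toy mutation operator: rewrite every `a + b` into `a * b`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # mutate nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node

source = "def f(x, y):\n    return x + y\n"

# Parse to a tree, mutate the tree, then turn it back into source and code.
tree = SwapAddForMult().visit(ast.parse(source))
ast.fix_missing_locations(tree)
mutated_source = ast.unparse(tree)  # requires Python 3.9+

namespace = {}
exec(compile(tree, "<mutant>", "exec"), namespace)
mutant = namespace["f"]
```

A tree-level operator like this always yields syntactically valid code but knows nothing about intent, whereas a source-level edit proposed by an LLM can be guided by names, comments and context.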

coppsilgold · 2026-03-23T19:55:10 1774295710

Perhaps LLM-guided Superoptimization: <https://en.wikipedia.org/wiki/Superoptimization>

I recall reading about a stochastic one years ago: <https://github.com/StanfordPL/stoke>

frumiousirc · 2026-03-23T20:35:22 1774298122

> There are better techniques for hyper-parameter optimisation, right?

Yes, for example "swarm optimization".

The difference with "autoresearch" (restricting just to the HPO angle) is that the LLM may (at least we hope) beat conventional algorithmic optimization by making better guesses for each trial.

For example, perhaps the problem has an optimization manifold that has been studied in the past and the LLM either has that study in its training set or finds it from a search and learns the relative importance of all the HP axes. Given that, it "knows" not to vary the unimportant axes much and focus on varying the important ones. Someone else did the hard work to understand the problem in the past and the LLM exploits that (again, we may hope).
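
A minimal sketch of the "vary the important axes, leave the rest alone" idea, assuming the per-axis importance is already known. The `importance` weights, `toy_loss`, and the `log_lr`/`log_eps` names are all hypothetical; in the scenario described above they would come from whatever prior study the LLM recalls or finds.

```python
import random

def propose(current, importance, rng, base_step=0.5):
    """Perturb each axis with a Gaussian step scaled by its assumed importance."""
    return {
        name: value + rng.gauss(0.0, base_step * importance[name])
        for name, value in current.items()
    }

def guided_search(objective, start, importance, n_trials=200, seed=0):
    rng = random.Random(seed)
    best, best_score = dict(start), objective(start)
    for _ in range(n_trials):
        candidate = propose(best, importance, rng)
        score = objective(candidate)
        if score < best_score:  # lower is better; keep only improvements
            best, best_score = candidate, score
    return best, best_score

def toy_loss(config):
    # Hypothetical loss: strongly depends on log_lr, almost not at all on log_eps.
    return (config["log_lr"] + 3.0) ** 2 + 1e-4 * (config["log_eps"] + 8.0) ** 2

start = {"log_lr": -1.0, "log_eps": -8.0}
importance = {"log_lr": 1.0, "log_eps": 0.0}  # assumed prior knowledge
best, best_score = guided_search(toy_loss, start, importance)
```

With an importance of zero the `log_eps` axis is never moved, so every trial is spent on the axis that actually changes the loss.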

janalsncm · 2026-03-23T21:55:03 1774302903

> The bottleneck in AI/ML/DL is always data (volume & quality) or compute.

Not true at all. The whole point of ML is to find better mappings from X to Y, even for the same X.

Many benchmarks can’t be solved by just throwing more compute at the problem. They need to learn better functions which traditionally requires humans.

And sometimes an algorithm lets you tap into more data. For example transformers had better parallelism than LSTMs -> better compute efficiency.

jpcompartir · 2026-03-24T10:54:59 1774349699

Fair push back, but I do think the LSTM vs Transformers point kinda supports my position in the limit, not refutes. Once the compute bottleneck is removed, LSTMs scale favourably. https://arxiv.org/pdf/2510.02228 (I believe there's similar work done on vanilla LSTMs, but I'd have to go digging)

So the bottleneck was compute. Which is compatible with 'data or compute'. But to accept your point, at the time the algorithmic advances were useful/did unlock/remove the bottleneck.

A wider point is that eventually (once compute and data are scaled enough) the algorithms are all learning the same representations: https://arxiv.org/pdf/2405.07987

And of course the canon: https://nonint.com/2023/06/10/the-it-in-ai-models-is-the-dat... http://www.incompleteideas.net/IncIdeas/BitterLesson.html

Scaling compute & data > algorithmic cleverness

janalsncm · 2026-03-24T17:22:36 1774372956

Algorithms do matter because compute is not unlimited in practice. Otherwise we might as well use bogo sort because the result is eventually the same. Yes the platonic ideal of a sorted list looks the same but that doesn’t tell you anything about how to get there or whether you can in this lifetime.

I bring up transformers because scaling compute and data was unlocked by a better algorithm. It matters a lot because scaling compute isn’t always an option.

hun3 · 2026-03-23T19:40:23 1774294823

> There are better techniques for hyper-parameter optimisation, right?

There always are. You need to think about what those would be, though. Autoresearch outsources the thinking to LLMs.

jpcompartir · 2026-02-27T09:21:18 1772184078

"Regardless, these threats do not change our position: we cannot in good conscience accede to their request."

calgoo · 2026-02-27T09:27:05 1772184425

Yes, that is great, for people from the US. For people in Europe and other locations, this just proves that they don't really care, as the tool is already being used against us. It's quite clear to me that anyone outside the US should immediately cancel all contracts with these corporations, as well as work their hardest at blocking their bots online.

jpcompartir · 2026-02-27T15:26:38 1772205998

As a non-US citizen, I'm quite glad in the knowledge that Claude won't be used to kill other non-US citizens with autonomous weapons

jpcompartir · 2026-02-21T13:07:17 1771679237

This is great, brings clear benefits to both sides and the rest of us.

Always rooting for Hugging Face

jpcompartir · 2026-02-19T22:23:10 1771539790

Yep, Gemini is virtually unusable compared to Anthropic models. I get it for free with work and use maybe once a week, if that. They really need to fix the instruction following.

jpcompartir · 2026-02-12T09:47:37 1770889657

Thanks for the long and considered response, but this is a really ugly UX decision.

As others have said - 'reading 10 files' is useless information - we want to be able to see at a glance where it is and what it's doing, so that we can re-direct if necessary.

With the release of Cowork, couldn't Claude Code double down on the needs of engineers?

jpcompartir · 2026-02-11T16:55:47 1770828947

This is great, not 10 minutes before this outage did I present Railway as a viable option for some small-scale hosting for prototypes and non-critical apps as an alternative to the Cloud giants

ezekg · 2026-02-11T17:13:52 1770830032

It always happens that way. I guarantee some people migrated from Heroku to Railway and bragged about future stability to the team, only to experience this.

jpcompartir · 2026-02-11T17:25:57 1770830757

Yeah 100%

This won't change my decision, but it is still impeccable timing

jpcompartir · 2026-02-06T21:08:13 1770412093

4.6 is a beast.

Everything in plan mode first + AskUserQuestionTool, review all plans, get it to write its own CLAUDE.md for coding standards and edit where necessary and away you go.

Seems noticeably better than 4.5 at keeping the codebase slim. Obviously it still needs to be kept an eye on, but it's a step up from 4.5.

nwienert · 2026-02-06T21:34:39 1770413679

Not clearly a step up for me: it seems way more hesitant, and I don't notice the context being larger at all; it seems to compact just as often.

jpcompartir · 2026-01-12T21:30:01 1768253401

I've been working with a claude-specific directory in Claude Code for non-coding work (and the odd bit of coding/documentation stuff) since the first week of Claude Code, or even earlier - I think when filesystem MCP dropped.

It's a very powerful way to work on all kinds of things. V. interested to try co-work when it drops to Plus subscribers.

jpcompartir · 2025-10-31T10:01:30 1761904890

I can't remember which paper it's from, but isn't the variance in performance explained by # of tokens generated? i.e. more tokens generated tends towards better performance.

Which isn't particularly amazing, as # of tokens generated is basically a synonym in this case for computation.

We spend more computation, we tend towards better answers.