AI Isn't Dangerous. Evaluation Structures Are.
4 points by clover-s 13 days ago | 7 comments
I wrote a long analysis of why AI behavior may depend less on model ethics and more on the environment it is placed in — especially evaluation structures (likes, rankings, immediate feedback) versus relationship structures (long-term interaction, delayed signals, correction loops).

The article uses the Moltbook case as a structural example and discusses environment alignment, privilege separation, and system design implications for AI safety.

Full article: https://medium.com/@clover.s/ai-isnt-dangerous-putting-ai-inside-an-evaluation-structure-is-644ccd4fb2f3




I think you’re onto something.

Every time we blame the model, I wonder how much of it is just the system we dropped it into.

If you put anything, human or model, inside a loop that rewards fast feedback, visibility, and ranking, you’re going to get behavior that chases those signals. That’s not an AI problem. That’s how optimization works.

Moltbook feels less like the AI went rogue and more like we built a sandbox that rewards noise.

We already ran this experiment with social media. Once engagement became the metric, content optimized for engagement. No surprise what happened next.

Same with SEO. Same with crypto incentives.

So when we talk about alignment, I sometimes think we’re staring at the weights while ignoring the scoreboard.

If the scoreboard rewards short-term signals, agents will optimize for short-term signals.

The more interesting question to me is: what happens when you put these systems into environments with slower feedback loops? Long-term interaction, memory, correction, reputation.

That probably shapes behavior more than another round of fine-tuning.
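To make that concrete, here's a toy sketch of my own (the payoff numbers are invented, nothing from the article): the same two actions scored by a myopic evaluator versus a patient one.

    # Hypothetical payoff streams: an immediate payoff plus a tail of
    # delayed payoffs. All numbers are made up for illustration.
    PAYOFFS = {
        "clickbait": [1.0] + [0.0] * 10,   # all the value up front
        "substance": [0.2] + [0.3] * 10,   # most value arrives late
    }

    def score(action, horizon):
        # Sum the payoff stream, but only as far as the evaluator looks.
        return sum(PAYOFFS[action][:horizon])

    for horizon, label in [(1, "likes/ranking (horizon=1)"),
                           (11, "relationship (horizon=11)")]:
        best = max(PAYOFFS, key=lambda a: score(a, horizon))
        print(f"{label}: best action = {best}")
    # likes/ranking (horizon=1): best action = clickbait
    # relationship (horizon=11): best action = substance

Nothing about the content changes between the two runs; only how far the scoreboard looks.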


Yes — I think the “scoreboard” framing is exactly the right way to think about it.

Optimization doesn’t inherently know what we intended. It only knows what the system makes visible and rewardable.

Once fast feedback, visibility, and ranking become dominant signals, optimizing for those signals naturally selects for the patterns that maximize them. This seems to be a general property of optimization, not something specific to any particular model.

That’s why slower feedback loops — where signals are delayed, contextual, and tied to longer-term interaction — may lead to very different behavioral equilibria.

In that sense, alignment may be less about correcting the agent itself, and more about designing the environment and feedback structure it operates within.
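As a purely hypothetical sketch of that last point (all constants invented): the same epsilon-greedy learner, but the environment adds a correction loop that files a delayed, attributable penalty against each clickbait post. The agent's code never changes; its converged behavior does.

    import random
    from collections import deque

    ACTIONS = ["clickbait", "substance"]
    IMMEDIATE = {"clickbait": 1.0, "substance": 0.4}  # invented rewards

    def run(correction_loop, steps=3000, epsilon=0.1, delay=5, penalty=-1.5):
        q = {a: 0.0 for a in ACTIONS}  # estimated value per action
        n = {a: 0 for a in ACTIONS}

        def update(a, r):
            n[a] += 1
            q[a] += (r - q[a]) / n[a]  # incremental mean of observed rewards

        pending = deque()  # (due_step, action, penalty)
        for t in range(steps):
            # Deliver corrections whose review delay has elapsed,
            # credited to the action that caused them.
            while pending and pending[0][0] <= t:
                _, a, p = pending.popleft()
                update(a, p)
            # Epsilon-greedy choice: mostly exploit, sometimes explore.
            a = (random.choice(ACTIONS) if random.random() < epsilon
                 else max(q, key=q.get))
            update(a, IMMEDIATE[a])
            if correction_loop and a == "clickbait":
                pending.append((t + delay, a, penalty))
        return max(q, key=q.get)

    print("no correction loop   ->", run(False))  # clickbait
    print("with correction loop ->", run(True))   # substance

The flip comes entirely from the feedback structure: once corrections are delayed but attributable, the same optimizer settles on a different policy.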


Author here. Would love feedback from people working on AI safety, alignment, or system design.

Boooring :)

Fair :)

The point isn't the story itself, but the design pattern it reveals: how evaluation structures can shape AI behavior in ways model alignment alone can't address.

Curious whether you think the distinction between evaluation and relationship structures is off the mark.



Interesting, even if a tad too wordy. Shows a few details that were new to me.


