Hacker News | yacin's comments

Any paper like this would easily take a year or more to write and go through the submission/review/rebuttal/revision/acceptance process. I don't understand why the models being a year or two old now is worth noting as though it's a clear weakness? What should they do, publish sub-standard results more quickly?

> I don't understand why the models being a year or two old now is worth noting as though it's a clear weakness?

I do think it's a clear weakness. Capabilities are extremely different than they were twelve months ago.

> What should they do, publish sub-standard results more quickly?

Ideally, publish quality results more quickly.

I'm quite open to competing viewpoints here, but it's my impression that the academic publishing cycle isn't really contributing to the AI discussion in a substantive way. The landscape is just moving too quickly.


The onus is on you to prove or at least convincingly argue that the results are unlikely to generalize across incremental model releases. In my personal experience, the overly affirming nature seems to have held since GPT-3. What makes you think a newer, larger model would not exhibit this behavior? Beyond "they're more capable"? I'd argue that being more capable doesn't mean less sycophantic.

It's certainly possible some of the new advances (chain-of-thought, some kind of agentic architecture) could lessen or remove this effect. But that's not what the paper was studying! And if you feel strongly about it, you could try to further the discussion with results instead of handwavingly dismissing others' work.


The onus of persuasion is on the persuader, and publishing a study on old models that no one uses anymore isn’t persuasive. I don’t need to prove anything to decide that you haven’t changed my mind.

By this logic there can be hundreds of studies that all show the pattern, including a 100% accurate prediction of the results for the next model and none of them would be "persuasive", because OpenAI decided to always release a new model the day before the paper is published.

So what you're saying here is that you were never open to "persuasion" and it was just a front to waste everyone's time.


I think you are absolutely right. (had to)

Capabilities are not the same thing as personality.

Upgrading a robot that knows how to lay bricks to one that also knows how to lay plaster won't make it a better therapist.


Common Lisp in particular is multi-paradigm. You can write a ton of code and never use recursion once. I doubt bridging this "gap" was in any way difficult.

this has to be the first of many right? fingers crossed this leads to some meaningful change.

You mean it's the first of many appeals, I assume.

Trial courts will decide pretty much anything. Then the case gets appealed over whether the trial court correctly interpreted things you probably perceive as uncomplicated, like the 1st Amendment.


It's a huge deal because it was the bellwether case for over 1,000 other similar cases.

ah yup:

> It comes on the heels of a Delaware court decision clearing Meta’s insurers of responsibility for damages incurred from “several thousand lawsuits regarding the harm its platforms allegedly cause children” — a ruling that could leave it and other tech titans on the hook for untold future millions.


I wonder at what point children become such a liability for platforms that it's easier to just ban all children altogether.

Children don't have disposable income to buy ads/subscriptions. They don't have experience to write about. The only thing they have that adults don't is time which translates into engagement metrics.

In an ideal world, the adults that buy/manage the computers would create age-restricted accounts for children, and the OS would give this information to the browser, which would just transmit it via HTTP. This is the safest method to verify ages. If an operating system doesn't want to support this, it's ultimately the adult's responsibility to install one that supports it. This would mean there would be no burden on the adults (the majority of the planet) to verify their ages, so there would be no burden on the platforms to restrict ages either.

If platforms could verify ages without inconveniencing their main user base, I wonder if platforms would just start banning all minors, or if there is some reason to allow minors on the platform that justifies all the liability surrounding them.
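
A rough sketch of the OS-to-browser-to-HTTP age signal described above, in Python. Everything here is an assumption for illustration: the `X-Age-Restricted` header is hypothetical (no such standard header exists), and `browser_headers` and `platform_allows` are made-up names, not any real browser or platform API.

```python
# Hypothetical header name; this is NOT a real or standardized HTTP header.
AGE_HEADER = "X-Age-Restricted"


def browser_headers(os_account_is_child: bool) -> dict[str, str]:
    """Headers a browser might attach, based on a flag read from the OS account."""
    # Adult accounts send nothing extra, so adults carry no verification burden.
    return {AGE_HEADER: "1"} if os_account_is_child else {}


def platform_allows(headers: dict[str, str], adults_only: bool) -> bool:
    """Server-side check: refuse flagged child accounts on adults-only platforms."""
    is_child = headers.get(AGE_HEADER) == "1"
    return not (adults_only and is_child)
```

In this sketch the platform never learns an exact age or identity, only whether the parent-managed account set the flag, and it only works if the OS and browser cooperate.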


Children are an extremely valuable ad target.

They have their hands directly on their parents' heartstrings, and their parents have a credit card.

This isn't anything new, think about the toy ads we had on TV when we were young.


I guess you are right. I assumed that something like Youtube Kids would have no ads at all given the audience, but it seems it does have ads targeted at young children. Bleak world we live in.

Nobody takes “age-restricted account[s] for children” seriously.

Parental controls and age-restrictions are almost universally half-baked, buggy fig leaves to displace negative attention from software and content providers.


Yep. The insurance covers accidents and negligence, not deliberate decisions to inflict harm on children for financial gain.

Sounds too good to be true. I’ll hold my breath.

I too am in "Sloath Pose"

maybe it's just from being covered in Faygo?

Faygo is unironically delicious. They used to sell them for $1 a pop (Midwestern pun intended) on the East Coast in gas stations. Diet varieties of Orange, Moon Mist, and Root Beer were personal favorites.

No idea whether this is still the case as I haven't been in a Sheetz in years.


Most grocery stores still sell Faygo in Michigan. But you rarely see more than the most popular 3 or so (boring) flavors. I remember there being at least a half-dozen different Faygo flavors at every kid's birthday party in the '80s.

you just called rock 'n rye a boring flavor??

it's one of the regulars here i'd say. in my head i see cola/rye/cream as the "always available" at convenience stores etc


Yeah I guess you're right.. Rock-N-Rye and Cream Soda are actually pretty great.

Sheetz quit carrying Faygo long ago, at least in and around Philly.

does this knob to make number go up ever _not_ work?


It always works. I wonder if CEOs build up bloat as a tool to use.


wish there were an option to disable the annoying startup messages with emojis when using the library.


also just another clear ripoff. they copy, they acquire, but they cannot seem to innovate.


Meta seem fairly innovative. Their r&d labs seem to produce some really cool things.

The basic issue is none of it seems to be making any money when it ends up in products and services.

Main reason there seems to be that their walled garden approach is tolerated at best. They just aren't very good at it outside of a feed.


Some of what Meta shares or open-sources is discreet but amazing: lz4, zstd, and the rest of Yann Collet's work; io_uring (don't know if Jens Axboe is still there); the open timecard projects; and the overall OCP work.


yeah, i'm referring to product innovations, but i should've made that explicit.


should probably just ban gambling for children but seems like a good first step.


