Hacker News | otterley's comments

Might the conclusions be correct even if some of the facts are not? Even a stopped clock is right twice a day. And, "approximately correct" is still sometimes valuable.

I think the court dropped the ball here. On the one hand, I think they were right that using existing works--copyrighted or otherwise--to train a model was transformative fair use. On the other hand, Anthropic and others trained their models on illicit copies of the works; they (more often than not) didn't pay the copyright holders.

There's a doctrine in Fourth Amendment law called "fruit of the poisonous tree." The general rule is that prosecutors don't get to present evidence in a criminal trial that they gained unlawfully. It's excluded. The jury never gets to see it even if it provides incontrovertible evidence of guilt. The point is to discourage law enforcement from violating the rights of the accused during the investigative process, and to obtain a warrant as the Amendment requires.

It seems to me that the same logic ought to be applied to these companies. They want to make money by building the best models they can. That's fine! They should be able to use all the source data they can legitimately obtain to feed their training process. But if they refuse to do so and resort to piracy, they mustn't be allowed to claim that they then used it fairly in the transformative process.


I mean, that is what the court said! Training on pirated data was not fair use. Training on legally acquired data is fair use.

Anthropic legally acquired the data and re-trained on it before release.


Nothing today; but in a democracy, we have the power to make it possible, if people vote the right way.

The '90s was a bit too soon for that. Most people using the Internet then were still on dialup, to the extent they were connected at all. There weren't that many DDoSes yet. Even the Trin00 DDoS in 1999 only involved 114 machines.

Via which front end? It can’t be the NFS one.

The unit of granularity for a CoW filesystem is a block, which is typically 4kB or smaller. The unit of granularity for S3 is the entire object or 5MB (minimum multipart upload size), whichever is smaller. The difference can be immense.
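To make the difference concrete, here's a back-of-the-envelope calculation (treating the 5 MB minimum as 5 MiB; the numbers are illustrative only): rewriting a 5 MiB part to change one 4 KiB block is roughly 1280x write amplification.

```shell
# Write amplification: smallest S3 rewrite unit (5 MiB part) vs. a 4 KiB block.
# Illustrative numbers only; real objects and parts vary in size.
echo "$(( (5 * 1024 * 1024) / 4096 ))x"
```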

Did you need to make this blog post 20 pages long and have AI write it? Especially in such dramatic style?

Remember the golden rule: if you can't be bothered to write it yourself, why should your audience be bothered to read it themselves?


Sounds like it affects every open TCP connection, not just OpenClaw. (It's pretty rare for a TCP connection to live that long, though.)

Individual TCP connections don't need to live that long. Once a macOS system reaches 49.7 days of uptime, this bug starts affecting all TCP connections.
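(49.7 days is suspiciously close to a 32-bit counter of milliseconds wrapping around. Assuming that's the mechanism, the arithmetic checks out:)

```shell
# 2^32 milliseconds expressed in days -- the classic 32-bit rollover interval.
awk 'BEGIN { printf "%.1f days\n", 2^32 / (1000 * 60 * 60 * 24) }'
```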

> Once a macOS system reaches 49.7 days of uptime, this bug starts affecting all TCP connections.

Current `uptime` on my work MacBook (macOS 15.7.4):

    17:14  up 50 days, 22 mins, 16 users, load averages: 2.06 1.95 1.94
Am I supposed to be having issues with TCP connections right now? (I'm not.)

My personal iMac is at 279 days of uptime.


According to the post:

$ netstat -an | grep -c TIME_WAIT

If the count it returns keeps growing, you're seeing a slow leak. At some point, new connections will start failing. How soon depends entirely on how quickly your machine opens and closes connections.

Since a lot of client traffic involves the server closing connections instead, I imagine it could take a while.

It's unclear if it'll leak whenever your Mac closes a connection or only when it fails to get a (FIN, ACK) back from the peer so the TIME_WAIT garbage collector runs. If it's the latter, then it could take substantially longer, depending on connection quality.
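A quick way to watch for the leak is to wrap the command from the post in a helper and sample it over time (the function name is mine; this assumes macOS netstat output):

```shell
# Count sockets stuck in TIME_WAIT; call periodically and watch the trend.
count_time_wait() {
  netstat -an | grep -c TIME_WAIT
}

# e.g. sample once a minute:
# while true; do echo "$(date +%H:%M:%S) $(count_time_wait)"; sleep 60; done
```

A count that climbs and never drains back down is the symptom to look for.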


    % netstat -an | grep -c TIME_WAIT | wc -l
       1

You want to drop the wc -l.

`grep -c` prints the match count on a single line, so it always emits exactly 1 line, so piping it to `wc -l` will always return 1.

Or just run `netstat -an | grep TIME_WAIT` and watch it. If any entries don't disappear after a few minutes, then you're seeing the issue.


They probably aren’t affected because the buggy code was only added in macOS 26:

https://github.com/apple-oss-distributions/xnu/blame/f6217f8...


Ouch - "every Mac" from the original post is a hallucination then.

I can live with the writing style when the topic is interesting (here it was for me) but complete untruths are much worse.


You can run `sysctl kern.boottime` to get when it was booted and do the math from there.
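For the math, a small sketch (the sed pattern assumes macOS's `{ sec = ..., usec = ... }` output format for `kern.boottime`):

```shell
# Uptime in whole days, computed from the kernel boot timestamp.
boot=$(sysctl -n kern.boottime | sed 's/^{ sec = \([0-9]*\),.*/\1/')
now=$(date +%s)
echo "up $(( (now - boot) / 86400 )) days"
```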

I also can't reproduce it. I want to say I've encountered this issue at least once; yesterday, before I rebooted, my uptime was 60 days.

But it's not instant; it just never releases connections. So depending on activity, you could have three years of uptime and never run out of connections, or run out shortly after hitting the bug.


I'm just going from the bug description in the article, but it seems that depending on your network activity, the exact time you actually notice an impact could vary quite a bit.

if it's in keepalive or retransmission timers, desktop use would mask it completely. browsers reconnect on failure, short-lived requests don't care about keepalives. you'd only notice in things that rely on the OS detecting a dead peer — persistent db connections, ssh tunnels, long-running streams.

> 17:14 up 50 days, 22 mins, 16 users, load averages: 2.06 1.95 1.94

> Am I supposed to be having issues with TCP connections right now? (I'm not.)

If my skim read of the slop post is correct, you'll only have issues on that machine if it hasn't spent any of that time asleep. (I have one Macbook that never sleeps, and I'm pretty sure it hit this bug a week or two back.)


Sure they do. They need to live until torn down.

They almost never do live that long, for whatever reason, but they should.


I meant that having a connection live that long isn't necessary to trigger this bug. I know that for some workloads, it can be important for connections to live that long.

Obviously, OpenClaw is now more important than anything else.

For OpenClaw this bug is a security feature

Don’t ask the LLM to do that directly: ask it to write a program to answer the question, then have it run the program. It works much better that way.

But for lisp, a more complex solution is needed. It's easy for a human lisp programmer to keep track of which closing parentheses corresponds to which opening parentheses because the editor highlights parentheses pairs as they are typed. How can we give an LLM that kind of feedback as it generates code?
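One low-tech answer, sketched in shell: run a balance check over each candidate completion and feed any error back to the model. (The helper name is mine, and it naively ignores parens inside strings and comments.)

```shell
# Naive paren-balance checker for generated Lisp; reads source from stdin.
check_parens() {
  awk '
    {
      for (i = 1; i <= length($0); i++) {
        c = substr($0, i, 1)
        if (c == "(") depth++
        else if (c == ")") depth--
        if (depth < 0) { print "unmatched ) at line " NR; bad = 1; exit 1 }
      }
    }
    END {
      if (bad) exit 1
      if (depth > 0) { print depth " unclosed ("; exit 1 }
      print "balanced"
    }'
}
```

Prompting the model to fix the reported line, rather than regenerating from scratch, seems like the natural way to use such feedback.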

That's a different question than the one you asked. Are you saying LLMs are generating invalid LISP due to paren mismatching?

That's what the comment I was originally replying to was saying.

If the LLM is intelligent, why can’t it figure out on its own that it needs to write a program?

The answer is self-evident.

Are they also measuring productivity? Measuring only token costs is like looking only at grocery spend but not the full receipt: you don’t know whether you fed your family for a week or for only a day.

I'm not one of those execs; I'm just echoing what I've heard from those I've talked to who manage these dashboards and worry about this. I do think measuring productivity is not very clear-cut, especially with these tools.

They do "attempt" to measure productivity. But they also just see large dollar amounts on AI costs and get wary.

My company is also wary of going all in with any one tool or company due to how quickly stuff changes. So far they've been trying to pool our costs across all tools together and give us an "honor system" limit we should try not to go above per month until we do commit to one suite of tools.


First you have to figure out HOW to measure productivity.

(Output / input), both of which are usually measured in money. If you can measure both of those things--and you have bigger problems if your finance department can't--it logically follows that you can measure productivity.

Measuring strictly in terms of money per unit time over a small enough timeframe is difficult because not all tasks directly result in immediately observed results.

There are tasks worked on at large enterprises that have 5+ year horizons, and those can't all immediately be tracked in terms of monetary gain that can be correlated with AI usage. We've barely even had AI as a daily tool used for development for a few years.

