Hacker News
E-Reverance 43 days ago | on: Reproducing DeepSeek's MHC: When Residual Connecti...
> Residual connections are more than a trick to help gradients flow. They’re a conservation law.
> Not a hack, not a trick. A principled constraint that makes the architecture work at scale.
jszymborski 42 days ago
OK, I thought I was reading too much into it, but those same sentences also jumped out at me.
roywiggins 42 days ago
Pangram thinks the whole thing was LLM-generated, fwiw. As dodgy as AI detectors are, it is probably among the best. I don't doubt the author started with their own text, but I think it's been substantially revised via ChatGPT.
DoctorOetker 42 days ago
yes this reads like classic intellectual fellicitatio