Author here: The newly introduced attention between features did make a big impact compared to the first variant of TabPFN. The old model treated feature 5 and feature 15 as if they were completely different things, but in tabular data features are typically more or less permutation invariant. So the logic is similar to why a CNN is better for images than an MLP.
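To make the permutation-invariance point concrete, here is a minimal NumPy sketch (not TabPFN's actual code) showing that self-attention applied across the feature axis is permutation-equivariant: shuffling the input features just shuffles the outputs the same way, whereas an MLP with position-specific weights would produce entirely different outputs.

```python
import numpy as np

def feature_attention(X):
    # X: (n_features, d) — one embedding vector per feature (hypothetical setup)
    scores = X @ X.T / np.sqrt(X.shape[1])          # pairwise feature similarities
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w = w / w.sum(axis=1, keepdims=True)            # row-wise softmax
    return w @ X                                    # each feature attends to all others

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
perm = rng.permutation(5)

out = feature_attention(X)
out_perm = feature_attention(X[perm])

# Permutation equivariance: permuting input features permutes outputs identically,
# so the model cannot care whether a column arrives as "feature 5" or "feature 15".
assert np.allclose(out[perm], out_perm)
```

An MLP flattening the features into one fixed-order vector has a separate weight for each position, so it has to relearn the same relationship for every column ordering — attention shares that computation, much like a CNN shares filters across image positions.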

