Post-training is the higher-ROI investment right now because base models hold more latent capability than current post-training methods have elicited from them. But post-training has a hard ceiling: it can only surface what pre-training put there.
Most frontier labs are converging on roughly the same capability ceiling, likely because they all pre-train on essentially the same internet. If that shared data is the bottleneck, the next real leap is a pre-training problem, not a post-training one; another RLHF variant won't get us there.