Scaling laws apply to a single model. The best single model right now is supposedly an 8x mixture of experts, so it's not even really a single model in the purist sense.
I still expect the final solution will be more along the lines of picking the best model(s) from a sea of possible models, switching them in and out as needed, and then automatically repeating that selection as things change.
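To make the "pick the best from a pool and swap as needed" idea concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the "models" are plain functions, the tasks are tiny lists of (input, target) pairs, and `score` and `select_best` are hypothetical helpers, not any real system's API.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins: a "model" is any function from float to float.
Model = Callable[[float], float]
Dataset = List[Tuple[float, float]]


def score(model: Model, data: Dataset) -> float:
    """Negative mean squared error, so a higher score is better."""
    return -sum((model(x) - y) ** 2 for x, y in data) / len(data)


def select_best(pool: Dict[str, Model], data: Dataset, k: int = 1) -> List[str]:
    """Return the names of the k best-scoring models in the pool for this data."""
    ranked = sorted(pool, key=lambda name: score(pool[name], data), reverse=True)
    return ranked[:k]


# A small "sea" of candidate models (illustrative only).
pool: Dict[str, Model] = {
    "double": lambda x: 2.0 * x,
    "square": lambda x: x * x,
    "identity": lambda x: x,
}

# Two toy tasks, each with its own evaluation data.
tasks: Dict[str, Dataset] = {
    "linear": [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
    "quadratic": [(1.0, 1.0), (2.0, 4.0), (3.0, 9.0)],
}

# Switch in whichever model scores best on each task.
active = {task: select_best(pool, data)[0] for task, data in tasks.items()}
print(active)
```

Rerunning the last step whenever the pool or the tasks change is the "automatic" part; a real version would obviously need a far better evaluation signal than three data points.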