How are they not SOTA? They're all very similar, with ChatGPT being the worst (for my use case anyway). It does things like adding lambdas and random C++ function calls into my Vulkan shaders.
Gemini 2.5 Pro is the most capable for my use case in PyTorch as well. Large context and much better instruction following for code edits make a big difference.