The observation that two words are anagrams if they contain all the same letters isn't a huge leap from two words that contain all the same letters are anagrams.
That being said: Sorting isn't required. (An O(kn) solution exists to beat your O(nk log k))
It's not entirely clear that the O(nk) solution actually wins because k is small in this case, and so log(k) is very small. Unless you do some careful bit packing, the O(nk) algorithm will use more memory, which could make it slower because of increased cache misses.
The O(nk) solution has fewer branches (O(1) versus O(log k)), and even if you use a naïve implementation, O(nk) will still fit into L1 (assuming n is bigger than k).
Not even close. The lexicon is 230k words, times 26 letters is nearly 6MB. The i7 has a 64kB L1 cache and a 246kB L2 cache.
I think the only way to resolve this would be to actually do the experiment. I wouldn't bet my life savings on the outcome either way, but I'll give you even odds at low stakes that sorting is faster.
Yeah, there are other solutions, such as allocating an array of 26 letters (ignoring accents) and using that to count, but sorting is easy and is generally fast enough.
That being said: Sorting isn't required. (An O(kn) solution exists to beat your O(nk log k))