Performers look similar to Lambda Networks in their characteristics, but the paper doesn't mention them (although Lambda Networks are used to model images). I wonder what the main differences between the ideas are.
Anyways, congratulations on beating Moore's law again!
A number of groups worked on this at the same time, and of course, because it is machine learning, everyone had to come up with their own name for it. Even though most of these are basically elementary linear algebra dressed up in fancy language and backed by lots of computationally expensive experiments.
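(For what it's worth, here is a rough, illustrative sketch of the linear algebra in question. It assumes a toy feature map phi rather than the exact Performer/Lambda construction, and it omits the softmax normalizer, which is handled with the same trick; the core idea is just re-associating a matrix product so the n x n attention matrix is never formed.)

    import numpy as np

    def phi(x):
        # Toy positive feature map, only to show the shape of the computation.
        # Performers instead approximate the softmax kernel with random features.
        return np.maximum(x, 0.0) + 1e-6

    n, d = 1024, 64
    Q, K, V = (np.random.randn(n, d) for _ in range(3))

    # Quadratic route: build the n x n matrix explicitly, like softmax attention.
    out_quadratic = (phi(Q) @ phi(K).T) @ V        # O(n^2 * d)

    # Linear route: re-associate the product; only d x d intermediates are formed.
    out_linear = phi(Q) @ (phi(K).T @ V)           # O(n * d^2)

    assert np.allclose(out_quadratic, out_linear)  # same result, linear cost in n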
You are right, but at the same time it's great that we finally seem to have an efficient unified model for language and image modelling. Having separate CNNs and RNNs was relatively hard to work with (especially the RNNs). It would be bad if we needed different hardware accelerator architectures for the two most important (and quite different) fields of AI. A unified architecture will also help with mixed tasks.