I'm kind of curious why Gflops is the chosen basis for asserting performance superiority. Most user workloads exercise integer and I/O performance much more heavily. IIRC, Linpack HPL evaluates CPU (not GPU) floating-point performance, so it's not a representative workload.
I typically run a suite of workloads (see https://sbc-reviews.jeffgeerling.com), but I like HPL as it's been a consistent relative performance metric for decades, especially for measuring efficiency while utilizing all available memory.
There's always a danger in focusing too much on one metric or benchmark, but I also enjoy comparing each system or cluster I build to the historic 'top500' list, to see what decade we're in for small clusters of mini computers.