It's not just N=24: those same participants each took 9 independent tests, which generates more data. The p-value should also be considered alongside N. The sampling variance of the test statistic goes up with a small N, but the test's significance thresholds already account for that, which somewhat offsets the limitation.
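To make the point about thresholds concrete, here is a rough simulation sketch. It assumes a plain one-sample t-test on normally distributed data, which is my own stand-in and not necessarily the test the study used; the function name and the sample sizes other than 24 are hypothetical. It estimates how large the test statistic has to be to reach p < 0.05 at different N, using only the Python standard library.

```python
import math
import random

def simulated_critical_value(n, sims=20000, alpha=0.05, seed=0):
    """Estimate the two-sided critical |t| for a one-sample t-test with
    n observations by simulating the statistic under the null (true mean 0).
    Assumption: normal data and a t-test, chosen only for illustration."""
    rng = random.Random(seed)
    abs_t = []
    for _ in range(sims):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        # Unbiased sample variance (divides by n - 1).
        var = sum((x - mean) ** 2 for x in sample) / (n - 1)
        abs_t.append(abs(mean / math.sqrt(var / n)))
    abs_t.sort()
    # Empirical (1 - alpha) quantile of |t| under the null.
    return abs_t[int((1 - alpha) * sims)]

# N=24 matches the study; 6 and 100 are arbitrary comparison points.
critical = {n: simulated_critical_value(n) for n in (6, 24, 100)}
for n, value in critical.items():
    print(f"N={n:>3}: |t| must exceed roughly {value:.2f} for p < 0.05")
```

The smaller the N, the higher the bar the statistic has to clear, so a result that is significant at N=24 has already paid a penalty for the noisier estimate. It does not make the estimate itself any less noisy, though.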
That said, a larger N and replication studies are still needed.