They don't have to measure close to identically, though, do they? What's important is that the differences are inaudible, which requires a high level of performance from each. That stops being true as soon as they aren't within their performance envelope - for example, if the input impedance isn't a good match for the output impedance of the source/preamp, or if they are unable to drive the attached loudspeaker due to e.g. insufficient power or gain.
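To put a rough number on the impedance point: the source's output impedance and the amp's input impedance form a voltage divider, so a poor ratio costs level (and can alter the frequency response if either impedance varies with frequency). A minimal Python sketch, treating both as pure resistances; the function name and the example values are my own illustrations, not figures for any device in the test.

```python
import math

def bridging_loss_db(z_out_ohms, z_in_ohms):
    """Level change in dB (negative = loss) from the source/input voltage divider.

    Assumes purely resistive impedances, which is a simplification.
    """
    return 20 * math.log10(z_in_ohms / (z_in_ohms + z_out_ohms))

# Hypothetical values, purely illustrative:
print(round(bridging_loss_db(100, 10_000), 2))    # -0.09 dB, low-impedance source
print(round(bridging_loss_db(1_000, 10_000), 2))  # -0.83 dB, higher-impedance source
```

With a 100:1 ratio the loss is negligible; at 10:1 it is getting on for a decibel, which is already enough to upset a level-matched comparison.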
In the case of the test amplifiers, the MA252 (a hybrid integrated amp of good performance) might struggle with the Magico S5 at its lowest impedance, and may have been driven as if it were a power amp rather than a preamp; the test description isn't clear.
The Purifi amp, if correctly assembled, should drive the speakers with no issues.
The Benchmark amp has to be set up correctly - it has switchable input impedance and gain levels - to drive the Magicos, given the expected output from the Topping in preamp mode. Even then, it may not cope.
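The gain question can be sanity-checked with simple arithmetic: the power available is set by the source voltage multiplied by the amp's voltage gain, squared and divided by the load impedance. A rough Python sketch, assuming a purely resistive load and ignoring clipping and current limits; the function names and all numbers are hypothetical, not the actual Topping output or Benchmark gain settings.

```python
import math

def output_power_w(source_vrms, gain_db, load_ohms):
    """Power into a resistive load if the amp simply applies its voltage gain."""
    v_out = source_vrms * 10 ** (gain_db / 20)
    return v_out ** 2 / load_ohms

def required_gain_db(target_power_w, load_ohms, source_vrms):
    """Voltage gain (dB) needed to reach a target power from a given source level."""
    v_needed = math.sqrt(target_power_w * load_ohms)
    return 20 * math.log10(v_needed / source_vrms)

# Hypothetical example: 2 V RMS source, 20 dB of gain, 4 ohm load
print(round(output_power_w(2.0, 20.0, 4.0), 1))    # 100.0 W
print(round(required_gain_db(100.0, 4.0, 2.0), 1)) # 20.0 dB
```

If the gain setting is too low for the source level, the amp never reaches its rated power however capable it is, which is why the setup matters as much as the amp itself.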
So, there are legitimate reasons why this test may have shown differences, apart from the level matching question. If you are indeed chasing a "no differences" result, it is a good idea to use easy-to-drive speakers and a traditional preamp with decent gain, to check that everything is actually set up to work together, and probably not to compare power amps with integrated ones. A more reasonable test volume level would help too!