@Pdxwayne and others, I redid the test with your test files, but this time I gradually EQ'd up the 1250Hz area to see how many dB I need to add to the Harman Curve at that point in order to reliably detect the 1250Hz tone, or at least the difference between the 2 files. I was using my HD600 EQ'd to the Harman Curve, plus a 1250Hz Q2 +7.5dB filter. This is the same thing as listening to a -52.5dB 1250Hz test tone mixed in with your 0dB 250Hz tone, whilst imagining you are testing it on a Harman EQ'd headphone.

My results are in the log below; I passed all of the trials. It's possible I could have reduced the +7.5dB filter at 1250Hz a bit more and still detected the tone, but not by much. I don't think I'd do much better than a -55dB 1250Hz test tone. It's true that I was able to notice the effect of raising & lowering the playback volume on how much the 1250Hz tone stood out, and I did that prior to the test to find the ballpark best playback volume.
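For anyone who wants to check the -52.5dB figure, here is a rough sketch of the arithmetic in Python. It assumes the 1250Hz component (the 5th harmonic of 250Hz) sits at -60dB in the test file, which is only inferred from the `h5_60` part of the filename, and that the Q2 peaking filter adds its full +7.5dB at its centre frequency. The function names are made up for illustration.

```python
def effective_level_db(tone_db: float, eq_gain_db: float) -> float:
    """Level of a tone after an EQ boost; gains in dB simply add."""
    return tone_db + eq_gain_db


def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)


# Assumption: -60 dB is inferred from "h5_60" in the filename, not stated outright.
H5_LEVEL_DB = -60.0
# EQ boost at 1250 Hz, as described in the post.
EQ_BOOST_DB = 7.5

level = effective_level_db(H5_LEVEL_DB, EQ_BOOST_DB)
ratio = db_to_amplitude_ratio(level)

print(f"Effective 1250 Hz level: {level} dB")
print(f"Amplitude relative to the 250 Hz tone: {ratio:.5f}")
```

On those assumptions the 1250Hz tone ends up at roughly 0.24% of the amplitude of the 250Hz tone.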
foo_abx 2.0.6d report
foobar2000 v1.6.7
2021-10-25 16:06:46
File A: 250hz_h2_to_h4_120_h5_60.wav
SHA1: b604c1acb2c2cdc3fd9bceba6aef1358de8d1a6e
File B: 250hz_tone.wav
SHA1: 9bedd7cc84145a54cc2979ec4bbf86d51f0f082a
Output:
Default : Primary Sound Driver
Crossfading: NO
16:06:46 : Test started.
16:12:56 : 01/01
16:13:29 : 02/02
16:13:47 : 03/03
16:14:07 : 04/04
16:14:25 : 05/05
16:14:47 : 06/06
16:15:09 : 07/07
16:15:25 : 08/08
16:15:25 : Test finished.
----------
Total: 8/8
p-value: 0.0039 (0.39%)
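The p-value in the log matches the chance of getting 8 out of 8 right by pure guessing. A minimal sketch of that calculation, assuming each trial is an independent 50/50 guess and a one-sided binomial test (the function name `abx_p_value` is hypothetical, not part of foo_abx):

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of scoring at least `correct` out of `trials` by guessing."""
    # Count the outcomes with `correct` or more hits, out of 2**trials equally likely outcomes.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


p = abx_p_value(8, 8)
print(f"p-value: {p:.4f} ({p * 100:.2f}%)")
```

For 8/8 this is simply 1/256, which rounds to the 0.0039 shown in the report.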