1 dB is about what most of us can detect as a change in volume (the Just Noticeable Difference, or JND, per a reference in my old grad school acoustics text), so 0.1 dB to 0.5 dB is the commonly accepted threshold for level matching in DBT/ABX testing. That is, 0.1 dB is about 1/10th of the JND and provides enough margin to ensure virtually all subjects will not detect the difference in amplitude. Somewhere above that level, most listeners will simply choose the louder source. It is one of many things that makes a proper audio DBT tough.
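For a sense of scale, here is a quick conversion of those dB figures into voltage (amplitude) ratios; it is just the standard 20*log10 voltage relationship sketched in plain Python, nothing from any particular test:

```python
import math

# Voltage (amplitude) ratio corresponding to a level change in dB
def db_to_ratio(db):
    return 10 ** (db / 20)

print(db_to_ratio(1.0))   # ~1.122  -> a 1 dB JND is roughly a 12% amplitude change
print(db_to_ratio(0.1))   # ~1.0116 -> a 0.1 dB match is roughly a 1% amplitude change
```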
As for the measurement, remember you are not trying to set absolute levels, but relative levels between the two sources. Great accuracy is not really required, just the ability to make two measurements back-to-back with about 1% precision. Play a 1 kHz tone (or pink noise; there are debates about which is better) and switch between sources while measuring the output. You are not trying to match each to some absolute standard, just to match the difference between them to about 0.1 dB (about a 1% change in voltage, so not really all that demanding compared to what our instruments can do). I usually used one of those big old HP RMS voltmeters (e.g., the 3400, pictured below). Nothing in the system is changing but the DAC, so matching levels is not a huge task.
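Here is a minimal sketch of that check in plain Python, assuming two hypothetical back-to-back voltmeter readings (the values are made up for illustration):

```python
import math

# Hypothetical RMS readings (volts) taken back-to-back with the same
# 1 kHz tone playing, one reading per source.
v_source_a = 2.000
v_source_b = 1.985

# Relative level difference in dB between the two sources
diff_db = 20 * math.log10(v_source_b / v_source_a)

print(f"Difference: {diff_db:+.3f} dB")
print("Matched to within 0.1 dB:", abs(diff_db) <= 0.1)
```

The point is that only the ratio between the two readings matters; the absolute calibration of the meter drops out.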
I have not set up a DBT in years but did a number of them a long time ago (and far, far away -- but in this galaxy). Frankly I learned a lot about what not to do... It is hard to set up and run a good test. We usually did a variety with different listeners, different test times, etc. Some were short snippets, some many minutes to a complete song. The tests were to detect differences, not how good or bad something sounded.
The meter:
These days there are many more options, but I still like the big analog meters; wish I had kept mine!