Here, the situation is far, far better than with SINAD. Sean Olive's work is based entirely on double-blind, controlled listening tests, which ensures the score correlates with listening test results. This is not some simple 50-year-old measurement like SINAD. Here are some excerpts from the paper:
A Multiple Regression Model for Predicting Loudspeaker Preference Using Objective Measurements: Part II - Development of the Model
Sean E. Olive, AES Fellow
[Attachment 45625: excerpt from the paper]
A correlation of 1.0 would mean perfection, i.e. the model predicting listening test results with 100% accuracy. Here is the actual graph of predicted versus measured results:
[Attachment 45626: graph of predicted vs. actual preference ratings]
As you can see, the experimental results closely hug the linear prediction.
And how the tests were conducted:
[Attachment 45628: description of how the listening tests were conducted]
If this kind of scoring is not good enough for you all, I don't know what is. This research is a gift. I highly suggest reading it in detail before scoffing at it.
And it is not like we can go without a score. I just measured another speaker that I will post soon. I am sitting here seeing some anomalies in the measurements, but I have no way of characterizing them on a relative scale against what I have measured before.
In the end, the scale may just be good for showing the best and the worst. This is what SINAD does, and that is a great service and outcome. I don't care if someone argues over whether a speaker deserves a score of 6 or a 7. I care about clearly identifying the dogs and the heroes.
As with speaker testing itself, there are many reasons not to do something. Get on board to solve this problem. The consumer needs a scale. It doesn't have to be perfect; it is not as if consumers have any scale whatsoever to use right now. A compass is not as good as GPS, but it can still tell you more or less which way to walk if you are lost.
Agreed. Sean Olive's predicted preference ratings from this paper, based on anechoic measurements, have a 0.86 correlation (1 being perfect) with the actual preference ratings of 70 speakers covering a wide range of types, sizes, prices and brands. Honestly, that is amazingly accurate, and it is built on decades of confirmed science in acoustics and psychoacoustics by leaders in the field, as well as solid statistics (the importance of which should not be underestimated for science). I don't think we're going to find anything better at the moment without more actual scientific research being done. Consider that Consumer Reports ranked speakers for over 30 years using a model that produced ratings with a correlation of -0.22 with actual preference ratings in blind tests, as Sean Olive also showed in this paper. So the better a speaker sounded, the lower they ranked it. ASR using Olive's model would be quite the improvement, and a fantastic resource for consumers. (Other metrics like THD can be measured and presented, but I don't believe they should be incorporated into Olive's rating ad hoc, as research shows distortion plays a very small role in subjective preference for speakers, as Olive states in his paper.)
That being said, it has to be carefully considered that Olive's model was optimised to give higher predicted preference ratings to speakers with deeper bass extension, as these earn the highest actual preference ratings. This is evidenced by the LFX (Low-Frequency Extension) variable in Olive's formula carrying a major weighting (close to a third) of the total preference rating. Full-range tower speakers and (even more so) speakers used with subwoofers tend to have the deepest bass extension. This leads to a problem when ranking e.g. bookshelf speakers or small monitors on their own, as these are often designed to work best with a subwoofer and will be considered by users who often already have a sub. Olive's formula will give skewed ratings for such speakers that are not full-range but will be used with a sub, ratings that are not representative of the actual preference they would receive in a 2.1 set-up.
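As a quick sanity check on that weighting claim, here is a naive comparison of the coefficient magnitudes in the formula. The coefficient values are the ones commonly reproduced from Olive's Part II paper (an assumption on my part; verify against the paper), and note this simple ratio ignores the typical range of each variable:

```
# naive share of the LFX coefficient among the four coefficient magnitudes
# (values as commonly reproduced from Olive's Part II paper -- assumed)
coeffs = {"NBD_ON": 2.49, "NBD_PIR": 2.73, "LFX": 4.75, "SM_PIR": 2.32}
print(coeffs["LFX"] / sum(coeffs.values()))  # ~0.39, i.e. roughly a third
```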
I suggest then having two ratings, one for each of the two major set-ups: speakers used on their own, and speakers used with a subwoofer. The first set-up can be ranked using Olive's original formula as is. The second set-up can be ranked with a maximum potential preference rating: the preference rating the speaker would score if it were used with an ideal subwoofer. From Olive's formula, an ideal speaker with the maximum preference rating would need to have NBD_ON = NBD_PIR = 0 and SM_PIR = 1, i.e. zero narrow-band deviations in the on-axis and predicted in-room responses, and perfect smoothness in the latter. As this fixes all variables but LFX, we can calculate the LFX value that would produce a maximum score of 10 (thanks to @MZKM for pointing out the max score is likely 10). Rearranging the formula for LFX with all other terms at their ideal values, this works out at LFX(ideal) = 1.16. From Olive's definition of LFX:
LFX is the log10 of the first frequency x_SP below 300 Hz in the sound power curve, that is -6 dB relative to the mean level y_LW measured in listening window (LW) between 300 Hz-10 kHz.
So LFX(ideal) = log10(x_SP(ideal)), and therefore:
x_SP(ideal) = 10^LFX(ideal) = 10^1.16 ≈ 14.5 Hz
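As an aside, here is a rough Python sketch of how LFX could be computed from a speaker's measured curves, following the definition quoted above. The function and the toy rolloff data are my own, and reading "the first frequency below 300 Hz" as a downward scan from 300 Hz is my interpretation, so treat this as a sketch rather than Olive's exact procedure:

```
import numpy as np

def lfx(freqs, sound_power_db, listening_window_db):
    """LFX per the definition above: log10 of the first frequency below
    300 Hz (scanning downward) where the sound power curve is -6 dB
    relative to the mean listening-window level in 300 Hz-10 kHz."""
    band = (freqs >= 300) & (freqs <= 10_000)
    y_lw = listening_window_db[band].mean()      # mean LW level, 300 Hz-10 kHz
    hits = np.where((freqs < 300) & (sound_power_db <= y_lw - 6.0))[0]
    if hits.size == 0:                           # no -6 dB point below 300 Hz
        return np.log10(300.0)
    x_sp = freqs[hits.max()]                     # first such frequency from above
    return np.log10(x_sp)

# toy demo: a speaker flat through the band, rolling off 24 dB/oct below 50 Hz
f = np.logspace(np.log10(20), np.log10(20_000), 500)
lw = np.zeros_like(f)
sp = np.where(f >= 50, 0.0, 24 * np.log2(f / 50))
print(round(10 ** lfx(f, sp, lw), 1), "Hz")      # -6 dB point, ~42 Hz
```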
That 14.5 Hz figure seems bang-on for an ideal subwoofer. A -6 dB point that low means it should easily reach 0 dB relative to the mean listening window amplitude by 20 Hz, the lower limit of audibility. Now that we have the ideal subwoofer extension that would give the maximum score of 10 for Olive's preference rating (when paired with ideal speakers), we can plug this lower extension of 14.5 Hz (and so an LFX value of 1.16) into the formula and combine it with the actual measured narrow-band deviations in the on-axis and predicted in-room responses, and the smoothness of the latter, of any measured speaker. This gives an accurate potential preference rating for when the speaker is used with an ideal sub, in order to rank speakers for the large group of people who use 2.1 systems for listening to music (or even home theatre). This can be presented alongside the original preference rating using the actual measured LFX of the speaker, for people who do not use and don't intend to get a subwoofer.
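To make the two-ratings idea concrete, here is a minimal Python sketch of how they could sit side by side. Again, the coefficients are the values commonly reproduced from Olive's Part II paper (an assumption; verify against the paper before relying on this), and the bookshelf-speaker numbers in the example are invented purely for illustration:

```
import math

# coefficients as commonly reproduced from Olive's Part II paper (assumed)
B0, B_NBD_ON, B_NBD_PIR, B_LFX, B_SM = 12.69, 2.49, 2.73, 4.75, 2.32

def preference(nbd_on, nbd_pir, lfx, sm_pir):
    """Olive's predicted preference rating for a speaker on its own."""
    return (B0 - B_NBD_ON * nbd_on - B_NBD_PIR * nbd_pir
            - B_LFX * lfx + B_SM * sm_pir)

def preference_with_sub(nbd_on, nbd_pir, sm_pir, sub_extension_hz=14.5):
    """Potential rating per the proposal above: keep the speaker's measured
    mid/high-band metrics but substitute the ideal subwoofer's extension."""
    return preference(nbd_on, nbd_pir, math.log10(sub_extension_hz), sm_pir)

# invented example: a bookshelf speaker with a -6 dB point at 55 Hz
print(preference(0.35, 0.30, math.log10(55), 0.85))   # on its own: ~4.7
print(preference_with_sub(0.35, 0.30, 0.85))          # with ideal sub: ~7.5
```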