There seems to be some repeated confusion in this thread over different 'ratings'. To be clear, there are three main categories of ratings (in order of accuracy and usefulness for the average user):
1. The averaged preference ratings given by a large group of actual listeners in controlled, double-blind (bias-free) listening tests such as those conducted by Harman and their leading acoustic scientists in the field i.e. Sean Olive et al.
2. The predicted preference ratings calculable using Sean's formula, based on either:
a) Harman's own measurements, made with the specific equipment and methodology the formula was derived from, and so the ones it will be most accurate with.
b) Professional third-party measurements using similar equipment (i.e. only differing in the artificial pinnae used, as Harman's is custom made), and methodology (i.e. averaging left/right channels and multiple re-seats to mitigate positional variation), such as Oratory's.
c) Third-party measurements with further differences in equipment (e.g. using a different coupler), and methodology (e.g. only a single measurement with no averaging).
3. Uncontrolled, sighted (bias-prone), non-level matched subjective ratings from single users.
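For anyone wondering what a type 2 calculation actually involves, here is a minimal sketch. It assumes the over-ear/on-ear model as it is commonly quoted from the 2018 paper: 114.49 − 12.62 × SD − 15.52 × AS, where SD is the standard deviation of the error response (measurement minus target) and AS is the absolute slope of a straight-line fit to that error against log frequency, both taken over roughly 50 Hz to 10 kHz. The coefficients, the band limits, the use of natural log and the population standard deviation are all my assumptions here, not something established in this thread, so check them against the paper before relying on the numbers. The function name and parameters are made up for illustration.

```python
import math
import statistics


def predicted_preference(freqs_hz, error_db, f_lo=50.0, f_hi=10_000.0):
    """Sketch of a type 2 (predicted) preference rating.

    freqs_hz : frequencies of the error response, in Hz
    error_db : measured response minus target at those frequencies, in dB
    Coefficients and band limits are ASSUMED (as commonly quoted), not verified.
    """
    # Keep only points inside the assumed evaluation band.
    pts = [(math.log(f), e) for f, e in zip(freqs_hz, error_db) if f_lo <= f <= f_hi]
    if len(pts) < 2:
        raise ValueError("need at least two points between f_lo and f_hi")

    xs = [x for x, _ in pts]
    es = [e for _, e in pts]

    # SD: spread of the error curve (population standard deviation assumed).
    sd = statistics.pstdev(es)

    # AS: absolute slope of a least-squares line fit to error vs ln(frequency).
    mean_x = sum(xs) / len(xs)
    mean_e = sum(es) / len(es)
    slope = sum((x - mean_x) * (e - mean_e) for x, e in pts) / sum(
        (x - mean_x) ** 2 for x in xs
    )

    return 114.49 - 12.62 * sd - 15.52 * abs(slope)


# Hypothetical example: an error curve that tilts upward with frequency.
freqs = [50, 100, 200, 500, 1000, 2000, 5000, 10000]
tilted = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
print(round(predicted_preference(freqs, tilted), 1))
```

Note that a perfectly flat error curve simply returns the intercept, 114.49, so the output of this sketch is not capped at 100. Only the measurement fed in changes between types 2a, 2b and 2c; the calculation itself is the same.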
Up until now, in this thread I have been specifically talking about ratings of type 1, and indirectly their tension with some people's individual type 3 ratings / impressions of the DT990. Differences between type 2 and 1 (or even within type 2a, b and c) ratings are not the issue here, and just act to obfuscate the matter.
Surprisingly (especially for a science forum), it seems a lot of people here are giving primacy to others' type 3 ratings over scientifically controlled type 1 ratings. Let me be clear: I do not even take type 1 ratings as gospel, let alone the less accurate type 2. But the fact is, type 1 ratings only exist for the limited number of headphones Harman tested, so in lieu of type 1 (and type 2a), type 2b ratings are the next best predictors of headphone sound quality we currently have, and they now cover a huge number of models thanks to people like Oratory.
Back to the DT990: Sean Olive's scientifically conducted study showed that when biases are removed via blind listening, actual people rated (type 1) its frequency response 91/100 on average, as I posted previously, from this presentation of Sean's.
An average listener preference rating this high corresponds to 'excellent' under Harman's definition (from the 2018 paper):
To graphically explore the broad relationship between the frequency response of the headphones (left/right channels are averaged) and their subjective preference rating we plotted the average frequency response of headphones that fell into four distinctive categories of sound quality based on their preference rating: Excellent (90-100% preference rating), ...
(Note: to be clear, the DT990's rating of 91 and the others given in the presentation linked above are not predicted, type 2a ratings from the formula, as some have mistaken them for - they are real, type 1 listener ratings. How do we know this? Just take a look at headphones 6 and 30 for example - their ratings of 39 and 28 respectively on slide 35 clearly match their position on the 'preference rating' axis in the graph on slide 29, and not their higher 'predicted preference rating' values.)
So what could explain the difference between some users' type 3 impressions of the DT990 and its 'excellent' type 1 rating? Firstly, the evidence is compelling that HP26 from the study is indeed the DT990. Any claim otherwise requires equal or better counter-evidence, which no-one has provided. The same goes for insinuations of an error in Sean's paper, which have no foundation. In addition to the ubiquitous, inescapable potential subconscious biases that apply to every single listener's type 3 ratings / impressions ('trained' or otherwise), I provided three other potential influencing factors in my original post on this issue:
(i) Unit variation
(ii) Some users' actual pinnae response not being adequately represented by Harman's custom pinnae
(iii) Some individuals listening at higher volumes than the 85 dB average level used in Harman's blind tests
The similarity between the measurements here and Oratory's would suggest (i) is unlikely. As I said in my previous post, Todd Welti's measurements of headphones using his custom pinna, compared to in-ear mic measurements of real people, show very low error, so it seems (ii) is also unlikely (although those measurements were only really valid up until ~500 Hz, so there may still be differences in treble response). (iii) still seems like a strong possibility, as it is evident that some people listen at much higher levels than others.
However, looking again at the differences in error response (from the Harman target) between Oratory's measurements of the DT990 with new and old pads, which are likely due to Beyerdynamic pads compressing quickly over time compared to other brands, has made me think there might be another factor at play here.
DT990 error response (new pads):
DT990 error response (old pads):
I suspect that old DT990 pads may change the headphone's frequency response in much the same way as new pads worn on a larger / wider head. In both situations the pads are more compressed than otherwise, which moves the driver closer to the ear and likely decreases the pads' permeability, so increasing the seal they create. That would result in the increased bass response and extension seen with the old pads above. In Welti's paper he says the left/right flat plate measurements he took "were separated by 15.5 cm to match the breadth of a typical human head." He doesn't say so in the paper, but I would think the GRAS 45CA would be of similar width. So someone with a head width greater than this 'typical' 15.5 cm may experience the DT990 with new pads as closer to the measured sound profile of the headphone with old pads as seen above, due to higher effective clamping force compressing the pads, and therefore a greater seal and relatively more bass (which can also be perceived as reduced treble). Conversely, someone with a head width less than 15.5 cm may experience even less bass level / extension (and so relatively greater treble) due to lower clamping force and so less pad compression.
One caveat to this is Rtings' frequency response consistency measurements for the DT990, which show low variation in bass across the 5 subjects they tested with in-ear mics. However, I have my reservations about their (seemingly somewhat arbitrary) level-matching and 'crossfading' (as they put it) of the in-ear mic and HATS portions of the measured frequency response, which could mean the DT990's higher inconsistency in the treble range may in fact translate to inconsistency in the bass with different level-matching / 'crossfading' choices.
TL;DR: bigger head / lower volume => DT990 good; smaller head / higher volume => DT990 bad (?)