
Limitations of blind testing procedures

Discussion in 'Psychoacoustics: Science of How We Hear' started by oivavoi, Jan 13, 2017.

Thread Status:
Not open for further replies.
  1. oivavoi

    oivavoi Addicted to Fun and Learning

    Joined:
    Jan 12, 2017
    Messages:
    505
    Likes Received:
    135
    Thanks. Yep, I find A/B tests much easier to do.

    But for my personal use I actually don't see much point in doing blind testing. I'm a measurements guy: I buy the gear which conforms most closely to the objective ideal of high fidelity. For me, at the moment, that implies phase and time coherent loudspeakers with a relatively flat frequency response and low distortion (which necessitates active crossovers), and a good polar/power response so that the reverberant field in the room becomes tonally correct. With electronics I go for affordable and well-designed no-nonsense products with low distortion, all using balanced connections. So far, it's just about achieving high fidelity in an objective sense. The only place where subjectivity comes into play for me is concerning speaker directivity, and equalizing the system at the end according to what subjectively sounds good to me.

    The only place where I can see myself using blind testing in the future, is if I should come across something that sounds "strange" or unnatural, even though it measures well. Then I might listen blind to see what it's really about.
     
    Last edited: Apr 18, 2017
    Sal1950, BE718, Thomas savage and 2 others like this.
  2. Jinjuku

    Jinjuku Senior Member

    Joined:
    Feb 28, 2016
    Messages:
    429
    Likes Received:
    105
    I've gotten no response because no one will participate

    You are free to substitute opinion with, you know, actual data.
     
  3. SoundAndMotion

    SoundAndMotion Member

    Joined:
    Mar 23, 2016
    Messages:
    72
    Likes Received:
    10
    Location:
    Germany
    By the way, before you try to send me cables, or deride me as a "believer", I should tell you I don't believe in cable burn-in, but I do believe in good scientific methods.
    And I doubt anyone ever will. Why should they? I doubt it is one of their goals to jump through your hoops.

    Note: there are flaws in your plan. I’m working from your description in post #180 in this thread. Is that complete? Before describing the flaws, let me know where the most complete description is.

    My reason/opinion on why it is an inappropriate example: the McGurk effect is an effect of multisensory integration. The brain wants info about different sensory modalities to agree with each other about what is happening around us. When there is a mismatch, the brain uses tricks to try to create a uniform percept, or it regards the mismatch as occurring from different processes. When our visual system (eyes) tells us a sound source “ought to” sound like “ga” (from mouth shape and motion, etc.), but the ears hear “ba”, the brain does its best to combine and you perceive “da”.

    “Blind listening” isn't about the eyes. It’s about knowledge. If you wear a blindfold and I tell you “this is cable A” and ask you which cable it is, it is not a blind test. It does not involve multisensory integration.

    You know, actual data on what Jakob mentioned:
    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4958963/
    http://journal.frontiersin.org/article/10.3389/fpsyg.2014.00407/full
    http://link.springer.com/article/10.3758/s13423-015-0817-4
    https://hal.archives-ouvertes.fr/hal-00941306/
     
    oivavoi likes this.
  4. Jinjuku

    Jinjuku Senior Member

    Joined:
    Feb 28, 2016
    Messages:
    429
    Likes Received:
    105
    I'm not going to send cabling to anyone that doesn't want them. Although if I was secure in my beliefs and offered some cash I wouldn't hesitate to make an easy buck.

    I have offered monetary compensation. Never asked anyone to jump through hoops without some carrot.


    Most methods have flaws and I'm intellectually honest enough to admit that and welcome constructive critique. Post 180 stands on its own merits. I would 'burn in' the cabling with music.


    I didn't say it's a perfect example. It's a quick example that does deliver the point that you can't always trust the ear/eye/brain interaction. It is appropriate in the context that people are using multisensory integration to evaluate an audio only application.

    While a cursory read of your papers does show that people can be trained (albeit not to 100%) against the effect, closing your eyes (blinding) certainly pulls into focus what is actually heard without all the, as you called it, 'hoops' to jump through.
     
    Last edited: Apr 18, 2017
  5. hvbias

    hvbias Active Member

    Joined:
    Apr 28, 2016
    Messages:
    194
    Likes Received:
    51
    Location:
    moderator editing/deleting posts- I'm out
    Very well said and I agree with everything in your post.

    I will add that I find blind listening tests helpful where objective design goals are tested on a statistically significant group of people, with hopefully some trained listeners present. Something like Harman or Philips Golden Ear Training is worth going through rigorously. For instance one area that personally helped me out in narrowing down what to look for in speakers is Toole's research/listening tests on controlled directivity. Using purely the measurement method, an objectively well designed speaker with flat on axis response and ignoring polars would lead (in my opinion/for my criteria) to a suboptimal purchasing decision. That is the best example of an additional objective design criterion that comes to my mind, I am sure there are others floating around upstairs :D I see you did mention directivity in your post, but to many speaker designers, flat on axis is still considered "objectively good enough".
     
    BE718 and oivavoi like this.
  6. SoundAndMotion

    SoundAndMotion Member

    Joined:
    Mar 23, 2016
    Messages:
    72
    Likes Received:
    10
    Location:
    Germany
    LOL. :-D
    I realize you wouldn’t just send them out… I wanted to nip in the bud anyone telling me to just take you up on your offer… thinking I “hear” burn-in.
    Oh sorry, I didn’t realize this. How much? (Curious, not wanting to do it.)

    It’s good you know that flawless or perfect are not usually possible. But that belies your confidence when you state (multiple times) that no one has taken you up on your offer and you state (multiple times) that no one has poked (or polked, as you joke) holes in your method.

    I notice 3 holes that have varying degrees of difficulty to repair. First, although you just mentioned using music, the rest of your burn-in protocol is unspecified. You also don’t mention how you’ll confirm that cables are “virgin” and weren’t out for a 30 day test with another customer. This is easy to fix. You state that you will communicate about and agree on these issues with the test subject beforehand. Piece of cake.

    Second, trust. You are openly hostile to believers, and I wouldn’t doubt that many wouldn’t trust you to burn-in correctly, certify virginity correctly, and even really send 2 types (rather than trick them and say ha-ha afterward). There may be other, easier solutions, but I think you’d have to pair up with someone the believers trust, work together and watch each other.

    Third, and this has been mentioned before, how will you analyze? Are you really expecting to send out one burned-in set and one virgin set and have the whole thing ride on one answer? You were asked what you’d do if the answer was right and you said “it is what it is”. Really?!?

    You realize that even if you can easily hear the difference between say 2 …. speakers, for example, and I put them in acoustically transparent enclosures, and asked you to identify them, you may well get a few wrong answers (mistake, focus, attention, distraction… etc.). So the whole thing riding on one answer is way too risky for me. If, instead, you offered to send 20 burned-in/virgin sets, that could work. But would you really do that? And ensure the agreed upon burn-in and virginity protocol? Well that could work… it would take a l—-o-—n—-g time, but it’d work. Are you willing? If not this, or something similar, you seem to lack sincerity.
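    The one-answer versus 20-sets point above can be put in numbers. What follows is only a rough sketch, not anything proposed in this thread: it assumes each trial is an independent two-way forced choice with a 50% chance of guessing right, and it assumes a conventional 5% significance cutoff. The function names are made up for illustration.

```python
from math import comb


def guess_p_value(correct: int, trials: int, chance: float = 0.5) -> float:
    """Probability of getting at least `correct` right answers out of
    `trials` when each answer is right with probability `chance`
    (an upper-tail binomial sum). With chance=0.5 this is pure guessing."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


def min_correct_needed(trials: int, alpha: float = 0.05, chance: float = 0.5):
    """Smallest number of right answers whose guessing probability is at
    or below `alpha`; None if even a perfect score cannot reach it."""
    for k in range(trials + 1):
        if guess_p_value(k, trials, chance) <= alpha:
            return k
    return None


# A single burned-in/virgin pair: a right answer is a coin flip.
print(guess_p_value(1, 1))      # 0.5
print(min_correct_needed(1))    # None: one trial can never reach 5%

# Twenty pairs: how many right answers are needed to rule out guessing?
print(min_correct_needed(20))   # 15
```

    The same function also covers the "few wrong answers" worry: calling guess_p_value(15, 20, chance=0.9) gives the probability that a listener who really is right 90% of the time (an assumed figure) still reaches 15 of 20.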

    No, that is the point. “Sighted” testing is not multisensory integration. It is a cognitive bias. That is why Jakob said it was not the appropriate example and I agree.

    Not quite, but close. Closing your eyes removes the conflict. It doesn’t necessarily change focus, so you can pull in.
     
  7. Jinjuku

    Jinjuku Senior Member

    Joined:
    Feb 28, 2016
    Messages:
    429
    Likes Received:
    105
    My initial offer has always been $100 to charity of the claimant's choosing.

    No one to date taking up the offer is simply a hard data point even after back and forth about weaknesses and answering questions (such as offering a web cam on the burn in apparatus that could be checked on in real-time during the burn in process).


    Correct. Input from the claimant is always welcomed as it has to be since I am testing claims. If the claim is 10 hours of burn in using FR sweep then so it is. If it's 100 hours using music playback @ such and such RMS then so it is.

    I don't think you are being fair w.r.t. 'hostile'. I'm openly critical of people that don't apply critical thinking to what they are actually saying, that will not consider the view point that burned in cabling is meaningless in the context of use = burn-in in the course of listening.

    Yes really. It's data I'm after. Remember it could be 1 out of 1 answers or 1 out of 50.

    You could be entirely correct. Can I pick the speakers and can I make the claim? Do I get control of the volume knob? The material ? Then you could suggest your above method. I'm not asking for anything I haven't offered.


    I totally agree with your above assessment and have never been contrary. We are thinking along like lines.


    It removes a variable.
     
    Last edited: Apr 18, 2017
  8. oivavoi

    oivavoi Addicted to Fun and Learning

    Joined:
    Jan 12, 2017
    Messages:
    505
    Likes Received:
    135
    Interesting response. I would very much have liked to undergo some golden ear training... I have no illusion of having golden ears as of now!

    Concerning measurements and polar response: I would say that a loudspeaker that measures well on-axis but badly off-axis doesn't measure well - objectively. Do we need blind tests for getting to that conclusion? After all, much of the sound that reaches our ears in a typical room is reverberant sound. If this sound is very different from the direct sound, then the end result is objectively worse than in a case where the reverberant sound has a similar tonal quality to the direct sound. To me, this just seems logical :)
     
    hvbias likes this.
  9. amirm

    amirm Founder/Admin CFO (Chief Fun Officer) Staff Member

    Joined:
    Feb 13, 2016
    Messages:
    8,642
    Likes Received:
    1,214
    Location:
    Seattle Area
    Help in what regard? If it is to find out if there really is a difference, then we should follow the research (and my own personal experience) showing that near-instant switchovers are infinitely more reliable in finding such differences than any long term listening. I have passed many critical listening tests this way and would have no prayer of doing so with longer term listening.

    If our goal is to teach the person a lesson, then sure, we let them violate the above as much as they want. By doing so, they help our cause of embarrassing them. It doesn't help us figure out if there is any truth to their observations.
     
  10. oivavoi

    oivavoi Addicted to Fun and Learning

    Joined:
    Jan 12, 2017
    Messages:
    505
    Likes Received:
    135
    By the way, Amir: What is your take on modern well-designed dacs? Have you been able to tell differences between them?
     
  11. hvbias

    hvbias Active Member

    Joined:
    Apr 28, 2016
    Messages:
    194
    Likes Received:
    51
    Location:
    moderator editing/deleting posts- I'm out
    I have to assume Philips were being facetious when they named it that ;)

    WRT the off axis measurements, were any manufacturers designing speakers where this was a priority prior to Toole/Harman's blind listening tests? I could swear they started popping up after that, though this could be some sort of bias on my part of only noticing them (or more of them) after reading his research. I've corresponded with one known British monitor company that think flat on axis and low distortion are the most important aspects and I could read between the lines that they placed little importance on off axis.
     
    Last edited: Apr 19, 2017
    oivavoi likes this.
  12. oivavoi

    oivavoi Addicted to Fun and Learning

    Joined:
    Jan 12, 2017
    Messages:
    505
    Likes Received:
    135
    That wouldn't happen to be AVI, would it? ;) Sounds like them. Those are the monitors I happen to have at the moment (DM10). In the near-field, they are the best and most natural sounding speakers I've ever heard. Honestly. Their claim is that waveguides etc color the sound in a very slight way, and I think they might have a point. If so, you can have a trade-off between off-axis behavior and distortion.

    But in the far-field, I've heard speakers which behave better than the DM10s. And there's no doubt that a speaker with the DM10 sound and better off-axis behavior would be an even better speaker! And that's the kind of speaker I'm hoping to find.
     
    Last edited: Apr 19, 2017
  13. Blumlein 88

    Blumlein 88 Major Contributor

    Joined:
    Feb 23, 2016
    Messages:
    2,087
    Likes Received:
    550

    I don't know. I tried out the version that used to be online. Got to silver level pretty easily. Gold took a bit of doing. I think some of the Harman training is still online.

    Now I hardly think I am ready to be a tonmeister.
     
  14. Cosmik

    Cosmik Major Contributor

    Joined:
    Apr 24, 2016
    Messages:
    1,096
    Likes Received:
    310
    Location:
    UK
    Just found this from the 1970s. KEF certainly seem to have been thinking about the importance of off-axis sound with the 105.

    upload_2017-4-19_8-18-31.png
     
    BE718 and oivavoi like this.
  15. oivavoi

    oivavoi Addicted to Fun and Learning

    Joined:
    Jan 12, 2017
    Messages:
    505
    Likes Received:
    135
    Very interesting. Thanks. Reading this manual, I'm again amazed how much hifi companies got right back in the old days, and yet somehow these things got lost in the 80s and 90s... With this design, KEF got so many things just right, IMO. The focus on even dispersion with frequency (but without any horn/waveguide which may color the sound), the insights into the importance of both the direct sound and the reverberant soundfield, separate enclosures for the different drivers (reduces distortion and resonances), the pyramid shape which has several advantages, the focus on adequate amplifier power to reduce clipping (since dynamic peaks require much more power than commonly assumed), etc.

    I actually saw a Model 105 on the second hand market in Norway recently for about 1000 euro, and was very tempted to buy it and see if I could activate it using a cheap minidsp unit. I suspect that the crossover network is quite complex in this one, though.
     
    Cosmik likes this.
  16. Jakob1863

    Jakob1863 Active Member

    Joined:
    Jul 21, 2016
    Messages:
    282
    Likes Received:
    60
    Location:
    Germany
    Nothing wrong with that. :)

    Controlled listening tests for personal use can be fruitful in helping to get further insight into your own perception. They also work as an additional guard against fooling yourself, but it is obviously quite as easy to get incorrect results via "DBTs" as it is with "sighted listening".

    There is so much to learn about the quality of reproduction chains and none of us knows right from the beginning which level of quality is achievable with a certain record and what reproduction system will get the most of it (meeting our personal preferences) under the usual constraints.
    Some effects are easier to access within tests while others are more difficult (emotional impact for example) to grasp. We have to learn to do evaluational listening and to cover the most important points in quite short times, which also means to extrapolate from short impressions to long term effects that might occur.

    PS not to forget that conclusions drawn usually rely on estimates of the underlying population distribution parameters. Chances are quite high that your personal preferences differ from the mean so you have imo to listen for yourself......
     
    Last edited: Apr 19, 2017
    oivavoi likes this.
  17. Jakob1863

    Jakob1863 Active Member

    Joined:
    Jul 21, 2016
    Messages:
    282
    Likes Received:
    60
    Location:
    Germany
    Help with regard to finding out whether there is evidence in support of the claim, which was based on listening under conditions including long(er) switchover time spans.

    As said before, it depends on the question/hypothesis under examination. And as we know, the tighter the control, the less the practical relevance of any result. (There is a reciprocal relationship between level of control and, so to speak, everyday relevance.)

    If we want to teach a lesson then imo we shouldn't be interested in embarrassing anyone, but in helping to get good results. It is my personal experience (and others report similar observations) that most likely listeners will not do very well in their first test(s) under controlled conditions, if the EUT depends on a multidimensional perceptual impression.

    Although having read literally hundreds of papers on sensory/auditory memory (and ASA as well), I have yet to find a model approach that covers all the different aspects in a convincing manner.
    So, it depends.....some are advocating to use short samples (<= 5s to avoid hopefully any influence of information in categorical storage), others promote samples of intermediate length (15-20s as in the ITU-R BS.1116-x, which is hard to argue under several model approaches mainly relying on a FIFO process), while others point out that sometimes longer samples are needed to ensure that listeners are able to access all dimensions.
     
  18. amirm

    amirm Founder/Admin CFO (Chief Fun Officer) Staff Member

    Joined:
    Feb 13, 2016
    Messages:
    8,642
    Likes Received:
    1,214
    Location:
    Seattle Area
    It has been years since I have done any listening tests of such. Putting aside the crappy ones, I don't believe there are audible differences between them based on measurements.
     
    Sal1950 and oivavoi like this.
  19. amirm

    amirm Founder/Admin CFO (Chief Fun Officer) Staff Member

    Joined:
    Feb 13, 2016
    Messages:
    8,642
    Likes Received:
    1,214
    Location:
    Seattle Area
    We cannot convince them to even have a discussion about the proper way of doing the test (i.e. fast switching, level matching, etc.). So what we are left with is asking them if they would do the test on their own terms.

    As to the second one, I have passed many difficult tests under stringent conditions. I don't have much patience left these days :) but if pushed, I will put in the time to do it. And I can do better using such tools (e.g. ABX tests) than without, i.e. ad-hoc listening.
     
  20. SoundAndMotion

    SoundAndMotion Member

    Joined:
    Mar 23, 2016
    Messages:
    72
    Likes Received:
    10
    Location:
    Germany
    It has been said “To a man with a hammer, everything looks like a nail.”

    Amir, you are a man with a hammer. You are good with your hammer, even impressive. But not everything is a nail. Even a thin, cylindrical device 1 inch long and 1/16” wide with a flat head and pointed tip might be a nail or a wood screw. And your hammer is not a great tool for the wood screw.

    To know the right tool (measurement method, e.g. listening test), you must first know what is to be measured and why (for what purpose will the measurement be used).

    Your hammer is not universally applicable.
     