Statistics of ABX Testing

Fitzcaraldo215 · Mar 21, 2016

Thanks, Amir. I do not disagree with any of your points in the paper cited. No matter what you do, all testing protocols on human subjects are imperfect, even with perfect, unbiased test design, which is quite hard to achieve with any methodology, as the paper shows. And, if you do not do a rigorous and accurate statistical analysis and show your work, you do not have much of true value.
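To make "show your work" concrete, here is a minimal sketch of the usual ABX calculation: the one-sided binomial probability of scoring at least as well as observed by pure guessing. The function name `abx_p_value` is hypothetical, and the sketch assumes independent trials with a 50% chance of a correct guess on each one.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` right out of `trials` by guessing.

    Assumes each trial is an independent coin flip (p = 0.5), which is the
    usual null hypothesis for an ABX test.
    """
    if trials < 0 or not 0 <= correct <= trials:
        raise ValueError("need 0 <= correct <= trials")
    # Count every outcome with `correct` or more hits, out of 2**trials equally likely outcomes.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Example: 12 correct answers in 16 trials.
    print(f"p = {abx_p_value(12, 16):.4f}")
```

A small p-value only says that guessing is an unlikely explanation for the score; it says nothing about whether the test itself was set up without flaws.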

fas42 · Mar 21, 2016

The most powerful tool I've found so far is Audacity - I've mentioned this before - simply import all the tracks to create a multi-track project, make sure they're well aligned - and select a short, key section that highlights differences, and leave running on continual repeat the playing of that snippet. Mute all but one track, and then at some point quickly switch the un-muting to another track. Differences leap out at you, it becomes easy peasy to pick the variations ...

John Kenny · Mar 31, 2016

ABX has strengths in differentiating amplitude/frequency differences that can be easily revealed by instant A/B switching much like if we take two swatches of colour it's not so easy to tell them apart if they are not side by side in our vision but bring them together side by side & they can be easily differentiated. So successful ABX testing does rely on this quick A/B switching of sounds still in echoic memory (the duration of which is debatable & variable depending on cognitive load).

One of the issues I see with ABX testing is that, as Amir said, if we can't identify a specific short section of an audio track that reveals the difference, we are most likely not going to identify a difference. I've seen it stated that just going with our "feeling" of a difference between tracks can reveal a difference exists but I have my doubts about the efficacy of this approach.

Identifying a specific audio section for comparison is the visual equivalent of finding a colour swatch or segment somewhere across two videos that we can use to A/B. This is not simple for video & even more difficult for audio due to the way the perceptions differ - we can freeze a video & look at frame stills for comparison - we can't do that with audio - with audio we are being asked to find the difference section in a constantly changing audio stream.

So the first problem I see is zoning in on possible areas that might reveal differences & examining them - this requires prior knowledge/experience of the sound of distortions/artifacts/issues & the discipline to search for & find such differences in the two audio streams. The other problem related to this is that we are all limited in our experience of the sound of artifacts/distortions - hence, without training in what they sound like we are limited in what we can differentiate.

The next problem I see with ABX is that more of the differentiating issues between modern audio playback are not identifiable in short audio segments but are more about the quality of the perceptual illusion. You may find what I have to say next is too controversial for here & it should be in "Fight Club", so I'll just mention it here & maybe do a full exposition of my position in Fight Club. I'm of the opinion, based on my experience, that modern audio playback's typical differences at the higher end of things are not easily identified in a short segment - it is more to do with quality differences between the illusions created by the playback systems - things like soundstage depth, solidity, etc. (I don't want to get into too much controversial stuff here). My main point is that to differentiate between playback versions we need to listen to a longer segment of audio to judge these elements & hence ABX is an unsuitable tool for evaluating these differences.

Sorry if this is the wrong place to post this, Amir - I know it's not about ABX stats, but then some of the other posts aren't either - it arose from the thoughts that came to me when I read the various posts in here - feel free to move it if you think it should be elsewhere.

Dynamix · Apr 2, 2016

John Kenny said:
ABX has strengths in differentiating amplitude/frequency differences that can be easily revealed by instant A/B switching much like if we take two swatches of colour it's not so easy to tell them apart if they are not side by side in our vision but bring them together side by side & they can be easily differentiated. So successful ABX testing does rely on this quick A/B switching of sounds still in echoic memory (the duration of which is debatable & variable depending on cognitive load).

One of the issues I see with ABX testing is that, as Amir said, if we can't identify a specific short section of an audio track that reveals the difference, we are most likely not going to identify a difference. I've seen it stated that just going with our "feeling" of a difference between tracks can reveal a difference exists but I have my doubts about the efficacy of this approach.

Identifying a specific audio section for comparison is the visual equivalent of finding a colour swatch or segment somewhere across two videos that we can use to A/B. This is not simple for video & even more difficult for audio due to the way the perceptions differ - we can freeze a video & look at frame stills for comparison - we can't do that with audio - with audio we are being asked to find the difference section in a constantly changing audio stream.

So the first problem I see is zoning in on possible areas that might reveal differences & examining them - this requires prior knowledge/experience of the sound of distortions/artifacts/issues & the discipline to search for & find such differences in the two audio streams. The other problem related to this is that we are all limited in our experience of the sound of artifacts/distortions - hence, without training in what they sound like we are limited in what we can differentiate.

The next problem I see with ABX is that more of the differentiating issues between modern audio playback are not identifiable in short audio segments but are more about the quality of the perceptual illusion. You may find what I have to say next is too controversial for here & it should be in "Fight Club", so I'll just mention it here & maybe do a full exposition of my position in Fight Club. I'm of the opinion, based on my experience, that modern audio playback's typical differences at the higher end of things are not easily identified in a short segment - it is more to do with quality differences between the illusions created by the playback systems - things like soundstage depth, solidity, etc. (I don't want to get into too much controversial stuff here). My main point is that to differentiate between playback versions we need to listen to a longer segment of audio to judge these elements & hence ABX is an unsuitable tool for evaluating these differences.

Sorry if this is the wrong place to post this, Amir - I know it's not about ABX stats, but then some of the other posts aren't either - it arose from the thoughts that came to me when I read the various posts in here - feel free to move it if you think it should be elsewhere.

Who said that ABX tests have to be short? You can set up an ABX test to last as long as you like.

And more importantly; if you are so sceptical about ABX tests, then why did you use the results of ABX tests to support your claims in another thread just the other day? Make up your mind. You either think they're valid, or you don't. Which is it?

John Kenny · Apr 2, 2016

Dynamix said:
Who said that ABX tests have to be short? You can set up an ABX test to last as long as you like.

The idea of the strength of A/B switching is the use of echoic memory & hence differences are much easier to perceive. If you deny this & start suggesting otherwise you are losing the advantage that is inherent to this quick switching. I presume you don't deny quick switching is a more sensitive technique?

And more importantly; if you are so sceptical about ABX tests, then why did you use the results of ABX tests to support your claims in another thread just the other day? Make up your mind. You either think they're valid, or you don't. Which is it?

It's subtle so try to follow me - when you get a positive ABX result & it has been investigated & found not to be the product of something wrong with the test, then you have to accept the validity of the test.

However, 99% of home-run ABX tests are flawed, & that explains the high percentage of null results, because a flawed test will almost always default to the null result.

Dynamix · Apr 2, 2016

John Kenny said:
It's subtle so try to follow me - when you get a positive ABX result & it has been investigated & found not to be the product of something wrong with the test, then you have to accept the validity of the test.

So what you're saying is that any ABX test that backs up your preconceived beliefs is valid, and any ABX test that does not is flawed. OK. Because you are on record claiming that ABX tests are of no use in audio, so that's the only way I can interpret your last post.

AJ Soundfield · Apr 2, 2016

Dynamix said:
Who said that ABX tests have to be short? You can set up an ABX test to last as long as you like.

Yeah, but not being able to peek causes all sorts of problems for John et al. The irony is that he claims some "effects" like timbre only show up in "long term" view-listening.
The exact opposite is true in reality: we will be able to pick out differences in timbre and spectral qualities in quick switching; within about 15 mins, we'll have adapted to those differences!
Not sure if you've seen the Dr Griesinger slide http://www.davidgriesinger.com/intermod.ppt

Dynamix said:
And more importantly; if you are so sceptical about ABX tests, then why did you use the results of ABX tests to support your claims in another thread just the other day? Make up your mind. You either think they're valid, or you don't. Which is it?

Yeah, it's tough to figure out why they are total stupidity...except when "Vital" on PFM tea-towels or some ad hoc "test" at an audio show gets the desperately desired positive-ish results John wishfully thinks for.
I think it's:
Positives for belief - ABX = good
Negatives for belief - ABX = bad

cheers,

AJ

John Kenny · Apr 2, 2016

Dynamix said:
So what you're saying is that any ABX test that backs up your preconceived beliefs is valid, and any ABX test that does not is flawed. OK. Because you are on record claiming that ABX tests are of no use in audio, so that's the only way I can interpret your last post.

Like I already said - it's simply two statements
- ABX tests that return a positive result which are not flawed are actually confirming that a difference can be heard
- 99% of home-run ABX tests are flawed

Blumlein 88 · Apr 2, 2016

AJ Soundfield said:
Not sure if you've seen the Dr Griesinger slide http://www.davidgriesinger.com/intermod.ppt

cheers,

AJ

Yes that is a nice powerpoint for anyone who has not seen it.

March Audio · Apr 2, 2016

Dynamix said:
So what you're saying is that any ABX test that backs up your preconceived beliefs is valid, and any ABX test that does not is flawed. OK. Because you are on record claiming that ABX tests are of no use in audio, so that's the only way I can interpret your last post.

Classic. I was going to add something, but you sum it up.

Subjectivists don't like being tested because they know they will fail to live up to their delusions of aural capability.

John Kenny · Apr 2, 2016

AJ Soundfield said:
Yeah, but not being able to peek causes all sorts of problems for John et al. The irony is that he claims some "effects" like timbre only show up in "long term" view-listening.
The exact opposite is true in reality: we will be able to pick out differences in timbre and spectral qualities in quick switching; within about 15 mins, we'll have adapted to those differences!
Not sure if you've seen the Dr Griesinger slide http://www.davidgriesinger.com/intermod.ppt

Well, if I said timbre, I may be wrong - I'm willing to admit that - but I feel that there are certain characteristics of sound that quick A/B switching is not conducive to differentiating, & Griesinger agrees in his last slide: "intelligibility, muddiness & envelopment all may depend on the time period devoted to listening to a particular acoustic signal"

Yeah, it's tough to figure out why they are total stupidity...except when "Vital" on PFM tea-towels or some ad hoc "test" at an audio show gets the desperately desired positive-ish results John wishfully thinks for.
I think it's:
Positives for belief - ABX = good
Negatives for belief - ABX = bad

cheers,

AJ

It's already explained a number of times in as clear a manner as I can - I can't help your understanding shortfall.

amirm · Apr 2, 2016

Post deleted. Members should note that with the creation of the Fight Club, non-cordial, argumentative talk is not tolerated in the rest of the forum.

Dynamix · Apr 2, 2016

John Kenny said:
- ABX tests that return a positive result which are not flawed are actually confirming that a difference can be heard

And what about ABX tests that return a negative result which are not flawed? Tests that are not half-arsed home tests. There are a lot of those, and the question is; are you going to keep ignoring them because they may not support your beliefs? I mean, that's what you've been doing in the past.

Again: Are ABX tests, when properly executed, valid or are they not? Regardless of the outcome. Take a stand. No ambiguity, yes or no?

March Audio · Apr 2, 2016

John Kenny said:
You may find what I have to say next is too controversial for here & it should be in "Fight Club" so I'll just mention it here & maybe do a full exposition of my position in Fight Club. I'm of the opinion, based on my experience, that modern audio playback's typical differences at the higher end of things, are not easily identified

It's not controversial in the slightest. You are articulating what has been obvious to most non-subjectivists for a long time - that the differences between much of the good quality kit out there are totally insignificant. Also that subjectivists tend to greatly exaggerate those differences where they do exist, assuming that they haven't been entirely imagined due to the biases of sighted listening.

John Kenny · Apr 2, 2016

Dynamix said:
And what about ABX tests that return a negative result which are not flawed? Tests that are not half-arsed home tests. There are a lot of those, and the question is; are you going to keep ignoring them because they may not support your beliefs? I mean, that's what you've been doing in the past.

but do you not also ignore null results?

Again: Are ABX tests, when properly executed, valid or are they not? Regardless of the outcome. Take a stand. No ambiguity, yes or no?

of course they are valid, in so far as they are suited to what's being tested- I never said otherwise

John Kenny · Apr 2, 2016

You are cutting off my post in mid-sentence to change its meaning. This is not clever & is just meant to incite anger. It will be reported.

BE718 said:
It's not controversial in the slightest. You are articulating what has been obvious to most non-subjectivists for a long time - that the differences between much of the good quality kit out there are totally insignificant. Also that subjectivists tend to greatly exaggerate those differences where they do exist, assuming that they haven't been entirely imagined due to the biases of sighted listening.

Dynamix · Apr 2, 2016

John Kenny said:
but do you not also ignore null results?

of course they are valid, in so far as they are suited to what's being tested- I never said otherwise

No, I don't ignore null results, as long as the tests aren't flawed, and there is a clear statistical significance. Sure, I get the whole "you can't prove a negative" thing, but I'm still not prepared to allow subjectivists to use it as some sort of "get out of jail card". That's not how it works.
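One way to judge how much weight a null result can carry is the statistical power of the test: the chance that a listener who really does hear a difference would reach significance. The sketch below is only an illustration under stated assumptions. The names `tail_prob`, `critical_score` and `abx_power` are hypothetical, `p_true` (the listener's real per-trial hit rate) is something you have to assume rather than measure, and trials are treated as independent.

```python
from math import comb


def tail_prob(min_correct: int, trials: int, p: float) -> float:
    """Probability of at least `min_correct` hits in `trials` when each hit has chance `p`."""
    return sum(
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(min_correct, trials + 1)
    )


def critical_score(trials: int, alpha: float = 0.05):
    """Smallest score whose guessing probability is <= alpha, or None if no score qualifies."""
    for score in range(trials + 1):
        if tail_prob(score, trials, 0.5) <= alpha:
            return score
    return None


def abx_power(trials: int, p_true: float, alpha: float = 0.05) -> float:
    """Chance that a listener with hit rate `p_true` reaches significance (assumed model)."""
    score = critical_score(trials, alpha)
    if score is None:
        # The test is too short for any score to count as significant.
        return 0.0
    return tail_prob(score, trials, p_true)


if __name__ == "__main__":
    for p_true in (0.6, 0.7, 0.8, 0.9):
        print(f"16 trials, assumed hit rate {p_true}: power = {abx_power(16, p_true):.2f}")
```

With 16 trials and a 0.05 threshold, a listener needs at least 12 correct. If the power at a plausible hit rate is low, a null result from that test is weak evidence that no audible difference exists.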

"in so far as they are suited to what's being tested". I was expecting something like that. Of course you are going to keep cherry-picking the results you are comfortable with, and denounce any results you don't like as not being "suited to what's being tested". I'd expect nothing less. And you've said otherwise plenty of times. That's why I (and apparently several others who are familiar with your Modus Operandi) was so surprised when you tried to use ABX test results as an argument a few days ago.

You could have saved yourself the trouble, and just typed "no".

March Audio · Apr 2, 2016

Dynamix said:
No, I don't ignore null results, as long as the tests aren't flawed, and there is a clear statistical significance. Sure, I get the whole "you can't prove a negative" thing, but I'm still not prepared to allow subjectivists to use it as some sort of "get out of jail card". That's not how it works.

"in so far as they are suited to what's being tested". I was expecting something like that. Of course you are going to keep cherry-picking the results you are comfortable with, and denounce any results you don't like as not being "suited to what's being tested". I'd expect nothing less. And you've said otherwise plenty of times. That's why I (and apparently several others who are familiar with your Modus Operandi) was so surprised when you tried to use ABX test results as an argument a few days ago.

You could have saved yourself the trouble, and just typed "no".

It's quite amazing, as one travels around the forums, just how much subjectivists contrive and twist the information to suit their own ends. Ever watched YouTube videos of Richard Dawkins v. creationists?

Yes, it was surprising that he chose to use an ABX test, as he has railed against them for so long.

Dynamix · Apr 2, 2016

BE718 said:
Yes, it was surprising that he chose to use an ABX test, as he has railed against them for so long.

Yeah, my first thought when I saw that was "eeeh, what?"

And I must admit, I much preferred Hitchens (RIP) over Dawkins. Hitch was the man.

amirm · Apr 2, 2016

Thread cleaned.

Gentlemen, these are supposed to be comments to the articles I have written. Those articles are shown in the home page of this forum. Someone clicking on them MUST NOT see bickering, in-fighting, finger pointing, etc. They need to see people expand and add on the information, and have fun discussing it.

Whatever personal friction exists between membership belongs in the Fight Club.

If I see members just interested in fighting, I will change their permissions to only see and participate in Fight Club forum.
