MRC01
Major Contributor
...
Let me make one last argument. At the moment, we've arbitrarily broken this 21-trial test into 3 x 7-trial tests. Why not keep going, and break it down into 21 x 1-trial tests? What are the consequences of that according to the logic you're applying to the 3 x 7-trial tests?
If you agree with me that treating any n-trial test as n separate tests is the wrong approach, how can you justify in this case choosing 7 as the relevant number of trials in each sub-test? It seems completely arbitrary to me. And then impossible to avoid falling into a problem similar to Zeno's Paradox...
Yep, that's the same conundrum that I had in mind for the "independent events" approach.
... Let us try this thought experiment. ... Does this make sense?
I agree the "aggregate the trials" approach seems reasonable. Here's the conundrum it leads to. Perhaps we can resolve this.
Scenario: On day 1, Mary takes an ABX test consisting of 7 trials and gets 5 correct. On day 2, she takes another ABX test consisting of 8 trials and gets 5 correct. On day 3, she takes another ABX test consisting of 6 trials and gets 5 correct.
All 3 tests were properly performed and had the same selections for A and B.
Now the test agency doesn't tell Mary how many she got right. They only tell her the confidence scores of 77.3%, 63.7% and 89.1%. She asks us, what is the overall confidence of her results across all 3 tests? According to the "aggregate the trials" approach, it is impossible for us to compute this. To do that we must know the test details: the trials and scores of each test.
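For reference, those three confidence figures are consistent with a one-sided binomial tail under pure 50/50 guessing. Here is a minimal Python sketch, assuming that is how the agency computed them; the function name `guess_tail` is mine, not anything standard.

```python
from math import comb

def guess_tail(correct, trials):
    """Chance of getting at least `correct` right out of `trials` by 50/50 guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# (correct, trials) for Mary's day 1, day 2 and day 3 tests
tests = [(5, 7), (5, 8), (5, 6)]

# Confidence is taken here as 1 minus the guessing tail, expressed in percent
confidences = [round(100 * (1 - guess_tail(c, n)), 1) for c, n in tests]
print(confidences)
```

Run on Mary's three tests, this reproduces the 77.3%, 63.7% and 89.1% figures quoted above.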
So the conundrum is: we know the confidence of each test, yet we can't compute their joint probability? That doesn't seem right. Why can't we compute the joint probability? Event 1 is "Mary was guessing on test 1", and its probability is 22.7%. Event 2 is "Mary was guessing on test 2", and its probability is 36.3%. Event 3 is "Mary was guessing on test 3", and its probability is 10.9%. These events are independent, so their joint probability should be .227 * .363 * .109 = 0.9% chance that Mary was guessing on all 3 tests, or 99.1% confidence that she was not guessing on at least 1 of the tests.
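To put the two approaches side by side, here is a rough Python sketch. It assumes each per-test figure is a one-sided binomial tail under 50/50 guessing, and that "aggregate the trials" means pooling everything into a single 21-trial test with 15 correct. The helper name `guess_tail` is just for illustration.

```python
from math import comb, prod

def guess_tail(correct, trials):
    """Chance of getting at least `correct` right out of `trials` by 50/50 guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

tests = [(5, 7), (5, 8), (5, 6)]  # (correct, trials) per day

# "Independent events" approach: multiply the three per-test tails
product = prod(guess_tail(c, n) for c, n in tests)

# "Aggregate the trials" approach: one pooled test, 15 correct out of 21
total_correct = sum(c for c, _ in tests)
total_trials = sum(n for _, n in tests)
pooled = guess_tail(total_correct, total_trials)

print(f"product of tails: {product:.4f}")
print(f"pooled tail:      {pooled:.4f}")
```

With these inputs the product comes out near 0.9% and the pooled tail near 3.9%, so the two approaches really do give different answers for the same 21 trials.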