Two key words: Target and precision.
The target is neutrality, and precision is how close you get to that target.
Both Genelec and AES Fellow John Watkinson have made great contributions to our understanding of that target. Both argue for the point source ideal. My experience tells me they are on to something. Implicitly, most of the weight is on the speakers and less on the (already transparent enough) electronics.
My own experience tells me that the size of the drivers (and of course the headroom of the supporting electronics) matters. Size matters. And I believe it’s better to operate with a margin of safety than with too little margin or none at all (i.e. rather too big than too small drivers, rather too much power/headroom than too little).
Precision around a target of neutrality, applied to well-sized drivers, creates a great playback vehicle.
A great playback vehicle should NOT have the creation of emotions as its target or ideal. Emotions and audio are two separate things.
But I know that there are people who may demur from this. They, I think, would say that there is no logical reason for it to work like that: Two point sources are not going to recreate the original sound field, so maybe something more is needed. They hope that this multi-dimensional undefinable 'something' may already have been stumbled upon by the practitioners of the ancient craft of speaker building.
Their method for rendering the quest scientific is to take existing speakers and play them at listeners in controlled experiments and ask them to rate them. This is science apparently - but maybe the hypothesis part is somewhat flimsy...
I agree that the simplest possible approach you outline above is pleasingly elegant (often indicative of 'rightness'), was what Blumlein originally had in mind(?) in his patent, and seems to work outstandingly well in my non-scientific, holistic experience.
N.B. 'Simplest' does not mean simple hardware. The simplest system is the one that linearises everything as much as possible, which may require complexity in the hardware.