But I don't see any difference in principle between how instruments and speakers behave in small rooms. Speakers sound better in larger rooms, just like instruments do.
I think implicit in your comment is the notion that the only important sounds we hear are the direct sounds from the instrument or speaker itself. In the concert hall, we actually hear direct and reflected sound inseparably combined, and most of the audience hears considerably more reflected energy than direct. This has been demonstrated objectively many times.
Oversimplifying, large performance spaces have much, much longer reflected path lengths than small rooms, which introduces much more time delay, or reverb, into the perceived diffuse sound field. Reflections and distance from the stage also alter the frequency balance, generally attenuating the highs and adding more emphasis to the bass - so-called "warmth". Different halls do this to different degrees, and some are traditionally considered acoustically "ideal" in terms of how they influence the sound in the audience for orchestral music - the Concertgebouw, Symphony Hall Boston, the Musikvereinsaal, etc. Attempts by acousticians to reproduce the specific sound parameters of those halls in newly designed ones via measurement, etc. have had decidedly mixed results. Nonetheless, typical symphony or chamber venues have more in common than they have differences, owing to the large space and the typical performance arrangement they are dealing with, though some are considered better than others.
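To put rough numbers on the path-length point, here is a minimal sketch of how much later a reflection arrives than the direct sound. It assumes sound travels at about 343 m/s (dry air near 20 C). The function name and the example path lengths are my own illustrative guesses, not measurements of any real room or hall.

```python
# Speed of sound in dry air at roughly 20 C (assumed constant here).
SPEED_OF_SOUND_M_S = 343.0


def reflection_delay_ms(direct_path_m, reflected_path_m):
    """Return how many milliseconds a reflection lags the direct sound."""
    if reflected_path_m < direct_path_m:
        raise ValueError("a reflected path cannot be shorter than the direct path")
    extra_path_m = reflected_path_m - direct_path_m
    return extra_path_m / SPEED_OF_SOUND_M_S * 1000.0


# Hypothetical (direct, reflected) path lengths in meters, for illustration only.
examples = {
    "listening room": (3.0, 5.0),
    "concert hall": (20.0, 35.0),
}

for name, (direct, reflected) in examples.items():
    print(f"{name}: {reflection_delay_ms(direct, reflected):.1f} ms")
```

With these made-up distances, the room reflection trails the direct sound by only a few milliseconds, while the hall reflection trails it by tens of milliseconds. Real rooms and halls have many reflections at many path lengths, so treat this as an order-of-magnitude illustration only.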
Bottom lines:
-- it is not about speakers or instruments. What makes a large concert venue sound different is the direct sound reaching our ears inseparably coupled with delayed, reflected sound from the hall.
-- no way on earth can a listening room or small venue simulate the reflected sound field of a concert venue, even for chamber or solo music. The path lengths in a listening room are hopelessly short, so reflections arrive with too little time delay and without the tonal balance contributed by reflections and attenuation in air over the greater distances in a large hall.
-- the reflected sound field in the hall is also spatially multidimensional in xyz space, and it is important to keep the spatial directions of the reflected sound field properly oriented in xyz space on playback for maximum faithfulness to the live sound.
-- stereo directs all sound at the listener from the front only, which is a small fraction of even the 360 degree xy space heard in the hall, so it loses the directional information of much of the perceived spatial sound field.
-- an excellent, viable compromise is discretely recorded 5.1/7.1 multichannel (Mch) sound. While limited to xy only, not xyz, it can capture the reflected sounds and direct them at the listener on playback while preserving much of the 360 degree spatial aspect of the sound in the hall via direct and phantom imaging.
-- possibly, Immersive 3D in xyz space is better still, but it has not yet been commercially viable for music recordings, unlike xy Mch which enjoys a persistent, though small, commercial niche in classical music.