It's the basso continuo that tends to set the rhythmic foundation in Baroque music. It's possible that the room modes are smearing this.
I would say this is actually the most likely root cause. With specific test signals you can hear the "rhythm" of high-Q room modes very well. High Q ultimately means a long settling time to a new equilibrium state whenever the input to that resonance changes amplitude or phase.
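For a rough sense of scale, here is a back-of-the-envelope estimate. It assumes a single room mode behaves like an ideal second-order resonator with center frequency $f_0$ and quality factor $Q$, which is an idealization (real rooms have many overlapping modes). The amplitude envelope then settles with time constant $\tau$, and the time to decay by 60 dB follows from it:

$$
\tau = \frac{Q}{\pi f_0}, \qquad T_{60} = \ln(1000)\,\tau \approx \frac{2.2\,Q}{f_0}
$$

With purely illustrative numbers, $f_0 = 40\ \text{Hz}$ and $Q = 20$ give $\tau \approx 0.16\ \text{s}$ and $T_{60} \approx 1.1\ \text{s}$, which is long compared to the note lengths of a typical bass line.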
The test signal I mean is a sine with short envelope "dropouts" that repeat periodically (roughly every 500 milliseconds): a fast, smooth dive to zero amplitude and back over a few periods of the sine, shaped by a raised-cosine envelope (or "time window"). After the zero point the sine either continues to run with the same phase or with a 180° phase offset (inverted). This covers two prominent corners of state changes, a small one and a very large one. It's a really unforgiving hardcore test signal, be warned ;-) ...
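A minimal sketch of how such a signal could be generated, in plain Python with no dependencies. This is not the original generator: the function name `dropout_sine`, the 48 kHz sample rate, the default of 4 sine periods per dropout, and the choice to toggle polarity at every zero point in the inverted variant are all assumptions made for illustration.

```python
import math

def dropout_sine(freq_hz, fs=48000, duration_s=4.0, dropout_period_s=0.5,
                 dropout_cycles=4, invert_after_dropout=False):
    """Sine with periodic raised-cosine dropouts (all defaults are assumptions)."""
    n_total = int(round(duration_s * fs))
    period_n = int(round(dropout_period_s * fs))
    # Half-length of one dropout in samples; the full dip spans 2 * half_n.
    half_n = int(round(dropout_cycles * fs / freq_hz / 2))
    center = period_n // 2  # the zero point sits mid-way through each period
    if half_n < 1 or half_n > center:
        raise ValueError("dropout does not fit into one repetition period")

    samples = []
    for n in range(n_total):
        k, pos = divmod(n, period_n)
        offset = pos - center
        if abs(offset) < half_n:
            # Raised-cosine dip: 1 at the edges, exactly 0 at the center.
            x = (offset + half_n) / (2 * half_n)
            env = 0.5 * (1.0 + math.cos(2.0 * math.pi * x))
        else:
            env = 1.0
        # Assumption: in the inverted variant, polarity toggles at every zero point.
        zero_points_passed = k + (1 if pos >= center else 0)
        flip = invert_after_dropout and zero_points_passed % 2 == 1
        sign = -1.0 if flip else 1.0
        samples.append(sign * env * math.sin(2.0 * math.pi * freq_hz * n / fs))
    return samples

same_phase = dropout_sine(50.0, duration_s=1.0)
inverted = dropout_sine(50.0, duration_s=1.0, invert_after_dropout=True)
```

Because the polarity change happens exactly where the envelope is zero, the inverted variant has no click of its own; whatever the room does after the dropout comes from the resonances having to settle to the new phase.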
Using a set**) of those signals in 1/24-octave spacing over a range of 20 Hz to 400 Hz exposes both the steady-state signature (strong amplitude peaks or troughs/notches) and the rhythmic distortion/smear that transients (fast state changes) undergo. Feeding a mono signal to both speakers of a stereo setup additionally unveils amplitude and phase L/R differences at the listening position, highlighting them almost to an extreme. Finally, introducing a polarity flip on one of the output channels specifically tests the buildup of resonances for L/R *difference* signals, which excite different modal patterns than the mono *common* portion of the signal sent to all speakers.
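To make the spacing and the channel routing concrete, here is a small sketch in plain Python. The helper names `fractional_octave_frequencies` and `mono_to_stereo` are hypothetical, and anchoring the grid at exactly 20 Hz is an assumption; the original files may have used a different grid.

```python
import math

def fractional_octave_frequencies(f_low=20.0, f_high=400.0, steps_per_octave=24):
    """Frequencies f_low * 2**(k / steps_per_octave) that do not exceed f_high."""
    k_max = int(math.floor(steps_per_octave * math.log2(f_high / f_low)))
    return [f_low * 2.0 ** (k / steps_per_octave) for k in range(k_max + 1)]

def mono_to_stereo(mono, flip_right=False):
    """Return (left, right) sample pairs; flip_right inverts the right channel."""
    gain_r = -1.0 if flip_right else 1.0
    return [(s, gain_r * s) for s in mono]

freqs = fractional_octave_frequencies()
common = mono_to_stereo([0.0, 0.5, -0.25])                       # L = R
difference = mono_to_stereo([0.0, 0.5, -0.25], flip_right=True)  # L = -R
```

With `flip_right=False` both speakers get the identical *common* signal; with `flip_right=True` the pair carries a pure *difference* signal. Which channel gets the flip is arbitrary here.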
The perceived difference between speaker and headphone playback is typically huge in all these regards, even in well-treated rooms. It really seems a mystery why we can still enjoy speakers in a room and even prefer them to headphones. Today I'm more inclined to think that moderate amounts of "room patterning" actually help perception rather than disturb it. One example is the perceived pitch of notes. I always find it easier to recognize pitch changes with speakers, without exception so far. Fretless electric bass players, anyone? Intonation control is always much better for me with speakers; that's where I notice the effect very clearly.
**) I once made a nice set of those test files for intuitive use as a live test CD. I need to see if I can find and upload them again; the original server where they could be downloaded is gone now...