I am not sure the question "does phase shift matter" identifies what conditions are causing the phase shift, or what frequency range it is in, so I vote "it depends".
For example, the phase shift one sees caused by the time of flight to the microphone delays all frequencies by the same amount of time, so it does not alter the wave shape the speaker produced. The sound is delayed in time, but that delay is not audible.
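As a rough numerical sketch of that point (the sample rate, test frequencies, and 140-sample delay are my assumptions, not measurements): a pure delay gives a phase shift that grows in proportion to frequency, here -210 degrees at 200 Hz and -420 degrees at 400 Hz before wrapping, yet the wave shape comes through untouched.

```python
import numpy as np

fs = 48000            # sample rate in Hz (assumed)
n = 4800              # 0.1 s window, so 200 Hz and 400 Hz fit whole cycles
t = np.arange(n) / fs

# Asymmetric test wave: a fundamental plus a second harmonic.
x = np.sin(2 * np.pi * 200 * t) + 0.5 * np.cos(2 * np.pi * 400 * t)

# Time of flight for about 1 m at 343 m/s is roughly 2.9 ms, about 140 samples.
delay = 140
y = np.roll(x, delay)  # circular shift; an exact delay because x is periodic here

# Phase of the delayed signal relative to the original at each component.
X, Y = np.fft.rfft(x), np.fft.rfft(y)
bins = np.array([20, 40])                      # 200 Hz and 400 Hz (10 Hz per bin)
phase_deg = np.degrees(np.angle(Y[bins] / X[bins]))   # wrapped to +/-180
expected_deg = -360.0 * bins * delay / n       # linear in frequency, unwrapped

# Undoing the delay gives back the identical wave shape.
shape_preserved = np.allclose(np.roll(y, -delay), x)
```

The measured phase looks large, but it is all accounted for by one fixed time offset.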
On the other hand, the acoustic phase is what the driver did compared to the input signal when all the fixed propagation delays are subtracted. Maybe it is better to say that acoustic phase shift shows the degree to which the loudspeaker's internal delays differ at different frequencies.
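A minimal sketch of that subtraction, using an idealized model rather than a real measurement. The second-order high-pass standing in for the driver, its f0 and Q, and the 1 m distance are all assumptions chosen for illustration.

```python
import numpy as np

f = np.array([20.0, 50.0, 100.0, 1000.0])   # frequencies to inspect, Hz
w = 2 * np.pi * f

# Hypothetical driver: second-order high-pass, f0 = 50 Hz, Q = 0.707 (assumed).
f0, Q = 50.0, 0.707
w0 = 2 * np.pi * f0
s = 1j * w
H_driver = s**2 / (s**2 + s * w0 / Q + w0**2)

# Fixed propagation delay: 1 m at 343 m/s (assumed distance).
tau = 1.0 / 343.0
H_measured = H_driver * np.exp(-1j * w * tau)

# Remove the time of flight; what is left is the acoustic phase.
acoustic_phase_deg = np.degrees(np.angle(H_measured * np.exp(1j * w * tau)))
```

In this model the remaining phase is 90 degrees at f0, larger below it, and heads toward zero well above it, so the driver's own delay is different at different frequencies even after the fixed delay is gone.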
Dick Heyser (the fellow who first figured out a way to measure it) described acoustic phase as the change in physical depth position so far as time was concerned, the source moving forward and back.
It is this acoustic phase shift that either alters the input wave shape or preserves it in sound pressure. If you have two or more sources radiating, what you get is normally also location dependent, as the sources are at different distances depending on the listening position. Headphones would be an entirely different case than two ears. Two ears can often localize the depth / distance to the sources (and this identity is present or not while playing a stereo image).
It is my observation that if one looks at an "equal loudness curve", one sees where hearing is most acute, and so most likely to be sensitive to things in this range. There is no doubt in my mind that this shape is also related to the shape of your head, much as your ability to locate position is tied to the shape of your outer ear, etc.
Absolute phase is a tough nut too. Say you have a fairly broadband but asymmetric wave shape (a one-mic recording of a firework is an example). Play this through a speaker that already has several hundred degrees of acoustic phase shift across that spectrum and then reverse the polarity: you may or may not hear a difference, or the two may sound different without one being better than the other.
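To illustrate why the result is hard to predict, here is a toy sketch (a two-component wave of my own choosing, not the firework recording). Rotating only the second harmonic by 90 degrees leaves the magnitude spectrum alone but turns an asymmetric wave into a symmetric one, so a speaker with that much phase shift has already changed what a polarity flip would act on.

```python
import numpy as np

fs, f1 = 48000, 100          # sample rate and fundamental in Hz (assumed)
n = 4800                     # ten whole cycles of the fundamental
theta = 2 * np.pi * f1 * np.arange(n) / fs

# Asymmetric wave: positive peaks reach +0.75, negative peaks reach -1.5.
x = np.sin(theta) + 0.5 * np.cos(2 * theta)

# Rotate only the second harmonic by -90 degrees (an all-pass style change).
X = np.fft.rfft(x)
k2 = 2 * f1 * n // fs        # FFT bin of the second harmonic (200 Hz)
X_shifted = X.copy()
X_shifted[k2] *= np.exp(-1j * np.pi / 2)
y = np.fft.irfft(X_shifted, n)

def peaks(sig):
    """Return (largest positive, largest negative) sample values."""
    return float(sig.max()), float(sig.min())

peaks_original = peaks(x)    # asymmetric
peaks_inverted = peaks(-x)   # polarity flip swaps the two peak sizes
peaks_shifted = peaks(y)     # symmetric after the phase rotation
same_magnitude = np.allclose(np.abs(np.fft.rfft(y)), np.abs(X))
```

Flipping the polarity of `y` changes nothing about its peak sizes, while flipping `x` swaps them. This says nothing about what is audible, only about what happens to the wave shape.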
Take a simple speaker, correct the phase and magnitude (of the speaker, not the room) with an FIR filter, listen close in the near field, and you will likely hear the difference. To the degree the speaker disappears in the stereo image, the effect may be more audible.
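For anyone who wants to try the idea numerically, one common way to build such a correction is a regularized frequency-domain inverse with a modeling delay. The sketch below assumes a made-up three-tap impulse response in place of a real measurement, and the filter length, delay, and regularization value are arbitrary choices. A practical correction would start from a measured near-field response and would usually limit how much boost is applied.

```python
import numpy as np

# Hypothetical speaker impulse response (mixed phase, made up for illustration).
h = np.array([0.4, 0.8, -0.5])

N = 256          # length of the correction filter (assumed)
D = N // 2       # modeling delay in samples, leaves room for the acausal part
eps = 1e-9       # small regularization to avoid dividing by near-zero bins

H = np.fft.fft(h, N)
k = np.arange(N)
target = np.exp(-2j * np.pi * k * D / N)        # a pure delay of D samples
C = target * np.conj(H) / (np.abs(H) ** 2 + eps)
c = np.real(np.fft.ifft(C))                     # FIR correction coefficients

# Speaker followed by correction: ideally a single delayed impulse.
corrected = np.convolve(h, c)
```

After correction, both magnitude and phase are flat apart from the fixed delay `D`, which by the earlier argument does not alter the wave shape.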
Best,
Tom
A suitable test "signal" I made some years back: a tough signal to get right with speakers, but usually good with headphones.
https://www.dropbox.com/s/eik72wzv5hptq3r/fireworks 2013 last 6 min cd.wav?dl=0