It is actually not difficult to understand why different cone/dome materials may sound different as long as you do not use static thinking to try to understand dynamic behaviour.
There is an excitation signal sent to the speaker which causes the driver to move, ideally in an exact copy of the electrical input.
Other parts radiate sound too, such as the surround, and not in ways that exactly follow the excitation signal sent to the driver.
At low frequencies the accuracy of driver movement depends more on the linearity of the magnetic and electrical circuit and of the suspension (surround and spider). Once the displacement is small enough for these to be working in a reasonably linear part of their range, the cone/dome will follow the signal accurately up to the point where structural resonances in it start producing inaccuracies. This is the point where the cone/dome material must be influencing the sound; it is no longer a question of whether it does, but of how much.
There are two principal ways of dealing with this inevitable product of physics. One is to damp the resonant peaks to an "acceptable" level; the other is to use the driver only in its non-resonant range, then cross it over to a smaller driver with its first resonance at a higher frequency.
In the frequency range below any resonance I can't think of any reason why one cone material could sound any different to any other. Maybe somebody else can?
Once in a frequency range where resonant effects are influencing the output, I can't think of any reason one cone material would sound the same as another; it is pretty well not credible.
So what affects the resonances and their amplitude? Cone/dome shape, mass, material stiffness and internal damping (plus external damping, if applied).
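To put rough numbers on how mass, stiffness and damping interact, here is a minimal sketch that treats one cone resonance as a single mass-spring-damper. That is a big simplification and my own assumption: a real cone is a continuous structure with many modes, and its shape matters too. The function names and the example values are hypothetical, chosen only for illustration.

```python
import math


def resonance_hz(stiffness, mass):
    """Undamped resonant frequency of a lumped mass-spring system, in Hz.

    stiffness in N/m, mass in kg.
    """
    return math.sqrt(stiffness / mass) / (2 * math.pi)


def q_factor(stiffness, mass, damping):
    """Quality factor: higher Q means a taller, narrower resonant peak.

    damping is the viscous damping coefficient in N*s/m.
    """
    return math.sqrt(stiffness * mass) / damping


def relative_gain(freq_hz, stiffness, mass, damping):
    """Displacement response relative to the static (very low frequency) response."""
    r = freq_hz / resonance_hz(stiffness, mass)
    q = q_factor(stiffness, mass, damping)
    return 1.0 / math.sqrt((1 - r ** 2) ** 2 + (r / q) ** 2)


if __name__ == "__main__":
    # Hypothetical values: same mass and stiffness, two levels of damping.
    mass, stiffness = 0.001, 1.0e6
    f0 = resonance_hz(stiffness, mass)
    for label, damping in (("lightly damped", 1.0), ("heavily damped", 10.0)):
        well_below = relative_gain(f0 / 10, stiffness, mass, damping)
        at_peak = relative_gain(f0, stiffness, mass, damping)
        print(f"{label}: gain at f0/10 = {well_below:.3f}, gain at f0 = {at_peak:.1f}")
```

In this toy model the two damping levels give almost the same response a decade below the resonance, while at the resonance the peak height equals Q and so differs by the full ratio of the damping values. Changing stiffness or mass moves the resonant frequency instead.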
For most of my lifetime there have been no practical and affordable materials with mechanical properties allowing operation with entirely resonance free radiation.
There are other interesting effects. One is to make use of the fact that things resonate by cleverly exciting a lot of the higher modes, so there are far more but much smaller peaks. NXT and BMR drivers exploit this.
I suspect things like ribbon, AMT and Linaeum tweeters actually radiate in ways nothing like the "static thinking" explanations we have all seen!
As an aside, I have often seen it mentioned that objects in different materials sound different when struck, as an explanation of why they would sound different in speaker drivers. But, of course, when you strike an object it resonates, and that is the frequency region loudspeaker material choice is trying to avoid. So the argument is false as long as the driver is used below any resonance, but true in the resonant radiation range, depending on damping.