Final comment:
@voodooless You can't say something is broken by design for lacking a feature if nothing stops you from adding it. Something is broken by design when it allows you neither to add to it nor to subtract from it. The latter is the issue with delta-sigma DACs: oversampling is innate to their function. With a filterless NOS DAC you get to decide what kind of reconstruction filter is used (IIR, FIR, high order, low order, etc.), and, more than that, how much you affect the impulse response. Chip DACs have filter modes named 'NOS', but we know they are still oversampling. Onboard oversampling is only so convenient for filtering's sake when it leaves you with irreversible impulse ringing, even in the case of playing high-bit-depth, high-sample-rate material from a correct ADC.
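To illustrate the trade-off being described: a linear-phase FIR reconstruction filter has an impulse response that is symmetric about its center, so any ringing appears before the main lobe as well as after it. A minimal pure-Python sketch of a windowed-sinc lowpass; the tap count and cutoff are arbitrary illustration values, not any particular DAC's filter:

```python
import math

def windowed_sinc_lowpass(num_taps=31, cutoff=0.5):
    """Linear-phase FIR lowpass: a sinc truncated by a Hann window.
    cutoff is a fraction of the Nyquist frequency."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        core = cutoff if x == 0 else math.sin(math.pi * cutoff * x) / (math.pi * x)
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(core * hann)
    return taps

h = windowed_sinc_lowpass()
# The response is symmetric about the center tap, so ringing shows up
# both before and after the main lobe: with a linear-phase filter,
# pre-ringing is the price of that symmetry.
assert all(abs(h[n] - h[-1 - n]) < 1e-9 for n in range(len(h)))
assert h[len(h) // 2] == max(h)
```

A minimum-phase design avoids the pre-ringing at the cost of phase linearity, which is exactly the kind of choice a user keeps when the filtering is done outside the DAC.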
Related problems are what's wrong with class D amps, because all of the transistor's slew rate is spent on switching. It's not the switching itself that is the problem, but that, because it is switching, the output cannot be directly controlled by the input; the input has to be probed against a carrier to activate the switch. This probing frequency results in quantization error, even in the case of 'digital amplifiers'.
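The carrier-comparison step can be shown with a toy model of naturally sampled PWM: the output is only ever fully on or fully off, switching where the input crosses a triangle carrier, and the audio is recovered as the short-term average of that switched waveform. This is a deliberately simplified sketch (no feedback loop, ideal switches, arbitrary carrier frequency), not a model of any real amplifier:

```python
import math

def triangle(t, f):
    """Unit triangle carrier in [-1, 1] at frequency f."""
    phase = (t * f) % 1.0
    return 4 * phase - 1 if phase < 0.5 else 3 - 4 * phase

def pwm(t, f_carrier, signal):
    """Naturally sampled PWM: output is +1 or -1 depending on whether
    the input is above the carrier at that instant."""
    return 1.0 if signal(t) > triangle(t, f_carrier) else -1.0

# A 1 kHz sine modulating a 400 kHz carrier (illustration values).
sig = lambda t: 0.8 * math.sin(2 * math.pi * 1000 * t)
f_c = 400e3

# Averaging the switched output over one carrier period recovers the
# slowly varying input level, which is roughly what the output LC
# filter does; detail between crossings is only represented via the
# crossing positions.
t0 = 1e-4
n = 1000
avg = sum(pwm(t0 + k / (n * f_c), f_c, sig) for k in range(n)) / n
assert abs(avg - sig(t0)) < 0.05
```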
@danadam All your comment shows is that amirm was first to bring it up, with his reply linking his video on phase; it occurs in the video, and I didn't skip it. Then @Geert said that absolute time resolution isn't indicated by the reference I provided, that it pertains strictly to spatial hearing. I made a comment about how the number hasn't really changed even after further investigation.
In terms of temporal resolution (not regarding differences between the two ears, i.e. spatiality) there are two camps of numbers: essentially milliseconds, but with a lot of range (0.1, 0.5, 1, 2, 3), and tens of microseconds (22.7, 25). I subscribe to the latter.
Milliseconds:
"A quick lit search gives me 1.2ms (Irwin & Purdy, 1982). Also, people were shown to be able to recognize voice when stimuli were only 2ms long (Suied, et al., 2013)!" From the reddit link below.
"Hearing is one of the most sensitive of our senses, and even small issues in sound quality can interfere with listening experiences." (medicalxpress.com)
Microseconds: "I would cut your 1/20,000 in half actually to 1/40,000. Just because that...", "Well I am not really aware of such studies. I can direct you to this file...", "If you create a sound file at 44100 Hz with only silence in it, and set exactly one sample...", "This sounds like a guess to me. What biological processes..." etc.
From:
https://www.reddit.com/r/askscience/comments/2cbogd
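For what it's worth, the microsecond figures quoted above line up with familiar signal periods; this is my own connection, not necessarily what those posters intended:

```python
# One sample period at common rates, in microseconds:
period_44k = 1 / 44100 * 1e6          # ~22.68 us, one sample at 44.1 kHz
period_40k = 1 / 40000 * 1e6          # 25 us, one sample at 40 kHz
half_cycle_20k = 1 / 20000 / 2 * 1e6  # 25 us, half a cycle of 20 kHz
print(period_44k, period_40k, half_cycle_20k)
```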
I may edit this in the future to add a picture of the waveform I described earlier.
Let's say, going simply by the tonal limit of our hearing, that 20 kHz, or 1/20,000th of a second, is the ceiling of our ability to sense a rise-time difference on a frequency like 1 kHz. One side says this is the limit, so sound containing elements above it is inaudible; I'm saying that something like 100 kHz can be sensed as a difference in rise time. The problem with the idea that "this is the limit, so we needn't serve the ear a more sudden sound because it won't be audible" is that it's a half truth: because the ear acts as an impedance to the sound energy, a sound that just meets this limit will actually be heard as something below it. You might not actually be hearing that 1/20,000th-second rise time until you have a sound five times that fast doing the work.*
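To put numbers on "more sudden": a standard single-pole engineering estimate relates 10-90% rise time to bandwidth as t_r ≈ 0.35 / BW. Whether such edge differences are audible is exactly the contested point; the sketch below only shows the arithmetic, not the perceptual claim:

```python
def rise_time_us(bandwidth_hz):
    """Classic first-order estimate: 10-90% rise time ~ 0.35 / bandwidth,
    returned in microseconds."""
    return 0.35 / bandwidth_hz * 1e6

print(rise_time_us(20_000))   # 17.5 us with a 20 kHz bandwidth
print(rise_time_us(100_000))  # 3.5 us with 100 kHz: 5x faster edges
```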
I have read people saying things like "sounds with a more sudden nature are perceived as louder", and I can't help but feel that in situations like this the words "sounds like" or "perceived" imply it comes down to psychoacoustics, but it's not that sophisticated. It is due to that waveform actually having more energy, so it becomes an issue of the kinetic transfer of energy.
Just because a sine wave reaches its peak and extends in time enough to amount to an audible frequency like 1 kHz doesn't mean it has filled that time with as much energy as possible, as if doing any more would require a higher peak or a larger window of time (a lower frequency).
A frequency (let's stick with 1 kHz) with the same peak level but a more sudden rise to and fall from that peak packs more energy into the same unit of time than the same frequency at the same peak level with a less sudden start and stop. This is also why fall time contributes to percussive quality. Think of it like a battering ram, but in time, not just space. It's why a square wave will sound more percussive than even a high-attack sawtooth wave: it's not just about the attack.
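The energy point is really the relationship between RMS and peak level: at the same peak, the waveform that spends more of its cycle near the peak carries more power. A quick pure-Python check, sampling one cycle of each unit-peak waveform:

```python
import math

def rms(samples):
    """Root-mean-square level: the square root of average power."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

N = 100_000
phase = [k / N for k in range(N)]          # one cycle, any frequency
sine = [math.sin(2 * math.pi * p) for p in phase]
square = [1.0 if p < 0.5 else -1.0 for p in phase]
sawtooth = [2 * p - 1 for p in phase]      # all three peak at 1.0

# Same unit peak, different power: square ~1.0, sine ~0.707, saw ~0.577.
assert rms(square) > rms(sine) > rms(sawtooth)
```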
This can also explain why squareness matters less for higher frequencies that are still audible as tones: the difference made by 'filling in' the time window with energy becomes more and more marginal.
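That marginality can be quantified from the square wave's Fourier series: about 81% of a square wave's power sits in its fundamental, and for a high-frequency square every remaining harmonic is ultrasonic, leaving little in the audible band to distinguish it from a sine. My arithmetic, not from the thread:

```python
import math

# Fourier series of a unit-peak square wave: odd harmonic n has
# amplitude 4 / (pi * n). A sinusoid's power is amplitude**2 / 2,
# and the square wave's total power is 1 (its RMS equals its peak).
fundamental_fraction = (4 / math.pi) ** 2 / 2
print(round(fundamental_fraction, 3))  # 0.811: ~81% of the power

# For a 10 kHz square wave, everything beyond the fundamental is ultrasonic:
overtone_freqs = [10_000 * n for n in (3, 5, 7, 9)]
print(overtone_freqs)  # [30000, 50000, 70000, 90000]
```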
A high frequency like 100 kHz repeated alone at a high level won't have a tonal register to our ears because its energy is dispersed by being fractioned in time: there is lots of negative space, a lot of time where nothing is happening and no energy is present. This can explain in part why we can't hear these frequencies as tones. In the scenario above, a rise/fall time equivalent to an ultrasonic frequency has the support of a tonally audible frequency, which is another way of saying it has more energy.
*This shouldn't be read as a call to compensate waveforms; it's to say that if you take the time distortion (affected impulse, rise and fall time) away, it will sound as it should. Nor should it be read as applying only to such sudden rise times; the sudden rise time is being used to discuss the limits.
Since this got some backlash:
If we Google "period vs frequency" we are told they are opposites. Really, the period is not different for being 'centered between the cycle', because the frequency's energy only occurs over that center. A helpful visual is one where the wave is shaded, which correctly indicates that there is energy and where it is; it's not the lines used to draw the symbols that are the indicators.
"Frequency refers to how often something happens. Period refers to the time it takes something to happen. Frequency is a rate quantity. Period is a time quantity. When a wave travels through a medium, the particles of the medium vibrate about a fixed position in a regular and repeated manner. The period describes the time it takes for a particle to complete one cycle of vibration. The frequency describes how often particles vibrate - i.e., the number of..." (www.physicsclassroom.com)
While this understanding may be relevant in other contexts, the problem here is obviously that frequency is being abstracted, when we know that frequency can regard causality (i.e., energy), which is best understood temporally.
To go back to the Rehdeko speaker: it isn't really known for its soundstage, extension, or ability to play busy tracks. It even has a resonating cabinet, which I would say lets us locate the speaker and detracts from spatiality. Despite this, there is still good system Q/damping factor.
It's not just about spatial separation, it's about timbral separation. When I talk about single sounds summing into one sound wave, I'm referring to sound awareness theory (I forget exactly what it's called, but it has to do with this). Part of timbre is ADSR, and part of what contributes to our sense of separation of sounds in space is timbral difference.