
Acoustical Large/Small Rooms: A Matter of Statistics

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,657
Likes
240,870
Location
Seattle Area
In acoustics there is an important topic called "Large or Small" rooms. A large room is defined as one where all the reflections are sufficiently random that there are no room modes/resonances in low frequencies. Large performance and concert halls fall in that category. All home listening spaces are considered "small" no matter what we normally call small in our lay terminology.

The above has led to confusion and folklore on Internet forums that the response of home listening spaces is never "statistical" or "random." After many such arguments on forums, I decided to write a long post on Gearslutz forum that finally settled the matter :). This is that post. The original argument was caused by Ethan saying he has reverberation in his large garage and people disputing that. So you will see measurements and references to his data.

The article is theoretical in nature but lays the foundation for why there is such a thing as "transition frequencies" where room modes become unimportant.

----

Theory Behind Statistical Response of Rooms

Sabine has been called the father of modern acoustics with his discovery of room response being a function of its “reverberation time.” It has been said in this thread and elsewhere that such a concept does not apply to “acoustically small” rooms which includes all the spaces we work and play in. Let’s examine the theory behind so called “large room acoustics” and see if it applies to our smaller rooms just the same.

If we excite a room with an impulse (infinite amplitude, zero time) signal, we get this response [Polack 1988]: h(t) = rand(t) * exp(-d*t) for t > 0

In English, this says that our room response is a random number multiplied by an exponentially declining value. The latter should make sense intuitively. We charge the room with energy (the impulse) and then the energy drains away. The rate of drop is determined by the energy still left in the room so it drops quickly at the start and then slows down (think of how fast water drains from a bathtub). The decay rate (“d”) is inversely proportional to the reverberation time of the room: d = 6.9/RT60, which is the value that makes the amplitude fall by 60 dB after RT60 seconds. So that part is easy. The not so easy part to comprehend is that random function: why it is there and whether it plays an important role.
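To make the formula concrete, here is a minimal sketch that generates such a response. It assumes Gaussian noise for rand(t); the sample rate, seed and function names are mine and purely illustrative, not something taken from Polack or REW.

```python
import math
import random

def synthetic_impulse_response(rt60_s, sample_rate=8000, seed=0):
    """Samples of h(t) = rand(t) * exp(-d * t), with d = 6.9 / RT60."""
    rng = random.Random(seed)            # fixed seed so the sketch is repeatable
    d = 6.9 / rt60_s                     # decay rate in 1/seconds
    n = round(rt60_s * sample_rate)      # generate exactly one RT60 worth of samples
    return [rng.gauss(0.0, 1.0) * math.exp(-d * i / sample_rate) for i in range(n)]

def envelope_db(rt60_s, t_s):
    """Level of the exp(-d * t) envelope in dB relative to t = 0."""
    d = 6.9 / rt60_s
    return 20.0 * math.log10(math.exp(-d * t_s))

h = synthetic_impulse_response(rt60_s=1.7)
print(len(h))                            # number of samples generated
print(round(envelope_db(1.7, 1.7), 1))   # lands very close to -60 dB
```

The second print is the sanity check: after one RT60 the envelope is down by essentially 60 dB, which is where the 6.9 comes from.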

A basic theory of signal processing states that the response of any linear system can be determined from how it responds to an impulse signal. If the above represents our impulse response, then convolving it with any source signal would make it sound like it was played there. If you have heard of “convolution reverb,” that is what I just described.
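The operation that applies an impulse response to a source signal is a convolution. A bare-bones sketch, with made-up sample values for both the "dry" signal and the room response (real convolution reverbs use FFT-based methods for speed; this direct form is only meant to show the idea):

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution: y[n] = sum over k of signal[k] * h[n - k]."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h          # each input sample launches a scaled copy of h
    return out

dry = [1.0, 0.0, 0.5]                    # hypothetical source samples
room = [1.0, 0.6, 0.36, 0.216]           # hypothetical decaying impulse response
wet = convolve(dry, room)
print(wet)
```

Every sample of the source launches its own scaled copy of the room response, and the copies add up. Feeding in a single unit impulse returns the room response itself.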

Moorer in his paper, About this Reverberation Business, demonstrates the critical nature of the random function in our room transfer function. He tested using a constant in front of the exponential decay and none sounded convincing as a reverb until he plugged in a white noise by chance and was startled how good it sounded. So clearly we need both components to be present here: the random function modifying an exponentially decaying energy in the room.

How on earth does the orderly process of sound coming out of a speaker and bouncing around the room become random? Let’s drill into that.

Randomness
As you know, the sound that comes out of the speaker bounces around the room and combines with itself and other reflections. The summed value is a “complex” value in that each reflection has amplitude and phase (timing). The phase change is a function of the reflection path and therefore is independent of the phase of other reflections with differing paths. There is an interesting concept in statistics called the Central Limit Theorem (CLT) which says that if you have enough independent events combining, the result becomes random with a “normal” (i.e. Gaussian) distribution. As I just explained, our reflections are independent in nature from others so all that we need to now establish is that we have many of them. If we have a lot then we get to play in our sandbox using statistics rules as opposed to computing every path for every reflection for every point in the room.
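A quick numerical illustration of the CLT argument, as a sketch under simplifying assumptions of my own: every reflection has the same amplitude and a phase drawn uniformly at random, and we only look at the real part of the sum. Real rooms have unequal amplitudes, but the tendency is the same.

```python
import math
import random
import statistics

def summed_reflections(n_reflections, rng):
    """Real part of a sum of unit-amplitude reflections with independent random phases."""
    total = 0.0
    for _ in range(n_reflections):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        total += math.cos(phase)
    return total / math.sqrt(n_reflections)   # normalize so the spread does not grow with n

rng = random.Random(1)
samples = [summed_reflections(200, rng) for _ in range(4000)]
mean = statistics.fmean(samples)
stdev = statistics.pstdev(samples)
within_one_sigma = sum(abs(x - mean) <= stdev for x in samples) / len(samples)
print(round(mean, 2), round(stdev, 2), round(within_one_sigma, 2))
```

For a Gaussian, about 68% of values fall within one standard deviation of the mean, and the printed fraction should come out close to that even though no single reflection is Gaussian.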

Achieving Randomness in Frequency Domain
Let’s start by referring to the work of another father of modern acoustics, Schroeder. In a series of papers in the 1960s, he set out to determine at what point the room modes become so dense that their combination approaches randomness. Out of that came the famous equation for the transition frequency: Fc = 2000 * sqrt(Tr/V). Tr is our RT60 time in seconds. V is volume in cubic meters. Taking Ethan’s garage as the example, Fc = 238 Hz. What this says is that below 238 Hz our room modes do not sufficiently overlap to be considered random but above it they do. In reality there is no single frequency. This is a transition region where modes go from being sparse to being very dense. And density is our friend when it comes to randomness.
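The formula is easy to play with. In this sketch the RT60 is the roughly 1.7 seconds quoted for the garage elsewhere in this post. The volume is NOT given anywhere in the post; 120 cubic meters is my back-calculation from the 238 Hz figure, so treat it as an assumption.

```python
import math

def schroeder_frequency(rt60_s, volume_m3):
    """Transition frequency Fc = 2000 * sqrt(Tr / V), Tr in seconds, V in cubic meters."""
    return 2000.0 * math.sqrt(rt60_s / volume_m3)

# ASSUMPTION: volume back-computed from the quoted 238 Hz; not stated in the post.
GARAGE_VOLUME_M3 = 120.0
GARAGE_RT60_S = 1.7

print(round(schroeder_frequency(GARAGE_RT60_S, GARAGE_VOLUME_M3)))
# A deader room of the same size pushes the transition lower:
print(round(schroeder_frequency(0.4, GARAGE_VOLUME_M3)))
```

Note the direction of the dependence: more absorption (shorter RT60) or more volume both lower the transition frequency.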

How does one get to that formula though? The key to it is a thing called “modal overlap.” Simply put, this metric tells us how many modes are piling on within the bandwidth of a single mode in our frequency response. Once this value approaches 3, there is sufficient “mixing” of modes in frequency domain to qualify as random.

To get there, we first need the bandwidth of each mode. Yes, modes have bandwidth. They are not the single frequency values you see in mode calculators. Schroeder derived the bandwidth (-3 dB points) of each mode to be 2.2/Tr. In other words, the larger the reverberation time, the narrower the bandwidth of each mode is. For Ethan’s garage, the RT60 is roughly 1.7 seconds. Plugging that into this formula gives us the Mode Bandwidth (“delta f”) of 1.3 Hz. In other words each mode on average takes up a little more than 1 Hz in Ethan’s garage.
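As a small worked example of the bandwidth formula, here it is for the garage and, for contrast, for a room with a 0.17 second decay like the "dead" room discussed later in this post. The function name is mine.

```python
def mode_bandwidth_hz(rt60_s):
    """Schroeder's -3 dB modal bandwidth: delta_f = 2.2 / Tr."""
    return 2.2 / rt60_s

for label, rt60 in (("garage", 1.7), ("dead room", 0.17)):
    print(label, round(mode_bandwidth_hz(rt60), 1), "Hz")
```

A tenfold shorter decay gives tenfold wider modes, which is another way of seeing why absorption helps modes blend into each other.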

The next bit we need to figure out is how many modes we have per Hz which is called the Modal Density: Dm = 4*pi*V*F^2/c^3. F is the frequency of interest, c is the speed of sound and V is the volume. In other words, the density of modes is directly proportional to volume which is why “large rooms” are thought to automatically have this random behavior. But also note that Dm depends heavily on frequency as it is squared in the formula. Here is a nice graph that shows this dependence:

i-sHVd4GC-M.png


Let’s plug the Schroeder transition frequency for Ethan’s garage, 238 Hz, into the above formula. It gives us a mode density of 2.2. In other words, we have a bit more than two modes per single Hertz. Yes, a single frequency source tone excites more than one mode in the room at this frequency.
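Here is the modal density formula as a sketch you can run. Two inputs are my assumptions rather than figures from the post: a speed of sound of 343 m/s and a garage volume of 120 cubic meters (back-computed from the 238 Hz transition frequency). With those, 238 Hz comes out a little above two modes per Hz, in line with the figure quoted above; the exact decimal moves with the assumed inputs.

```python
import math

SPEED_OF_SOUND_M_S = 343.0   # ASSUMPTION: air at roughly room temperature
GARAGE_VOLUME_M3 = 120.0     # ASSUMPTION: back-computed, not stated in the post

def modal_density(volume_m3, freq_hz, c=SPEED_OF_SOUND_M_S):
    """Modes per Hz: Dm = 4 * pi * V * F^2 / c^3."""
    return 4.0 * math.pi * volume_m3 * freq_hz ** 2 / c ** 3

for f in (50, 100, 238, 500, 1000):
    print(f, "Hz:", round(modal_density(GARAGE_VOLUME_M3, f), 2), "modes per Hz")
```

Doubling the frequency quadruples the density, which is the squared dependence in action.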

If we now multiply the mode bandwidth by mode density we get our target modal overlap. For Ethan’s garage, this amounts to 2.8. In developing his formula Schroeder assumed a 3:1 overlap which is darn close to our computed 2.8.

Now let’s plug 500 Hz into the formula. The mode overlap skyrockets to 12 because of what I mentioned: modal density is proportional to square of frequency. So if 238 Hz was our entry into the random world of room acoustics, then 500 Hz is clearly way in there. As it turns out, our interest in RT60 measurements is to determine speech intelligibility which also has frequencies of interest above 500 Hz. So this works out well for us!
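Putting bandwidth and density together, a sketch of the overlap calculation. It also works backwards: if you demand an overlap of 3 and solve for frequency, you get a constant in front of sqrt(Tr/V) that lands near the familiar 2000. The speed of sound (343 m/s) and garage volume (120 cubic meters) are my assumptions, as the post does not state them.

```python
import math

SPEED_OF_SOUND_M_S = 343.0   # ASSUMPTION
GARAGE_VOLUME_M3 = 120.0     # ASSUMPTION: back-computed from the 238 Hz figure
GARAGE_RT60_S = 1.7

def modal_overlap(freq_hz, rt60_s, volume_m3, c=SPEED_OF_SOUND_M_S):
    """Mode bandwidth (2.2 / Tr) times modal density (4 * pi * V * F^2 / c^3)."""
    bandwidth_hz = 2.2 / rt60_s
    density_per_hz = 4.0 * math.pi * volume_m3 * freq_hz ** 2 / c ** 3
    return bandwidth_hz * density_per_hz

def transition_constant(target_overlap=3.0, c=SPEED_OF_SOUND_M_S):
    """Solve overlap = target for F; returns K in F = K * sqrt(Tr / V)."""
    return math.sqrt(target_overlap * c ** 3 / (2.2 * 4.0 * math.pi))

print(round(modal_overlap(238, GARAGE_RT60_S, GARAGE_VOLUME_M3), 1))   # close to 3
print(round(modal_overlap(500, GARAGE_RT60_S, GARAGE_VOLUME_M3), 1))   # around 12
print(round(transition_constant()))                                     # in the neighborhood of 2000
```

The constant comes out slightly above 2000 with these inputs; the textbook formula simply rounds it.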

Another way to examine this point is to compute the Mean Free Path [Czuber, 1884]. This is the average distance sound travels between reflections. The formula is amazingly simple: 4V/S with V being the volume again and S the overall surface area (I am assuming fully reflective here). Plugging Ethan’s garage into this we get 2.9 meters or 9.4 feet. Our RT60 was 1.7 seconds meaning that is how long it took for the sound to mostly die off (-60 dB). Computing how many reflections we had on average in that time (speed of sound times RT60, divided by the mean free path) shows an amazing 201 bounces! That is a lot of reflections bouncing around the room and certainly contributes to our randomness and feeling of “real reverb” per Moorer.
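The bounce count is just distance traveled during the decay divided by the mean free path. A sketch, with two loud caveats: the 343 m/s speed of sound is my assumption, and the 8 x 6 x 2.5 meter shoebox is a hypothetical room I picked only because it gives a mean free path near the quoted 2.9 meters. It is not Ethan's actual garage.

```python
SPEED_OF_SOUND_M_S = 343.0   # ASSUMPTION

def mean_free_path_m(volume_m3, surface_m2):
    """Average distance between reflections: 4V / S."""
    return 4.0 * volume_m3 / surface_m2

def reflections_during_decay(rt60_s, mfp_m, c=SPEED_OF_SOUND_M_S):
    """Average number of bounces a sound path makes in one RT60."""
    return c * rt60_s / mfp_m

# HYPOTHETICAL shoebox dimensions in meters, for illustration only.
length, width, height = 8.0, 6.0, 2.5
volume = length * width * height
surface = 2.0 * (length * width + length * height + width * height)

print(round(mean_free_path_m(volume, surface), 2))      # close to 2.9 m
print(round(reflections_during_decay(1.7, 2.9)))        # the bounce count for the quoted figures
```

The second line uses the post's own 1.7 second and 2.9 meter figures and reproduces the roughly 200 bounces.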

Achieving Randomness in Room Response in Time Domain
Frequency and time are two sides of the same coin. So we should be able to perform the same analysis in time domain and show that the reflections start to overlap each other as time proceeds and hence create randomness there. Alas, the work here is not as well developed as in frequency domain. The idea here is to determine the parameter Tmixing which separates the early and late reflections of our room response in time domain. The earliest and most often quoted formula for this says Tmixing in milliseconds is simply equal to SQRT(V), with V in cubic meters [Reichardt, Lehmann, and Polack]. This formula has been debated though, with others saying it is twice the square root of volume or some more complicated formula. People still rally around Polack showing it to be the old concept of square root of volume. But this proof was not mathematical like that of Schroeder but rather based on subjective perceptual concepts of the human hearing system. So I don’t think it makes sense to mix this concept with the objective one in frequency domain.
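For a feel of the numbers, a tiny sketch of the square-root rule and its "twice as high" variant. It assumes the usual convention of milliseconds for time and cubic meters for volume, and the 120 cubic meter garage volume is my back-calculation, not a number from the post.

```python
import math

def mixing_time_ms(volume_m3, factor=1.0):
    """Rule-of-thumb mixing time: factor * sqrt(V), V in cubic meters, result in ms."""
    return factor * math.sqrt(volume_m3)

GARAGE_VOLUME_M3 = 120.0     # ASSUMPTION: back-computed, not stated in the post

print(round(mixing_time_ms(GARAGE_VOLUME_M3), 1), "ms")               # sqrt(V) version
print(round(mixing_time_ms(GARAGE_VOLUME_M3, factor=2.0), 1), "ms")   # doubled variant
```

Either way the answer is a few tens of milliseconds at most for a room this size.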

Further bothersome is the notion that in time domain we don’t care about RT60 but we do in frequency domain. Frequency and time are inverse of each other so if the frequency threshold depends on RT60, the time threshold better too. Turns out we don’t need to settle this debate. As I will show later, and you may already know how, we can separate the early and late reflection regions visually from the time response of our rooms. One way or the other though, time response also becomes random due to overlapping of reflections at some point and when combined with our frequency domain threshold, gives us this region of interest [Jot et al., 1997].

i-TTstFps.png


Note that nothing in this graph talks about small or big rooms. Those two thresholds exist regardless of room dimensions and give us a region of statistically random response.

Analyzing Measurements
OK, enough theory. Let’s see if we can figure this out by looking at the impulse response, in this case that of Ethan’s garage:

i-bJZgbqH-L.png


At time zero we see our “impulse” (in reality the inverse FFT of a log swept sine measurement), and then a region of early reflections followed by late ones. Prior to time zero we see some interesting spikes. Those are harmonic distortion products which show up in negative time due to the swept sine scheme in REW.

Let’s test our measurement for exponential decay. The graph is in dB so an ideal exponential decay would look like a straight declining line. While we can all visually see the linear declining response it would be good to have some computation to go with it that dials out the noise. There are different smoothing schemes with the most well-known being the Schroeder integral. It is a reverse (backward) integral of the squared response: at each point in time it adds up all the energy still to come in the decay.
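For those who want to see the mechanics, a minimal sketch of backward integration on a synthetic noisy decay. This is the textbook idea only; it is not REW's implementation and has none of its noise-floor handling. The sample rate, RT60 and seed are arbitrary choices of mine.

```python
import math
import random

def schroeder_decay_db(impulse_response):
    """Backward-integrated energy decay curve in dB, normalized to 0 dB at t = 0."""
    energy = [x * x for x in impulse_response]
    remaining = 0.0
    curve = [0.0] * len(energy)
    for i in range(len(energy) - 1, -1, -1):   # accumulate from the end toward t = 0
        remaining += energy[i]
        curve[i] = remaining
    total = curve[0]
    return [10.0 * math.log10(c / total) for c in curve if c > 0.0]

# Synthetic response: Gaussian noise under an exponential envelope, RT60 = 0.5 s.
fs, rt60 = 1000, 0.5
rng = random.Random(0)
h = [rng.gauss(0.0, 1.0) * math.exp(-(6.9 / rt60) * i / fs) for i in range(fs)]

edc = schroeder_decay_db(h)
print(round(edc[int(0.25 * fs)], 1))   # near -30 dB at half the RT60, give or take the randomness
```

The individual spikes jump all over the place but the integrated curve is smooth and always heads downward, which is exactly why it is the preferred thing to fit a line to.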

Note how the Schroeder integral suddenly terminates. This is due to smarts in REW where it attempts to establish the point at which noise dominates our total response [Lundeby et al., 1995]. For the purposes of determining the RT60, we want to only see the room response and not the residual noise. But real rooms have noise and as the level of reflections gets low enough the contribution from noise becomes significant. REW attempts to approximate this point but it is just that: an approximation so we still see the noise pushing out the integral from being a pure exponential decay.

The blue line parallel to the integral is our Topt line which is one of the ways REW measures the RT60 time. It is a non-standard technique that attempts the best line match as opposed to the strict T20 and T30 schemes. Simply put, the line gives us the slope of the integral, and the time that slope takes to drop 60 dB is by definition our RT60. The segment used is past the initial section but way above the noise level. REW shows a 1.5 second time for Ethan’s garage here.
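To show how a slope becomes an RT60, here is a sketch of the conventional fixed-range approach (a T30-style least-squares fit between -5 and -35 dB). It is not REW's Topt algorithm, whose range selection I am not reproducing; the decay curve fed in is an idealized straight line with a 1.5 second RT60.

```python
def rt60_from_decay(times_s, levels_db, start_db=-5.0, end_db=-35.0):
    """Least-squares line through the decay between start_db and end_db; returns RT60 in seconds."""
    pts = [(t, lvl) for t, lvl in zip(times_s, levels_db) if end_db <= lvl <= start_db]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_l = sum(lvl for _, lvl in pts) / n
    slope = (sum((t - mean_t) * (lvl - mean_l) for t, lvl in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))   # dB per second, negative for a decay
    return -60.0 / slope                                  # time needed to fall 60 dB

fs = 1000
true_rt60 = 1.5
times = [i / fs for i in range(2 * fs)]
levels = [-60.0 * t / true_rt60 for t in times]           # idealized decay curve in dB

print(round(rt60_from_decay(times, levels), 3))
print(round(rt60_from_decay(times, levels, end_db=-25.0), 3))   # T20-style range
```

On a perfectly straight decay every range gives the same answer. On real data the ranges disagree when the curve bends, which is the motivation for a best-fit scheme.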

I have placed a cursor at ~100 msec. We see that the integral deviates from a straight line prior to that point but we don’t care since that is the domain of early reflections and we don’t use RT60 to assess that part of the room response. So now you see why I said we don’t need to determine Tmixing.

Let’s eliminate the frequencies below 300 Hz and over 700 Hz and we get this graph:


i-Z76mCcg-L.png


Now we see a near perfect match. The rising tail in the Schroeder integral is gone which tells us the dominant contribution in the noise floor was low frequencies. This makes sense of course since low frequencies are the hardest to block and hence bleed into our spaces. We don’t always hear them since our ears are not very sensitive in that region but of course the mic picks them up.

So we have our exponential decay. But how about randomness? Well take a look for yourself. Don’t those spikes look random? If they were not, they would not be dialed out by the integral and the integral itself would start to twist. Don’t believe me? Let’s look at the response of a dead room that was posted here a couple of days ago:

i-tN8WxfZ-L.png


Look at the bottom of the impulse response. We clearly see a pattern there now instead of the smooth random decay in Ethan’s. What on earth caused them? We can get some clues by looking at the waterfall (not one of my favorite measures but is good here):

i-stLGDmT-L.png


I have on purpose set the limits so that we see what is in the noise floor. The waterfall is set to 1.0 seconds and the low frequency rumble continues well past it. So we know that is our problem. I suspect this is some kind of mechanical resonance set off by low frequency tones. Fortunately again, we don’t aim to analyze low frequency modes with RT60. We have a much better tool in the form of frequency response measurements. So the fact that we have distortion here is not of concern. If we had a large space, we may have had the same problem there just the same.

Note how our eyes are able to perform an “autocorrelation” function better than any computer, telling us there is a pattern there that is not random. This is important as much of the advice you see in formal texts assumes you are looking at a single computed RT60 value and not such an expressive graph. Examination of the Schroeder integral quickly tells us if we have reliable and comparable RT60 values or not. The criterion in this context is not whether the room is large or small. If the response looks like the classic impulse response (random spikes under an exponential decay) then we have achieved our goal.

BTW, here is the higher frequency, 1000 Hz octave of the dead room [the graph incorrectly says 1000 kHz]:

i-W4Kw8bV-L.png


We nicely confirm that our problem was that of low frequencies. Moving up in frequency, as Schroeder predicted, gives us the randomness we need and eliminates the low frequency noise as a nice byproduct. Yes, the integral still has some distortion in it but the Topt line still nicely estimates its slope. You can visually confirm the same by comparing it to the reflection decay itself, looking past the random impulses. Around 1000 Hz, the decay does take 0.17 seconds, give or take 10 to 20%. This agrees with the subjective view expressed by that poster that his small room is “dead.”

Do we have real reverb or not?
Did we answer the main question? I think we have. We have a response in Ethan’s garage that above a certain frequency has excellent decay and randomness. The only difference between his room and a larger one would be that frequencies of interest extend lower.

Steady State vs Dynamic Response
An impulse response is a dynamic excitation of the system. Our source signal vanished, leaving us with its “reverb” trail. Much of the text in this area talks in the context of steady state signals. That is what triggers the discussion around critical distance and such. There, the ratio of direct to indirect sound is computed and thresholds are created for where the reverberant field has the most influence. That does not apply to an impulse response which by definition has no direct sound at time > 0. Since RT60 in our modern days is computed from the impulse response, it makes no difference what our critical distance was. We computed our RT60 with zero dependency or opportunity for distortion in that regard.

In real life, we play dynamic content that also has ups and downs. Take speech for example which can easily have swings of 10 to 15 dB between consonants and vowels. Since room response is delayed, the reverb trail from previous peaks will compete more successfully with lower level source signals. As a point of reference, the definition of “way past critical distance” is a 10 dB drop in our direct signal. Per above, we can easily get that during speech. So if we were at critical distance for the peaks, we could easily find ourselves “way past critical distance” just 0.1 second later when the vowel finishes and the consonant starts.
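A back-of-envelope sketch of that argument. It is a crude model of my own and not something from the literature cited here: it assumes the reverb trail left by the loud sound simply decays at 60 dB per RT60 while the direct sound drops instantly by the speech swing, and it ignores reverberation building up from the quieter sound.

```python
def direct_to_reverb_db(level_drop_db, elapsed_s, rt60_s, initial_ratio_db=0.0):
    """Direct-to-reverberant ratio after the source level drops.

    initial_ratio_db = 0 means we started exactly at critical distance.
    """
    reverb_decay_db = 60.0 * elapsed_s / rt60_s      # how far the old trail has fallen
    return initial_ratio_db - level_drop_db + reverb_decay_db

for swing_db in (10.0, 15.0):
    ratio = direct_to_reverb_db(swing_db, elapsed_s=0.1, rt60_s=1.7)
    print(swing_db, "dB swing ->", round(ratio, 1), "dB direct-to-reverb")
```

In a live room the trail has barely decayed after 0.1 second, so most of the drop in the source shows up as a shift in the direct-to-reverb ratio. In a dead room the trail falls away much faster and the effect shrinks.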

All of this said, yes, there is a difference between small and big rooms in that in a big room you are much more likely to sit far away from the source so RT60 plays a very big role there. In smaller rooms we sit close to the source, so to the extent the level of the direct sound at any time is higher than that of the reflections, we don’t get to ignore direct sound contributions. Therefore speaker performance is very important as is what we do with early reflections. None of this however invalidates the use of RT60 to determine how live or dead our room generally is and how it might contribute to speech intelligibility (in more borderline situations) and feeling of spaciousness. It is for these reasons that you see experts such as Dr. Toole using RT60 measurements. It is a smaller leg of the stool but not one that should be dismissed out of hand.

Summary
Small rooms can and do develop statistical characteristics where the response can be shown to be both random (above a certain frequency) and have proper exponential decay. Contrary to stated beliefs, RT60 measurements using modern techniques do not depend on critical distance. They can be visually verified to be correct and will most definitely represent the true decay time of signals in the room. RT60 does not have a big role in small room acoustics but is a useful measure to obtain especially if one is contemplating adding more absorption in the room to deal with other issues.


References

“Frequency-Correlation Functions of Frequency Responses in Rooms,” Schroeder, 1962

“On Frequency Response Curves in Rooms. Comparison of Experimental, Theoretical, and Monte Carlo Results for the Average Frequency Spacing Between Maxima,” Schroeder and Kuttruff, 1961

“Room acoustic transition time based on reflection overlap,” Jeong, Brunskog, and Jacobsen, 2010

“Uncertainties of Measurements in Room Acoustics,” Lundeby, Vigran, and Vorländer, 1995

“Estimation of Modal Decay Parameters from Noisy Response Measurements,” Karjalainen, Antsalo, Mäkivirta, Peltonen, and Välimäki

“About this Reverberation Business,” Moorer, 1978

“New Method of Measuring Reverberation Time,” Schroeder, 1965

“Sound power radiated by sources in diffused field,” Baskind and Polack, 2000

“Master Handbook of Acoustics,” F. Alton Everest, 2009

“La transmission de l'énergie sonore dans les salles,” Polack, 1988

“Analysis and synthesis of room reverberation based on a statistical time-frequency model,” Jot, Cerveau, and Warusfel, 1997
 

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,250
Likes
17,188
Location
Riverview FL
The next bit we need to figure out is how many modes we have per Hz which is called the Modal Density: Dm = 4*pi*V*F^2/c^3.

That's the first Acoustic equation I've noticed that includes c. Cubed, too...
 

RayDunzl

You actually read all that?

Yes, I'm an Outlier.

Mathematics is our window into how physical things work, whether or not those processes actually know anything about math. So, I try, even if I fail (which often leads to try again, later) to understand what the really smart people see.

I'm anxiously waiting for someone to come up with something better than FFT, which I don't like when it looks at an On/Off/On/Off signal and tells me there's an (almost) infinite series of higher frequencies in there. I don't blindly believe its analytical conclusion in that case, so it makes me a bit suspicious about other cases. Not that I don't find it useful.

Back on topic, my room decay is around 300ms. 19 x 14 x 9 rectangle, with the left rear corner open to the galley and wardroom.
 

Phelonious Ponk

Addicted to Fun and Learning
Joined
Feb 26, 2016
Messages
859
Likes
216
" I decided to write a long post on Gearslutz forum that finally settled the matter :)."

--Your optimism is inspiring, sir.

Tim
 

DonH56

Master Contributor
Technical Expert
Forum Donor
Joined
Mar 15, 2016
Messages
7,894
Likes
16,707
Location
Monument, CO
Pretty sure c in that equation is the speed of sound, not of light... Some references use v for velocity and V for volume to help avoid confusion about c.
 

RayDunzl

Pretty sure c in that equation is the speed of sound, not of light... Some references use v for velocity and V for volume to help avoid confusion about c.

Yeah.

If it were speed-o'-light, using feet per second for velocity, then the denominator would be around 9.4719762e+26, and all results would be about 0.000000000000000000000001 or so for audio frequencies.

This link comes right out and calls c speed of sound...
 