Measuring the "sound signature" of two different integrated amplifiers.

GXAlan · Oct 3, 2022

-Matt- said:
In one place you lost a further factor of 10. I assume there may be some typos. Would be good if you could check through the op and correct if necessary.

Yeah, in System C, I took the second amplifier out of the tape out and that messed up the volume so I had to turn the knob again because the tape out and SACD out had different voltages.

But System B vs C have a PK value of -93 dB so turning the volume knob didn't generate a meaningful difference in sound. That is also seen visually with the software matched volumes.

restorer-john · Oct 3, 2022

GXAlan said:
That's the thing. This wasn't about testing the gear at specific volts. It was me sitting down to listen to music in my office on one system, then testing an amplifier my brother sent me and listening to the same music and then hearing a difference and being very surprised.

All sampling was done from speaker terminals.

Literally, this is what a real person would do. Attach a CD/SACD player to an integrated amp, and then a set of speakers and then they'd listen to the music at the level they want to.

System A and C are exactly that scenario.
System B is the weird one, but once I had volumes matched at 0.056 dB I dared not change anything since there was a real risk I'd never be able to match it again. It was just how I hooked up the amp when I first tried it.

I get that and I understand what you are trying to convey.

My point is specifically about how the volume (level) controls on amplifiers do a whole lot more in many cases than merely raise or lower the level.

They change the relative and residual noise levels. They change the frequency response and many change the distortion profile. That is why you need to remove the effects of the volume control from the equation. That is why you sometimes can hear differences that may not exist when you level the playing field.

At maximum level a traditional pot is essentially a short between the wiper and the input to the next stage with only the fixed total value of the taper being the load to the source, plus the input impedance of the next stage. An electronic volume control can be wildly different and when varying gains are in play, again, no useful comparisons can be made. You could lower the input level and increase the volume to obtain the same output, and get a different response and perceived sound difference. But it wouldn't be valid.
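One way a conventional pot can change more than level is its output resistance, which varies with wiper position and forms a low-pass filter with whatever capacitance follows it. The sketch below only illustrates that mechanism under assumed values (a 10 kΩ pot, an ideal source, 200 pF of cable plus input capacitance); none of these figures come from the amplifiers in this thread, and `position` is the electrical resistance fraction, not knob rotation.

```python
import math


def parallel(a, b):
    """Resistance of two resistors in parallel, tolerating 0 and infinity."""
    if a == 0.0 or b == 0.0:
        return 0.0
    if math.isinf(a):
        return b
    if math.isinf(b):
        return a
    return a * b / (a + b)


def pot_output(r_total, position, r_source=0.0, r_load=float("inf")):
    """Return (gain, output resistance in ohms) for a pot used as a divider.

    position runs from 0.0 (minimum) to 1.0 (maximum) and is the fraction
    of the track resistance between wiper and ground.
    """
    r_bottom = r_total * position        # wiper to ground
    r_top = r_total * (1.0 - position)   # input to wiper
    r_bottom_loaded = parallel(r_bottom, r_load)
    gain = r_bottom_loaded / (r_source + r_top + r_bottom_loaded)
    r_out = parallel(r_source + r_top, r_bottom_loaded)
    return gain, r_out


def corner_frequency(r_out, capacitance):
    """-3 dB corner of the RC low-pass formed by r_out and a shunt capacitance."""
    if r_out == 0.0:
        return float("inf")
    return 1.0 / (2.0 * math.pi * r_out * capacitance)


# Assumed example values, not measurements from the thread.
R_POT = 10_000.0
C_LOAD = 200e-12

for position in (0.1, 0.5, 0.9, 1.0):
    gain, r_out = pot_output(R_POT, position)
    print(position, round(gain, 3), round(r_out), corner_frequency(r_out, C_LOAD))
```

With an ideal source the output resistance falls to zero at maximum, which matches the "essentially a short" description, and it peaks at a quarter of the track resistance at the electrical midpoint.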

dualazmak · Oct 3, 2022

McFly said:
and still you'd need a disc with 1 kHz at 0 dB on it to calibrate.

The "Super Audio Check CD" from CBS/Sony, which contains very high-precision test signals, would suit these tests and calibrations. If you are interested, please refer to my post here. You can find the PDF booklet of the CD (my own English translation) there. If you are seriously interested, simply PM me with your request.

GXAlan · Oct 3, 2022

restorer-john said:
I get that and I understand what you are trying to convey.

My point is specifically about how the volume (level) controls on amplifiers do a whole lot more in many cases than merely raise or lower the level.

They change the relative and residual noise levels. They change the frequency response and many change the distortion profile. That is why you need to remove the effects of the volume control from the equation. That is why you sometimes can hear differences that may not exist when you level the playing field.

At maximum level a traditional pot is essentially a short between the wiper and the input to the next stage with only the fixed total value of the taper being the load to the source, plus the input impedance of the next stage. An electronic volume control can be wildly different and when varying gains are in play, again, no useful comparisons can be made. You could lower the input level and increase the volume to obtain the same output, and get a different response and perceived sound difference. But it wouldn't be valid.

I agree 100% with what you're saying. This is why I posted this set of measurements.

But I think you're getting too technical and less practical. In System A or B, how do I level the playing field if I'm an ordinary consumer?
I cannot remove the effect of the volume control if I have a disc player, an integrated amp, and a speaker pair.

Maybe this is the Rosetta Stone for why precise measurements look the same yet subjective experience sometimes says things sound different: it's all in the volume control, which doesn't get examined under the microscope the way everything else might.

restorer-john · Oct 3, 2022

Bear in mind, I'm from the anti-ASR position where I know, can hear and can demonstrate that amplifiers DO sound different to one another.

Not all of them and often ones you would not expect to sound different.

But if you are going to go down that path, you need to address all the technical points the so-called experts will throw at you in an attempt to discredit your conclusions.

Sokel · Oct 3, 2022

restorer-john said:
I get that and I understand what you are trying to convey.

My point is specifically about how the volume (level) controls on amplifiers do a whole lot more in many cases than merely raise or lower the level.

They change the relative and residual noise levels. They change the frequency response and many change the distortion profile. That is why you need to remove the effects of the volume control from the equation. That is why you sometimes can hear differences that may not exist when you level the playing field.

At maximum level a traditional pot is essentially a short between the wiper and the input to the next stage with only the fixed total value of the taper being the load to the source, plus the input impedance of the next stage. An electronic volume control can be wildly different and when varying gains are in play, again, no useful comparisons can be made. You could lower the input level and increase the volume to obtain the same output, and get a different response and perceived sound difference. But it wouldn't be valid.

The big question here is: do these results justify the differences some people hear (and that we mock them for) in real-world conditions?
Plain and simple, are they valid?

solderdude · Oct 3, 2022

Also... speaker loads are not the same as resistive loads.

One could, for instance, do the same measurements once with a resistor as a load and once using a speaker and null those. This could potentially give you the influence a speaker has on the amplifier output.
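A null of this kind boils down to subtracting two time-aligned, level-matched captures and expressing what is left relative to the reference. Here is a minimal sketch, assuming the alignment has already been done (DeltaWave handles that step in practice) and using a synthetic sine in place of real recordings. The function names are illustrative, and this plain RMS figure is not the PK Metric.

```python
import math


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def null_depth_db(reference, comparison):
    """Level of (comparison - reference) relative to the reference, in dB.

    Both captures must already be sample-aligned and the same length.
    """
    if len(reference) != len(comparison):
        raise ValueError("captures must have the same number of samples")
    residual = [c - r for r, c in zip(reference, comparison)]
    residual_rms = rms(residual)
    if residual_rms == 0.0:
        return float("-inf")
    return 20.0 * math.log10(residual_rms / rms(reference))


# Synthetic stand-ins: a 1 kHz tone at 48 kHz, and a copy that is 0.1% lower.
SAMPLE_RATE = 48_000
resistor_capture = [math.sin(2.0 * math.pi * 1000.0 * n / SAMPLE_RATE)
                    for n in range(SAMPLE_RATE)]
speaker_capture = [0.999 * v for v in resistor_capture]

print(null_depth_db(resistor_capture, speaker_capture))  # close to -60 dB
```

With real captures the two lists would come from the recordings made into the resistor and into the speaker; anything that differs between them, including noise, ends up in the residual.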

Sokel · Oct 3, 2022

solderdude said:
Also... speaker loads are not the same as resistive loads.

One could, for instance, do the same measurements once with a resistor as a load and once using a speaker and null those. This could potentially give you the influence a speaker has on the amplifier output.

The point of this test is not to be universal.
If this guy had only given the subjective impressions of a comparison we would maybe mock him, but as it seems we would be wrong.

GXAlan · Oct 3, 2022

solderdude said:
Also... speaker loads are not the same as resistive loads.

One could, for instance, do the same measurements once with a resistor as a load and once using a speaker and null those. This could potentially give you the influence a speaker has on the amplifier output.

Right, but you would expect that you have a GREATER chance of seeing differences with real speakers as opposed to a resistive load. The harder it is to show a difference, the stronger the difference must be.

restorer-john · Oct 3, 2022

Sokel said:
The big question here is: do these results justify the differences some people hear (and that we mock them for) in real-world conditions?
Plain and simple, are they valid?

They may well be perfectly valid. And they may not.

But nobody should mock anyone who claims to be able to hear the differences in amplifiers. That is just misguided, ignorant and mean.

GXAlan · Oct 3, 2022

Sokel said:
The point of this test is not to be universal.
If this guy had only given the subjective impressions of a comparison we would maybe mock him, but as it seems we would be wrong.

Thanks, it's exactly that -- it's not universal. It's definitely a very limited scenario that won't be applicable to everyone, but I heard a difference that I was surprised to hear. So, I thought I'd try to see if it's measurable and at least for this case, it seems to be. There are so many variables that it's only fair to call the comparisons "Systems" because any number of variables can be the responsible element. I thought it was crazy that even when matching to 0.056 dB for System A vs. System B, we can measure differences that meet the standard for audibility with a precise test setup that seems to have good consistency.
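To put the 0.056 dB match in perspective, the sketch below computes the level difference between two captures and the best null a plain subtraction could reach if that offset were left uncorrected. It is ordinary dB arithmetic on synthetic data, not a description of what DeltaWave does internally, and the function names are made up for illustration.

```python
import math


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def level_difference_db(a, b):
    """Level of capture b relative to capture a, in dB."""
    return 20.0 * math.log10(rms(b) / rms(a))


def null_limit_from_level_error_db(mismatch_db):
    """Residual level, relative to the signal, left by a pure gain mismatch.

    Subtracting x from g*x leaves (g - 1)*x, so the residual sits at
    20*log10(|g - 1|) relative to x.
    """
    gain = 10.0 ** (mismatch_db / 20.0)
    return 20.0 * math.log10(abs(gain - 1.0))


# Synthetic example: the same tone, once with a 0.056 dB gain offset.
tone = [math.sin(2.0 * math.pi * 1000.0 * n / 48_000) for n in range(4_800)]
offset_tone = [v * 10.0 ** (0.056 / 20.0) for v in tone]

print(level_difference_db(tone, offset_tone))   # about 0.056 dB
print(null_limit_from_level_error_db(0.056))    # about -43.8 dB
```

A 0.056 dB offset is a gain ratio of roughly 1.0065, so on its own it would cap a simple subtraction null at around -43.8 dB. That is why the software correction of residual volume differences matters before reading anything into the difference file.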

To me, this is the first time I've seen (taken) measurements that give some sort of credibility to the "sound signature" of amplifiers at low volumes.

DeltaWave is a great tool that's free. It helps us quantify differences by correcting subtle volume differences and shifts in (Thanks @pkane x2)
The PK Error Metric is a new invention. It gives us a quantitative way, not available before, to judge whether or not two recordings differ (Thanks @pkane)

PK Error Metric | DeltaWave documentation

Version 2.0 of DeltaWave. Use at your own risk!

deltaw.org

The PK Metric is really the key to analysis that didn't exist before (along with good ADCs)

TL;DR:
1. If I measure the same thing over and over, my PK Metric is -120 dB or so.
(Made possible with technology of good ADCs)

2. If I measure the same amp with a short versus complicated signal path, the PK Metric gets worse, but it's still -93 dB
(Can state measured difference is meaningless, made possible with good software)

3. If I measure two "systems" that sound different to me, the PK Metric worsens to -48 dB, which crosses the stated threshold of audibility of -50 dB.
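For a feel of the scale of these figures, the sketch below converts the dB values quoted in the list into plain amplitude ratios. This is only ordinary dB arithmetic; the PK Metric itself involves more processing than a simple ratio, so treat the percentages as an order-of-magnitude guide. The labels are my own shorthand for the three cases.

```python
def db_to_amplitude_ratio(db):
    """Convert a level in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)


# Figures quoted in the list above; the labels are shorthand, not official terms.
FIGURES_DB = {
    "same thing measured repeatedly": -120.0,
    "short vs complicated signal path": -93.0,
    "stated audibility threshold": -50.0,
    "two systems that sound different": -48.0,
}

for label, level_db in FIGURES_DB.items():
    ratio = db_to_amplitude_ratio(level_db)
    print(f"{label}: {level_db} dB -> {ratio:.2e} ({ratio * 100:.4f} %)")

# How far the differing systems sit above the stated threshold.
margin_db = (FIGURES_DB["two systems that sound different"]
             - FIGURES_DB["stated audibility threshold"])
print(margin_db)
```

On this scale -120 dB is one part in a million, -93 dB is about 0.002 %, and -48 dB is about 0.4 %, which is 2 dB past the -50 dB threshold.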

WHY they are different is beyond any of these tests.

@restorer-john has the best comment so far -- volume control knobs are really complex. That really could be the Rosetta Stone between measurements and subjective experience. We all listen at different levels because we have different rooms, different speakers, and different musical preferences. Even the gain needed to hit your preferred volume depends on your recording.

Sokel · Oct 3, 2022

restorer-john said:
They may well be perfectly valid. And they may not.

But nobody should mock anyone who claims to be able to hear the differences in amplifiers. That is just misguided, ignorant and mean.

I'm not amongst the ones who mock. I have zero technical background to justify such behavior, even if I were just mean or wanted to show off and get the approval of the (desired) community.
Just checking my own sanity.

-Matt- · Oct 3, 2022

How about running a comparison of the recordings against the source material to see which ~~amp~~ system is more transparent?

solderdude · Oct 3, 2022

Sokel said:
The point of this test is not to be universal.
If this guy had only given the subjective impressions of a comparison we would maybe mock him, but as it seems we would be wrong.

You simply can't test amps 'universal' as every speaker behaves differently and amp designs may react differently to back EMF or complex loads.
Therein lies the problem.

One could test with a 'specific' complex load that everyone would be using, and even simulate back EMF, but that still does not rule out possible audible differences with other loads.

-Matt- · Oct 3, 2022

Shouldn't nonlinearity of the peak response be captured as distortion in Amir's normal test sequence? (Especially in the multitone test).

Sokel · Oct 3, 2022

solderdude said:
You simply can't test amps 'universal' as every speaker behaves differently and amp designs may react differently to back EMF or complex loads.
Therein lies the problem.

One could test with a 'specific' complex load that everyone would be using, and even simulate back EMF, but that still does not rule out possible audible differences with other loads.

(In case I'm missing the point:) This test demonstrated how two "blue" region amps sound different.
The "why" can be a ton of technical issues, but the specific methodology seems sound.
And in this specific system it proved with measurements that the differences are audible, despite the odds against that.
I have been asking for real-world measurements for a long time, even whole-system measurements, to clear things up.
Perfect conditions are far, far away from real-world listening.

KSTR · Oct 3, 2022

@GXAlan,
very good effort, we certainly need more tests on this level of thoroughness and precision.

Alas, in my experience, one can never achieve the best possible resolution and robustness and, most importantly, a reliable verification of such difference tests with un-synced recording. That is, sample-synced record-while-playback (i.e. one integrated ADC+DAC device, both sections running from a single clock) is required to really expose the fine grain of differences. As good as the Cosmos ADC is in terms of standalone testing, lacking any means of syncing to a source is a big drawback (and the reason I didn't buy one). For the same reason, a standalone source like an SACD player is not the best possible option.

While DeltaWave is quite good at eliminating these dynamic differences from clock mismatch and drift, it can only do so much.
Notably, the difference file is not very clean and thus not directly usable for a verify test, where we "eliminate" the difference by adding (or subtracting) it as a pre-correction to the input signal of one (or the other) amp. Basically, you emulate one amp with the other by pre-conditioning its input signal to force its output to match that of the other amp.

IMHO and in my strong experience, the verify test is the most important test to make sure that the difference that was found is really responsible for the observed changes, and it is absolutely mandatory to technically substantiate the claims. It is also the only way to make such tests 100% repeatable.
It can even be extended to check the influence of linear differences (frequency response changes, magnitude and phase) and non-linear effects like distortion in isolation, separated from each other. Uncorrected linear differences are often the dominant differences, but in the end they are trivial as they can be inverted. Non-linear effects like the compression you seem to have found cannot be inverted. In many tests I've made, the linear differences dominated the results even though they seemed irrelevant at first glance, whereas striking differences in distortion were often actually inaudible.
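The distinction between invertible linear differences and non-invertible non-linear ones can be shown with the simplest possible linear correction, a single best-fit gain. The sketch below is a toy under stated assumptions: the "amps" are synthetic (one is a pure 5% gain, the other a `tanh` soft limiter standing in for compression), and a real test would fit a full frequency response (magnitude and phase) rather than one scalar.

```python
import math


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def best_fit_gain(source, output):
    """Least-squares scalar gain g that minimises |output - g*source|."""
    return (sum(s * o for s, o in zip(source, output))
            / sum(s * s for s in source))


def residual_after_gain_fit_db(source, output):
    """Residual left once the best-fit gain is removed, relative to output."""
    gain = best_fit_gain(source, output)
    residual_rms = rms([o - gain * s for s, o in zip(source, output)])
    if residual_rms == 0.0:
        return float("-inf")
    return 20.0 * math.log10(residual_rms / rms(output))


# 100 full cycles of a 1 kHz tone at 48 kHz, peak amplitude 0.8.
source = [0.8 * math.sin(2.0 * math.pi * 1000.0 * n / 48_000)
          for n in range(4_800)]
linear_amp = [1.05 * s for s in source]           # hypothetical: gain error only
compressing_amp = [math.tanh(s) for s in source]  # hypothetical: soft limiting

print(residual_after_gain_fit_db(source, linear_amp))       # numerical noise floor
print(residual_after_gain_fit_db(source, compressing_amp))  # clearly above it
```

The gain-only difference vanishes down to floating-point noise once corrected, while the compressed output leaves a residual that no gain setting can remove.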

With the sample-synced technique, using the difference signal directly for pre-correction, amp A should measure and sound close to identical to B (and vice versa). Further, one can actually listen to the difference signal, as it is not disturbed by processing artifacts which would give false clues. Finally, a sample-synced process allows for easy time-domain averaging, which is always welcome to reduce noise/hum/buzz and any other components not strongly correlated to the signal.
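The benefit of time-domain averaging is easy to demonstrate on synthetic data: when captures are sample-synced, the signal adds coherently while uncorrelated noise averages down by about 10·log10(N) dB. A minimal sketch, assuming Gaussian noise and perfectly aligned captures; the signal level, noise level and capture count are arbitrary choices for illustration.

```python
import math
import random


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def average_captures(captures):
    """Sample-by-sample mean of several equally long, sample-synced captures."""
    count = len(captures)
    return [sum(column) / count for column in zip(*captures)]


rng = random.Random(0)  # fixed seed so the run is repeatable
NUM_CAPTURES = 16
signal = [0.5 * math.sin(2.0 * math.pi * 1000.0 * n / 48_000)
          for n in range(4_800)]


def noisy_capture():
    """One synthetic capture: the signal plus independent Gaussian noise."""
    return [s + rng.gauss(0.0, 0.01) for s in signal]


captures = [noisy_capture() for _ in range(NUM_CAPTURES)]
averaged = average_captures(captures)

noise_single = rms([c - s for c, s in zip(captures[0], signal)])
noise_averaged = rms([a - s for a, s in zip(averaged, signal)])
improvement_db = 20.0 * math.log10(noise_single / noise_averaged)

# Should land near 10*log10(16), i.e. about 12 dB.
print(improvement_db)
```

Without a shared clock the captures drift against each other, so the signal itself no longer adds coherently and this simple mean stops working.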

EDIT: Link to the outcome of some in-depth difference tests I made:

Making very small distortions/errors audible with music signals. Some examples.

Hi, Over the course of the last weeks I managed to set up and stabilize a procedure that allows to expose the error residual of Null-Tests á la DeltaWave. Actually I'm using DeltaWave for the final stage of display and analysis but it does only level matching fine-tuning here, the rest is done...

www.audiosciencereview.com

restorer-john · Oct 3, 2022

KSTR said:
@GXAlan,
very good effort, we certainly need more tests on this level of thoroughness and precision.

Alas, in my experience, one can never achieve the best possible resolution and robustness and, most importantly, a reliable verification of such difference tests with un-synced recording. That is, sample-synced record-while-playback (i.e. one integrated ADC+DAC device, both sections running from a single clock) is required to really expose the fine grain of differences. As good as the Cosmos ADC is in terms of standalone testing, lacking any means of syncing to a source is a big drawback (and the reason I didn't buy one). For the same reason, a standalone source like an SACD player is not the best possible option.

While DeltaWave is quite good at eliminating these dynamic differences from clock mismatch and drift, it can only do so much.
Notably, the difference file is not very clean and thus not directly usable for a verify test, where we "eliminate" the difference by adding (or subtracting) it as a pre-correction to the input signal of one (or the other) amp. Basically, you emulate one amp with the other by pre-conditioning its input signal to force its output to match that of the other amp.

IMHO and in my strong experience, the verify test is the most important test to make sure that the difference that was found is really responsible for the observed changes, and it is absolutely mandatory to technically substantiate the claims. It is also the only way to make such tests 100% repeatable.
It can even be extended to check the influence of linear differences (frequency response changes, magnitude and phase) and non-linear effects like distortion in isolation, separated from each other. Uncorrected linear differences are often the dominant differences, but in the end they are trivial as they can be inverted. Non-linear effects like the compression you seem to have found cannot be inverted. In many tests I've made, the linear differences dominated the results even though they seemed irrelevant at first glance, whereas striking differences in distortion were often actually inaudible.

With the sample-synced technique, using the difference signal directly for pre-correction, amp A should measure and sound close to identical to B (and vice versa). Further, one can actually listen to the difference signal, as it is not disturbed by processing artifacts which would give false clues. Finally, a sample-synced process allows for easy time-domain averaging, which is always welcome to reduce noise/hum/buzz and any other components not strongly correlated to the signal.

EDIT: Link to the outcome of some in-depth difference tests I made:

Making very small distortions/errors audible with music signals. Some examples.

Hi, Over the course of the last weeks I managed to set up and stabilize a procedure that allows to expose the error residual of Null-Tests á la DeltaWave. Actually I'm using DeltaWave for the final stage of display and analysis but it does only level matching fine-tuning here, the rest is done...

www.audiosciencereview.com

The sync issue is addressed by using just one channel (say L) from each amplifier into the ADC in stereo mode.

dualazmak · Oct 3, 2022

restorer-john said:
The sync issue is addressed by using just one channel (say L) from each amplifier into the ADC in stereo mode.

This would be simple and nice, I assume.

Furthermore, suppose we compare configurations 1 and 2 below with the ADC in stereo mode:

1. amp-A-L into ADC-R, amp-B-L into ADC-L
2. amp-A-L into ADC-L, amp-B-L into ADC-R

That would let us confirm and validate the reliability of the ADC, right?

PierreV · Oct 3, 2022

solderdude said:
You simply can't test amps 'universal' as every speaker behaves differently and amp designs may react differently to back EMF or complex loads.
Therein lies the problem.

In my practical experience, with my own hardware, the following combinations are indistinguishable (I consistently fail blind tests)

Amp H - Speaker G
Amp H - Speaker F
Amp L - Speaker G

the following combination is reliably distinguishable from the others in blind tests (well, I can't run Amp L Speaker G vs Amp L Speaker F but you get the idea)

Amp L - Speaker F

After all the shuffling/testing I tried to investigate the difference a bit, and it turned out that Speaker F is generally accepted to be "harder" to drive because of nastier impedance/phase measurements (per J. A.'s measurements in Stereophile). REW measurements _seem_ to confirm this, but I wouldn't bet a little finger on that because the tests certainly weren't perfect. I also tried a cheap old Amp Y, though not as extensively because I first thought it was useless. It turns out that it can drive Speaker G but kind of chokes on Speaker F at normal listening levels. It does fail on both when you reach unsafe listening levels.

My own conclusion, limited to my personal use, which I am not trying to impose on anyone but which satisfies me in practice, is that matching amps with loads matters much more than any possible differences between amps (Amp H and Amp L measure similarly according to the measurements I could find).
