
I was just doing my everyday browse through the web and nearly fell asleep until I found this pretty nice dissertation by two graduates at Detmold University here in Germany, which actually confirmed what I have always been thinking, or at least been very suspicious about: the industry's hype!

Unfortunately, the dissertation is in German, but for those who speak and understand it, here is the link:

http://www.hfm-detmold.de/texts/de/hfm/eti/index.html

It basically says that there was no big difference between 24/48 and 24/96 compared to the original analog signal. The two 48 kHz converters they tested even beat the two 96 kHz ones. Besides technical measurements, there was a listening test. Most people chose the 48 kHz signal as being the closest to the original; a couple of people even considered the 48 kHz version to be the original. The two graduates claim that the quality of converters depends on a high SNR and a good analog circuit design. Note that this is just a very short and hence imprecise extract from the graduates' dissertation.

Comments

anonymous Tue, 03/08/2005 - 18:45

Yes, you're talking about basic waveforms. A 1,100 Hz sine wave is a simple wave, but what about a summed wave that is the sum of numerous "pressures" or waves that at any given time can sum to any angle of direction? Connecting the dots just doesn't sound that easy, which is presumably why your voice singing middle C sounds very different from mine singing the same note. It's the summed waveform, or composite wave, that makes every individual sound unique. And that summed wave is a complicated wave with many various curves and angles. That logic makes sense in my head.

anonymous Tue, 03/08/2005 - 19:27

perfectwave wrote: Yes, you're talking about basic waveforms. A 1,100 Hz sine wave is a simple wave, but what about a summed wave that is the sum of numerous "pressures" or waves that at any given time can sum to any angle of direction? Connecting the dots just doesn't sound that easy, which is presumably why your voice singing middle C sounds very different from mine singing the same note. It's the summed waveform, or composite wave, that makes every individual sound unique. And that summed wave is a complicated wave with many various curves and angles. That logic makes sense in my head.

Perfect,

Yes, you are correct. It is all very complicated. Each instrument - each resonating device - has a very complex mathematical formula based on its elasticity that determines the frequencies it will produce. This, in combination with how it is struck, where it is struck, what it is struck with, the acoustical environment, etc., makes the result a very complex waveform. At any point in space we can measure the amplitude of the resultant sound waves over time and find that what is created is an infinite number of frequencies. Because an infinite number of frequencies are present, it would be impossible to represent this waveform with anything short of an infinite number of samples (analog).

The ear, however, cannot hear an infinite number of frequencies. The ear filters the sound waves prior to sending them to the brain. Between the tympanum (ear drum), the ossicles, the oval window, the basilar membrane and the tectorial membrane (all subjects of mechanical filtering) and then the location on the basilar membrane of the inner hair cells, the size and thus reaction of the inner hair cells, and the chemical behavior of the inner hair cells, the sound that eventually gets transmitted to the brain is significantly changed from the sound that can be found at any given point in space.

This filtering removes all frequencies above 20kHz from the initial sound wave. What does this mean? Think, for a moment, about square waves, sawtooth waves and triangle waves. All three contain straight lines, sharp peaks, corners, etc. These waveforms all contain an infinite number of frequencies (odd harmonics only for square and triangle waves, all harmonics for sawtooth waves). The presence of straight lines, sharp corners, and abrupt transitions implies the existence of arbitrarily high frequencies. Once you filter these waveforms down to only their lower frequencies you notice that they no longer contain straight lines. The lines become "wobbly" (triangle and sawtooth waves) or "crooked" (square waves). Sharp corners become "rounded." Horizontal lines have "ripples." These changes reflect the removal of the high frequencies. When you filter the waveform you make it much more "predictable." There is much less "erratic" behavior and the behavior is much more easily identifiable and organized.
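To make that concrete, here is a minimal sketch (assuming Python with numpy and matplotlib, which nothing in this thread requires) that builds a square wave from a finite number of its odd harmonics. Truncating the series is exactly this kind of low-pass filtering: the sharp corners visibly round off and the flat tops ripple.

```python
# A rough sketch, not from the original post: synthesize a square wave from
# a finite number of its odd harmonics on a dense time grid that stands in
# for continuous time, then compare a nearly "ideal" version against one
# band-limited to 20 kHz. The corners round off and the flat tops ripple.
import numpy as np
import matplotlib.pyplot as plt

fs = 1_000_000                     # dense grid, a stand-in for continuous time
f0 = 100                           # fundamental of the square wave, in Hz
t = np.arange(0, 0.02, 1 / fs)     # 20 ms of signal

def bandlimited_square(t, f0, f_cutoff):
    """Sum the Fourier series of a square wave up to f_cutoff."""
    x = np.zeros_like(t)
    k = 1
    while k * f0 <= f_cutoff:
        x += (4 / np.pi) * np.sin(2 * np.pi * k * f0 * t) / k
        k += 2                     # square waves contain odd harmonics only
    return x

plt.plot(t, bandlimited_square(t, f0, 400_000), label="harmonics to 400 kHz")
plt.plot(t, bandlimited_square(t, f0, 20_000), label="band-limited to 20 kHz")
plt.legend()
plt.show()
```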

This is a key principle in the application of the Nyquist theorem to audio. We already know that we do not need to record an infinite number of frequencies for the benefit of the ear. We only need to present to the ear what the ear can actually hear - in frequency, amplitude, phase, and dynamic range. Ignore the latter three of those, for in this discussion we're only concerned with frequency. The ear cannot handle material over 20kHz, so the first thing we can do is filter out all frequencies above 20kHz before we even get started.

That is an important point. If we filter out all of the material above 20kHz we eliminate a lot of the "unpredictability" or random behavior in the waveform and we relegate it to moving in much smoother, rounder shapes (as you zoom in). We can NOW apply the Nyquist theorem to the waveform.

As I stated above, if you take samples of the waveform there is only one possible way (by mathematical law) that a line can be drawn through those points such that it conforms to the "legal" frequency band. You can't use straight lines, for that implies higher frequencies. You can't simply "connect the dots," for that also implies straight lines. You can't have extra "squiggles" in between the dots, because those extra "squiggles" are also indicative of higher frequencies. There is only ONE waveform that can be drawn through the sample points, and that waveform is the original one. If the D/A converter does its job properly this will be the resultant waveform, for any other waveform would contain "illegal" frequencies according to mathematical law.
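As a hedged illustration of that uniqueness claim (the signal and numbers below are mine, not part of the original post), the Whittaker-Shannon interpolation formula reconstructs values between the sample points from sinc functions, and it recovers the original band-limited waveform rather than the straight connect-the-dots line:

```python
# A minimal sketch (the signal and numbers are my own, for illustration):
# reconstruct values *between* sample points with the Whittaker-Shannon
# (sinc) interpolation formula. The result tracks the original band-limited
# waveform, not the straight "connect the dots" line.
import numpy as np

fs = 48000                             # sample rate, in Hz
n = np.arange(64)                      # sample indices
f = 10000                              # a 10 kHz tone, well under Nyquist
x = np.sin(2 * np.pi * f * n / fs)     # the samples ("the dots")

def sinc_reconstruct(x, n, t):
    """Evaluate the band-limited reconstruction at time t (in samples)."""
    return np.sum(x * np.sinc(t - n))

t_half = 31.5                          # halfway between two samples
print(sinc_reconstruct(x, n, t_half))          # reconstructed value
print(np.sin(2 * np.pi * f * t_half / fs))     # true value of the tone
# The two agree closely; they match exactly only with infinitely many samples.
```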

Therefore, the sampling theorem works for audio because we can first filter the material, substantially limiting its potential behavior and allowing it to conform to Nyquist's theorem. Then we can sample it. Once we have done so we have limited the possible waveforms represented by those samples to exactly one: the original. If this is done properly then the result will be physiologically identical as far as the ear is concerned.

I hope this makes sense?

Nika

Cucco Wed, 03/09/2005 - 10:23

Wow!!! I can't believe I missed this one for SOOOO long.

Guys, there is some amazing discussion going on here. One worthy of college or even graduate level courses. But, I must point out a little of the obvious.

First, I believe the main question had to do with the concept of higher sample rates being unnecessary due to the human physiology's inability to interpret them. This is the flaw that I would like to discuss. The mathematics are fascinating and an etched-in-stone fact that none of us can truly dispute, but the physiology, and the effects of our physiology on perception, is, at least to me, a more important aspect of this discussion.

The Cucco Theory:
Perception is more powerful than science.

To say that the human ear cannot hear frequencies above 20kHz or below 20 Hz is a fallacy. In fact, the tympanic membrane will vibrate when presented with a single cycle per second or even (in some individuals' cases) as high as 75-80 thousand cycles per second. (Of course, physical limitations do apply and those with hearing loss or damage will find this not to be the case.)

It is true that the brain ultimately filters this information out, but at what point is beyond the scope of modern science and medicine. It has clearly been determined that the sound captured at the tympanic membrane and that recorded at the basilar membrane are different. And you are correct, much of this is due to physical filtering. But the vibrations of the micro-hairs tied to the nerve cells which interpret sound are a mystery. We know very well what effects sound has on these hairs, but what we don't know is the physical make-up of individuals' anatomy when it comes to those hairs. There is an average that has been determined by examining multiple subjects, but even among those subjects, anomalies exist in such extremes that it is hypothesized that some or even many humans are capable of hearing far in excess of the prescribed "frequency range of humans."

How this frequency is perceived by the brain is a COMPLETELY different matter. It is clearly evident that many humans are capable of physically receiving fast vibrations or even incredibly slow vibrations, but the mind apparently doesn't do anything useful with this information. Or does it?!?

Because of the extreme directional characteristics of higher frequency sound sources, it is speculated that the frequencies that we hear but "make no use of" are instrumental in determining placement of sound sources within our environment.

An interesting experiment would be (a sketch of the brick-wall filtering step follows the list):
Place an individual in a completely dark environment with a blindfold on (just in case). Place a true omni mic (such as the DPA 4061) on either side of the individual's head. Place headphones on him/her with a brick-wall filter at 20 Hz and 20 kHz. Then create sounds within the environment at random locations. You would find that the individual would be able to locate the source of those sounds relatively easily (though not quite as easily as with no headphones at all).
Now drop the upper filter to 10 kHz. Localization becomes far more difficult.
Drop the filter to 1 kHz. Localization becomes almost impossible.
Take it down one more octave and there would be no localization at all - and frequencies in this range have not reached their "mono" range yet. Weird...
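For what it's worth, here is one possible way (purely a sketch, with parameters I chose myself) to realize the brick-wall filtering step offline, as an idealized FFT band-pass. A real-time headphone feed would need a causal approximation instead.

```python
# Purely a sketch with parameters I chose myself: an idealized, offline FFT
# brick-wall band-pass for the filtering step of the experiment above.
import numpy as np

def brickwall_bandpass(x, fs, lo=20.0, hi=20000.0):
    """Zero every FFT bin outside [lo, hi] Hz, then transform back."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)
    X[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(X, n=len(x))

# Usage: band-limit one second of white noise sampled at 96 kHz.
fs = 96000
noise = np.random.randn(fs)
full_band = brickwall_bandpass(noise, fs)            # 20 Hz - 20 kHz condition
narrow = brickwall_bandpass(noise, fs, hi=1000.0)    # the 1 kHz condition
```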

To simply "lob" off frequencies beyond the imposed 20kHz limitation should sound rather unnatural. In nature, we are capable of perceiving these frequencies. Notice, I didn't say "hear" these frequencies. Hearing is a little different than plain ole perception.

So, science tells us a little bit of a contradiction -

*Humans can't hear above 20kHz or below 20 Hz.
*There are physical changes in physiological conditions when the human hearing apparatus is presented with information ranging as much as two octaves above the aforementioned limitation.

Which one is right? Just because you can blow a pitch at 25,000 Hz and I can't tell you that it's happening doesn't mean my brain isn't using this information in a different way than just "hearing" it.

I'd be curious to measure the brainwave patterns of a subject exposed to frequencies in excess of 20kHz versus those measured during silence or a 1kHz sine.

Just some thoughts.

Battle amongst yourselves....
:lol:

J...

anonymous Wed, 03/09/2005 - 10:32

Jeremy,

Yes, there are two different discussions (the same two that occur whenever sample rate issues are brought up): 1. whether Nyquist does or does not actually work, and 2. whether higher frequencies should be present anyway.

I think you and I are in agreement on #1.

With #2, I see it as extremely unlikely that high-frequency vibrations are transmitted to the brain through the ear. It is possible that they get to the brain through other senses, but by the time we get above 4kHz the frequency response of the ear is relegated to "place theory" anyway, as synchronous firings of hair cells stop at around 1kHz and asynchronous firings stop at around 4kHz. Higher than that and we're dealing with location on the basilar membrane, and the amount of membrane available for frequencies north of 20kHz is extremely small and there aren't enough hair cells there anyway.

Having said this, there are times when the body can recognize the presence of low frequencies, and I'll grant that there may be more work to do with high frequencies, per Oohashi. To date, however, I've neither seen nor heard any evidence that substantiates this other than Oohashi, which has been picked apart in the scientific community such that it is under great suspicion at this point.

Nika

Cucco Wed, 03/09/2005 - 10:52

That made me laugh.

Not in a bad or "snyde" way, but in a more geniune manner. I've heard this comic - Mitch Headberg - very strange, but quite funny. He tells a joke:
"My friend and I were walking down the street the other day and he said 'I hear music.' And I said, 'so' that's how we all take it in. I mean, I've tried tasting music and it just didn't work..."

I guess it's funnier to hear it from this guy, he's friggin hilarious!

Anyway - I agree with you almost 100%. But the fact is that there are those with the appropriate number of cells within the very small portion of the membrane dealing with higher frequencies (and of course the membrane itself) which do react to these ultrasonic frequencies. Again, the measurement of a reaction is an indisputable and provable fact. It's the interpretation of that reaction that begs the question: "what do we do with this information now?"

I for one am not ready to discount the potential that this information is useful, especially when there are so many out there that claim to hear differences. (I'm not saying one way or the other whether there is a significant sonic difference, other than my own personal tastes, and for me, I believe I can hear a difference.)

J...

Reggie Wed, 03/09/2005 - 14:01

I've heard this comic - Mitch Hedberg - very strange, but quite funny. He tells a joke:
"My friend and I were walking down the street the other day and he said 'I hear music.' And I said, 'so' that's how we all take it in. I mean, I've tried tasting music and it just didn't work..."

That guy is great; crazy hippy dude. He is kind of like Steven Wright in the sense that it is much funnier to hear/see the joke as performed by the comic.
I have nothing valuable to contribute to this thread, except that 88.2k or 96k sounds better to me than 44.1k. Even if only just a little bit. But I am young and haven't totally destroyed my ears yet. :lol:

anonymous Wed, 03/09/2005 - 15:02

Reggie,

But I am young and haven't totally destroyed my ears yet.

Well, you have time to work on it, then. But it's good - two human beings have confessed that they hear it. I think Jeremy made some nice points here. I would just like to add that 20 years back, when they came up with the audio standard of 44.1 kHz (most sources equate it to a 22.05 kHz Nyquist frequency), it seems someone was willing to leave some slack instead of cutting off at 20 kHz sharp. I'm sure the problem is more complicated (time jitter, bit depth, filters, etc.), but let's get more slack...

Costy.

anonymous Wed, 03/09/2005 - 15:54

Jeremy,

I'd be curious to measure the brainwave patterns of a subject exposed to frequencies in excess of 20kHz versus those measured during silence or a 1kHz sine.

That is a cool question. Also a complicated one. From what I know, the neural cells of the brain keep firing (reacting) even during silence, but it's random. The response to a signal is a switch from randomness to some kind of strange attractor. It's sort of a function, but not a deterministic one; the idea comes from chaos math. It's hard to make an analogy... Let's say you see a coffee mug. It can be red, black, white, broken-handled, dirty or clean. The properties are chaotic, but you know it's a coffee mug. It's similar with sound.
The experiments you have mentioned are going on... We just have to wait for the guys to figure it out. It may take some time, for the chaos math (and neuro-mapping of the brain) is quite a recent thing.
Cheers,

Costy.

anonymous Wed, 03/09/2005 - 16:19

Costy wrote: That is a cool question. Also a complicated one. From what I know, the neural cells of the brain keep firing (reacting) even during silence.

It's actually the ears that produce the random firings. It's called the "spontaneous discharge rate" and this produces an underlying level of "noise" that is fed to the brain. It is quieter in level than the pulsing of the blood through your ears. The way we determine audibility and physiological reaction of inner hair cells is by feeding a stimulus (wave) into the ear and comparing the spontaneous discharge rate for various hair cells with the rate when a signal is presented. All hair cells have different spontaneous discharge rates.
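A toy sketch of that comparison might look like the following (every number here is invented for illustration; real measurements compare per-fiber rates with proper statistics):

```python
# A toy sketch - every number here is invented for illustration: decide
# whether a hair cell "responds" by comparing its spike count under a
# stimulus with its spontaneous discharge rate.
import numpy as np

rng = np.random.default_rng(0)
spontaneous_rate = 50.0     # spikes/s with no stimulus (hypothetical)
driven_rate = 80.0          # spikes/s with a test tone (hypothetical)
window = 1.0                # observation window, in seconds

# Model spike counts in both conditions as Poisson draws.
baseline = rng.poisson(spontaneous_rate * window, size=1000)
stimulus = rng.poisson(driven_rate * window, size=1000)

# Call it a response when the driven count clears the baseline's typical
# spread - a crude stand-in for a proper statistical test.
threshold = baseline.mean() + 2 * baseline.std()
print((stimulus > threshold).mean())    # fraction of trials flagged
```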

The experiments you have mentioned are going on... We just have to wait for the guys to figure it out. It may take some time, for the chaos math (and neuro-mapping of the brain) is quite a recent thing.

It has been done before in the aforementioned Oohashi study (I have two copies of this study) but this study is largely discounted nowadays.

Nika

anonymous Wed, 03/09/2005 - 16:23

Costy wrote: Reggie,

But I am young and haven't totally destroyed my ears yet.

Well, you have time to work on it, then. But it's good - two human beings have confessed that they hear it. I think Jeremy made some nice points here. I would just like to add that 20 years back, when they came up with the audio standard of 44.1 kHz (most sources equate it to a 22.05 kHz Nyquist frequency), it seems someone was willing to leave some slack instead of cutting off at 20 kHz sharp. I'm sure the problem is more complicated (time jitter, bit depth, filters, etc.), but let's get more slack...

Costy.

I don't know any designer nowadays who wouldn't jump at the opportunity for some more slack. The narrow bandwidth available for the transition bands of the anti-aliasing and anti-imaging filters (20kHz to 22.05kHz) necessitates extremely sharp filters (1000 poles or so). Easing this would allow chip designers to spend fewer DSP resources on the filtering in the chips.
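To put a rough number on that (the attenuation spec below is my assumption, not from the post; the 1000-pole figure above reflects the far tighter specs of real oversampled converter filters), a Kaiser-window estimate shows how the filter length grows as the transition band narrows, and how much a wider transition band relaxes it:

```python
# A rough estimate with an assumed attenuation spec; this sketch only shows
# how required FIR length scales with the width of the transition band.
from scipy.signal import kaiserord

fs = 44100.0
stopband_atten_db = 100.0                    # assumed, not from the post

# kaiserord takes the transition width normalized to the Nyquist frequency.
narrow = (22050.0 - 20000.0) / (fs / 2)      # 2.05 kHz transition band
numtaps_44k, _ = kaiserord(stopband_atten_db, narrow)
print(numtaps_44k)                           # ~140 taps for this narrow band

# With a 60 kS/s-style rate, a 20 kHz -> 28 kHz transition band would do:
wide = (28000.0 - 20000.0) / 30000.0         # 8 kHz transition band
numtaps_60k, _ = kaiserord(stopband_atten_db, wide)
print(numtaps_60k)                           # ~50 taps, far cheaper
```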

Having said this, the sacrifice is a greater DSP load throughout the entire remaining structure of the digital audio system. Research over roughly 35 years now generally indicates that an ideal sample rate for audio transparency, design, cost vs. benefit, latency, etc. would be about 60kS/s. Many engineers have endorsed such a format, but it simply won't happen. 96kS/s has its disadvantages, as do the lower rates, none of which can't be overcome.

Nika