Re: [LAD] twice as loud

From: Charles Henry <czhenry@email-addr-hidden>
Date: Tue Jul 27 2010 - 23:44:33 EEST

On Tue, Jul 27, 2010 at 12:38 PM, Ralf Mardorf
<ralf.mardorf@email-addr-hidden-dsl.net> wrote:

> It's not impossible. I guess nobody is able to note, let's say, 10 000
> pictures a second as single steps for a movie, of course you and I
> aren't able to note it for just 30 pictures a second. But I don't
> believe in digital audio math, on the niveau we reached until today.
> Btw. I don't have knowledge of this math, I'm just listening and have
> long time experience with doing analog recordings.

Ah! That's just a bandwidth limitation, but it's a rather good
example for the mathematically inclined. Let's say we're just talking
about the set of sounds that are 1 second long or less.

I'd like to show that the human auditory system performs a significant
reduction in the dimensionality of sounds. Start with the set of
signals s(t) on [0,1] that have finite amplitude and finite energy:
|s(t)| is bounded on [0,1], and the integral of s(t)^2 dt over [0,1]
is also finite.

Q: So, how many dimensions do we start out with?
A: Infinitely many. This is one example of a Hilbert space, and the
dimensionality follows from Fourier series. We can represent
functions in this space with a series of orthogonal functions (sines
and cosines), but to represent *all* functions in this space, the
series has to be infinitely long.
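
To make the "infinitely long" part concrete, here is a rough sketch in
Python. It assumes one particular test signal that is not from the
discussion above: a square wave on [0,1] (+1 on the first half, -1 on
the second), whose Fourier coefficients are 4/(pi*k) for odd k. The
function name residual_energy is just an illustrative choice. By
Parseval's relation, each odd harmonic carries energy 8/(pi*k)^2 out of
a total energy of 1, so we can see how much is still missing after a
finite number of terms.

```python
import math

def residual_energy(n_harmonics):
    """Energy of a unit square wave on [0,1] NOT captured by its
    Fourier series truncated at harmonic n_harmonics.

    Assumes s(t) = +1 on [0, 0.5) and -1 on [0.5, 1), so the total
    energy is 1 and only odd harmonics k contribute, each with
    energy (4 / (pi * k))**2 / 2 = 8 / (pi * k)**2.
    """
    captured = sum(8.0 / (math.pi * k) ** 2
                   for k in range(1, n_harmonics + 1, 2))
    return 1.0 - captured

if __name__ == "__main__":
    for n in (1, 9, 99, 999):
        print(n, residual_energy(n))
```

The leftover energy keeps shrinking as terms are added, but for this
signal it never reaches zero with finitely many terms.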

Q: Now suppose we limit the bandwidth to 200 kHz. How many
dimensions do we need?
A: About 400,000. By the Nyquist-Shannon sampling theorem, we need
400,000 samples per second to represent continuous signals
band-limited to 200 kHz, and our signals are 1 second long. Either by
sampling/reconstruction or Fourier series, we can show that our space
is isomorphic to R^400,000.
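
The arithmetic behind that number is just 2 x bandwidth x duration.
A minimal sketch, with a made-up helper name (dimension_count) and the
usual caveat that a signal cannot be strictly time-limited and strictly
band-limited at once, so the count is approximate:

```python
def dimension_count(bandwidth_hz, duration_s):
    """Approximate number of real samples needed to describe a signal
    of the given bandwidth and duration (2 * B * T, from the sampling
    theorem). This is a rule-of-thumb count, not an exact dimension.
    """
    return round(2 * bandwidth_hz * duration_s)

if __name__ == "__main__":
    print(dimension_count(200_000, 1.0))  # the 200 kHz example
    print(dimension_count(20_000, 1.0))   # the ~20 kHz auditory band
```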

So, your own example shows that once the bandwidth is high enough,
we can't tell the difference anymore. The auditory system is
bandwidth limited in this way--typical rule of thumb is about 20 kHz
of bandwidth. We represent these continuous sounds with samples at a
rate more than twice the bandwidth. So typically, we sample at 40 kHz
and above. Real acoustic sounds can have a lot of extra frequencies
above 20 kHz, so we sample at somewhat higher rates to reduce aliasing
of those frequencies onto the auditory band. Beyond that, no further
increases in quality can be obtained by sampling at faster rates.
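
To illustrate the aliasing point, here is a small sketch of where a
pure tone lands after sampling with no anti-aliasing filter in front.
The function name alias_frequency and the example numbers (a 30 kHz
tone, 44.1 kHz and 96 kHz sample rates) are my own choices for
illustration.

```python
def alias_frequency(f_hz, sample_rate_hz):
    """Apparent frequency of a pure tone at f_hz after sampling at
    sample_rate_hz, assuming no anti-aliasing filter is applied.

    The spectrum repeats every sample_rate_hz and mirrors around the
    Nyquist frequency (half the sample rate).
    """
    folded = f_hz % sample_rate_hz
    nyquist = sample_rate_hz / 2
    return sample_rate_hz - folded if folded > nyquist else folded

if __name__ == "__main__":
    # A 30 kHz tone folds down into the audible band at 44.1 kHz...
    print(alias_frequency(30_000, 44_100))
    # ...but stays at 30 kHz, above the auditory band, at 96 kHz.
    print(alias_frequency(30_000, 96_000))
```

At the higher rate the ultrasonic tone stays outside the auditory band,
where it can be filtered out without touching anything we can hear.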

Regardless, it's a gigantic number of dimensions. The essence of
psychology is the study of mental representations. How can each of
those sounds be represented in the mind? The problem becomes: what is
the smallest integer-dimensional space into which we can embed the
space of all sounds? This is not a problem that has been solved, nor
do I prescribe how to take such a measurement.

But finding such a result is the *exact* problem to solve in
psychoacoustic coding. It means reducing a representation that takes
a large number of points to cover all possibilities to one that takes
the fewest.
Received on Wed Jul 28 00:15:04 2010
