[linux-audio-dev] Random thought on HDR latency compensation


Subject: [linux-audio-dev] Random thought on HDR latency compensation
From: Paul Winkler (slinkp23_AT_yahoo.com)
Date: Wed Apr 19 2000 - 20:49:43 EEST


Hi,

This really has nothing to do with the recent thread on HDR, disk
performance, etc.

Maybe those of you (Paul, Kai, others?) working on projects that
provide hard-disk recording have already solved this problem, but
this just occurred to me (I have a habit of having irrelevant ideas
in the middle of trying to meet a deadline!):

A problem with full-duplex recording is that the inevitable buffers
cause the musician to hear the prerecorded tracks "late" and thus
their performance will end up out of sync when played back. The
larger the buffer, the worse the problem. When I used Multitrack for
this kind of thing, there was a compensation parameter that could be
set by hand to fix this. Setting it involved trial and error and it
had to be re-set if you changed your buffer size. Yuck.

Let me establish what I think "correct" behavior is. With an analog
tape machine, if you monitor off the record heads as is usually
done, you can copy a track just by patching the output of one track
into the input of another track. The amazing thing about analog is
that the new track is *exactly* aligned in time with the existing
track. If the tape machine had perfect fidelity, mixing the
resulting tracks with one track's polarity reversed would result in
silence. We should try to set up our HDR systems so they do the same
thing. It's true that people commonly introduce delays on individual
tracks for musical purposes (e.g. altering the "feel" of a
performance) but for a recorder to be professional-quality this must
be considered an option; there should never be a delay unless you
want one.

How to fix it? If my thinking is correct, only the output buffer
matters here, as it introduces a delay WRT the hypothetical
"correct" time you would hear data from a "perfect" disk and
filesystem (throughput = inf, seek time = 0, thus no buffers are
needed). The input buffer increases how long it takes input data to
reach the disk, but since the musician does not hear that delay, it doesn't
matter (unless you're monitoring your performance "from the disk"
which is probably a bad idea, just like musicians don't monitor from
analog playback heads!)... the first sample from the ADC is still
the first sample that makes it to disk. If there were input latency
but no output latency, playback would still be "correct".

It's occurred to me that the amount of compensation needed for the
output buffer can be calculated automatically very simply, and the
method of applying it should be simple too.

The method is basically to discard an amount of data from the input
equal to the size of the output buffer. If the input buffer has the
same size as the output buffer, this would mean simply throwing away
the first full input buffer before starting to write to disk.
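Here's a rough, untested sketch in C of the "throw away the first
buffer" idea. stdin stands in for the capture stream and stdout for
the track file on disk; a real implementation would of course talk to
the audio driver and the HDR engine instead:

#include <stdio.h>
#include <stdlib.h>

/* Discard the first N bytes of input (one output buffer's worth,
 * given on the command line), then pass the rest through untouched. */
int main(int argc, char **argv)
{
    long to_discard = (argc > 1) ? atol(argv[1]) : 1024;
    char buf[4096];
    size_t got;

    while ((got = fread(buf, 1, sizeof buf, stdin)) > 0) {
        if (to_discard >= (long)got) {
            to_discard -= got;      /* still inside the compensation window */
            continue;
        }
        fwrite(buf + to_discard, 1, got - to_discard, stdout);
        to_discard = 0;
    }
    return 0;
}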

example:
N = output buffer = 1 kb = 1024 bytes per track
b = bytes per sample = 24 bits / 8 = 3
sr = sampling rate = 48000

so to find how "late" the musician hears playback:
N / (b * sr) = 1024 / 144000 = .00711 seconds = 7.11 ms
Not terrible, but not good.
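That's trivial arithmetic, but here it is as a few lines of C in case
anyone wants to plug in their own numbers:

#include <stdio.h>

int main(void)
{
    double N  = 1024.0;    /* output buffer, bytes per track */
    double b  = 3.0;       /* bytes per sample (24 bits / 8) */
    double sr = 48000.0;   /* sampling rate */

    /* how "late" the musician hears playback */
    printf("%.2f ms\n", N / (b * sr) * 1000.0);   /* prints 7.11 */
    return 0;
}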

So if we discard the first 1024 bytes of the musician's performance,
the performance as recorded on disk will be perfectly aligned with
the pre-existing tracks.

...except for the latency introduced by the soundcard ADC and DAC.
Both of these matter (I think?). But this too can be compensated by
throwing away more data from the start of the input stream.
e.g. if we know that the total round-trip latency L of the soundcard
is .001 seconds, that translates to:

amount = L * sr * b
so in this case, the additional amount to discard is
.001 * 48000 * 3 = 144 bytes.

L would probably have to be determined for each soundcard, but you'd
only have to do it once since it should be a constant.

So in this example, our total amount of data to discard would be
1024 + 144 bytes = 1168 bytes.
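Putting the two pieces together (remember L = .001 s is just a number
I picked for the example, not a measurement of any real card):

#include <stdio.h>

int main(void)
{
    long   N  = 1024;      /* output buffer, bytes per track */
    double b  = 3.0;       /* bytes per sample */
    double sr = 48000.0;   /* sampling rate */
    double L  = 0.001;     /* assumed soundcard round-trip latency, s */

    long extra = (long)(L * sr * b + 0.5);     /* 144 bytes */
    long total = N + extra;                    /* 1168 bytes */

    printf("discard %ld bytes = %.2f ms of input\n",
           total, total / (b * sr) * 1000.0);  /* 1168 bytes = 8.11 ms */
    return 0;
}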

It should be noted that this means discarding the first 8.11 ms of
recorded material... no problem, but with a really big buffer, say
512 kb, we'd be discarding about 3.6 *seconds* of material before
anything is written to disk, which might be a problem, especially for
punch-ins.
OTOH, it will take 3.6 seconds before the musician hears anything to
play along with, no? (unless we have some kind of pre-buffering
scheme for fast startup...)

A (better?) solution would be if we had a system that used playlists
and simply marked new tracks as needing to start "early" by the same
amount... but this introduces another problem: you'd have to
remember this if you import data into any application that doesn't
understand the playlists.
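Roughly what I have in mind (all names made up, just to illustrate
storing a negative start offset in the playlist instead of throwing
samples away):

#include <stdio.h>

struct track_entry {
    const char *file;      /* audio file on disk */
    long        offset;    /* bytes; negative means "start this early" */
};

int main(void)
{
    struct track_entry project[] = {
        { "drums.raw",      0    },  /* pre-existing track */
        { "new_vocal.raw", -1168 }   /* new take, shifted early by the
                                        compensation amount from above */
    };
    size_t i;

    for (i = 0; i < sizeof project / sizeof project[0]; i++)
        printf("%-14s offset %ld bytes\n",
               project[i].file, project[i].offset);
    return 0;
}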

Does this seem correct to anyone? Anybody care to test it? (I don't
have time or means to do so right now.)

--PW

................ paul winkler ..................
slinkP arts: music, sound, illustration, design, etc.
A member of ARMS -----> http://www.reacharms.com
or http://www.mp3.com/arms or http://www.amp3.net/arms
personal page ----> http://www.ulster.net/~abigoo


