Re: [linux-audio-dev] Re: Bandlimited interpolation suitable for realtime audio

Subject: Re: [linux-audio-dev] Re: Bandlimited interpolation suitable for realtime audio
From: Julius Smith (jos_AT_w3k.org)
Date: Sun Sep 12 1999 - 16:34:43 EDT


At 06:21 PM 9/9/99 +0300, Juhana Sadeharju wrote:
>Audiality wrote:
>>Don't know if they're using something like that, but E-mu claims to have
>>"8-point interpolation" in their newer samplers. Unfortunately, I'm afraid
>>asking them how they do it wouldn't work! ;-)
>
>Check good papers on the topic:
>
>Dattorro, J., 1997, "Effect Design, Part 1: Reverberator and Other Filters,"
>Journal of the Audio Engineering Society, Vol. 45, No. 9
>
>Dattorro, J., 1997, "Effect Design, Part 2: Delay-line Modulation and Chorus,"
>Journal of the Audio Engineering Society, Vol. 45, No. 10
>
>I have no idea where Part 3 is, but Dattorro is at Stanford...

Part 3 was never published due to the objections of a former employer of Jon's.

>Part 2 says: the Proteus sampling synth and its relatives all employ
>seventh-order interpolation polynomials. They are not exactly "Lagrange"
>though. They use a technique in which a Remez exchange is applied to an
>"ideal" filter response similar to that of Lagrange, but having lower maxima
>in the stopband. This gives the deep notches advantageous in the Lagrange
>approach, but also the superior stopband rejection of a sinc-based
>design. For more information, see the U.S. patent on the fundamental E-mu
>G-chip interpolator; no. 5,111,727, David Rossum.

Yes, this is a nice filter-design method for use in the general bandlimited-interpolation context (which Emu has done for many years). The "neat new idea" here is forcing notches at all multiples of the original sampling rate. Since most audio energy is concentrated at relatively low frequencies, the spectral images created by sampling are concentrated around the multiples of the sampling rate, so notches there are a nice improvement over classical lowpass designs. However, such a design is not optimal. The "right thing" here (if you are into total optimality) is to shape the filter stopband according to the inverse of the average spectral envelope. This level of optimality can be pursued in the context of "constrained convex optimization":

@INPROCEEDINGS{PutnamAndSmithMohonk97,
        AUTHOR = "William Putnam and Julius O. Smith",
        TITLE = "Design of Fractional Delay filters Using Convex Optimization",
        BOOKTITLE = "\Proc\ IEEE Workshop on \Appl\ Signal Processing
                        to Audio and Acoustics, New Paltz, NY",
        PUBLISHER = "{IEEE} Press",
        ADDRESS = "New York",
        MONTH = "Oct.",
        YEAR = 1997
}

(A copy of this paper may be on the Stanford Music 421 website under "handouts".)

In addition to providing better filters than the Emu approach, the convex-optimization design may dodge their patent in this area. (I have not seen the patent, so I don't know how general the claims are.)
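
As a small illustration of the "notches at all multiples of the sampling rate" property, here is a sketch using the simplest Lagrange case, plain linear interpolation (first-order Lagrange). This is not the Emu filter or the convex-optimization design; the function names are made up for the example. Linear interpolation corresponds to a triangular kernel two sample periods wide, and the code numerically integrates its Fourier transform to show that the response vanishes at the sampling rate and its multiples.

```python
import math

def triangle_kernel(t):
    """Linear-interpolation kernel; t is in units of the sample period."""
    return max(0.0, 1.0 - abs(t))

def kernel_response(f, steps=20000):
    """Magnitude of the kernel's Fourier transform at frequency f
    (f in units of the original sampling rate), using midpoint-rule
    integration over the kernel support [-1, 1]."""
    dt = 2.0 / steps
    total = 0.0
    for i in range(steps):
        t = -1.0 + (i + 0.5) * dt
        # The kernel is even, so only the cosine part contributes.
        total += triangle_kernel(t) * math.cos(2.0 * math.pi * f * t) * dt
    return abs(total)

if __name__ == "__main__":
    for f in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(f, kernel_response(f))
```

The response is 1 at DC and essentially zero at f = 1 and f = 2, i.e. exactly where the images of the low-frequency energy sit. Between the notches the rejection of this two-point kernel is poor, which is where higher-order and optimized designs earn their keep.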

>Check also:
>
>T. Laakso et al., 1996, "Splitting the Unit Delay," IEEE Signal Processing
>Magazine, January 1996

Yes, this is THE classic paper on interpolation. However, be prepared: it is heavy on the math.

>>In that case you could upsample once when loading the .wav file
>>using a high-quality converter. This would increase the load time, of course,
>>which could be a problem if you want to support dynamic loading of wave files
>>in response to MIDI "program select" during a musical performance.
>
>There would be pre-upsampled files available, of course. :)

I was under the impression Benno wanted to support any standard wave files provided by the user. After all, that's what "sampling synthesis" is all about. Yes, any .wav files which are provided with the system will of course be resampled in advance to whatever rate is best for the system.

Note that the ideal rate depends on the signal contained (how many harmonics), so ideally the .wav file should be allowed to have any rate. The playback engine takes the rate into account when computing the base step-size through the table. After loading the .wav file into memory and computing the base step-size in samples from the file's rate and the current system output sampling rate (usually 44 kHz, but 22 kHz and others should be supported as well), the base step-size is scaled according to the pitch-shift amount, irrespective of the table's native sampling rate. Thus, supporting any file sampling rate with any output sampling rate should have no effect on the "inner loop" of the playback engine. Therefore, it's basically "free".
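
The step-size bookkeeping described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, pitch shift is expressed in equal-tempered semitones, and a plain 2-point linear interpolator stands in for whatever interpolation filter the engine really uses.

```python
def base_step(file_rate_hz, output_rate_hz):
    """Table samples to advance per output sample at the original pitch."""
    return file_rate_hz / output_rate_hz

def pitch_ratio(semitones):
    """Frequency ratio for a pitch shift in semitones (assumed 12-TET)."""
    return 2.0 ** (semitones / 12.0)

def play(table, file_rate_hz, output_rate_hz, semitones, n_out):
    """Read up to n_out output samples from table.

    The rate conversion and the pitch shift are folded into one step
    value before the loop, so the inner loop is the same for any
    combination of file rate and output rate."""
    step = base_step(file_rate_hz, output_rate_hz) * pitch_ratio(semitones)
    out = []
    pos = 0.0
    for _ in range(n_out):
        i = int(pos)
        if i + 1 >= len(table):
            break  # ran off the end of the table
        frac = pos - i
        # Stand-in interpolator: 2-point linear.
        out.append(table[i] + frac * (table[i + 1] - table[i]))
        pos += step
    return out

if __name__ == "__main__":
    ramp = [0.0, 1.0, 2.0, 3.0]
    print(play(ramp, 22050, 44100, 0, 6))   # step 0.5
    print(play(ramp, 22050, 44100, 12, 6))  # one octave up, step 1.0
```

Only the single multiplication that forms the step depends on the file's rate, which is why arbitrary file rates cost nothing per output sample.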

> If I don't hear any difference between 2-tap and 4-tap interpolation, it doesn't
>matter if I use 2-tap interpolation only.

There you go. Good for you. To check for bugs, listen also to worse and better interpolators, and on a variety of signals. A good "screw-case" signal is the impulse train. It has harmonics all the way out to the Nyquist limit, so any aliasing is highly audible. To make the test even more "acidic", work at low output sampling rates such as 8 kHz. At that rate, you should be able to hear a progression of improvement with increasing filter order. To be really rigorous, you should predict the amount of aliasing you expect from the filtering characteristics, and check in Cool Edit or something to see that you're where you should be, but this is probably not necessary.

After you are pretty sure there are no bugs, go back to 44 kHz and listen to a variety of "reasonably likely signals". Perhaps the closest thing to a "screw case" in the real world is "virtual analog" synth sounds, such as a sawtooth waveform or pulse train (with the pulse width set to minimum), and with the VCF (lowpass filter) opened up to maximum. In the acoustic world, brasses are among the brightest signals on the planet (loud note, recorded along the horn axis).
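
A rough sketch of such a test rig, assuming Lagrange interpolators as the family of "worse and better" filters (2 taps is linear, 4 taps is cubic). The function names and the choice of Lagrange weights are mine, for illustration only.

```python
def impulse_train(period, length):
    """Unit impulses every `period` samples: harmonics out to Nyquist."""
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]

def lagrange_weights(taps, x):
    """Lagrange weights for nodes 0..taps-1, evaluated at position x."""
    weights = []
    for k in range(taps):
        c = 1.0
        for j in range(taps):
            if j != k:
                c *= (x - j) / (k - j)
        weights.append(c)
    return weights

def resample(table, step, taps):
    """Step through table in increments of `step`, interpolating with
    an even number of taps centered on the current interval."""
    half = taps // 2
    out = []
    pos = float(half - 1)  # first position with enough left neighbours
    while int(pos) + half < len(table):
        i = int(pos)
        frac = pos - i
        w = lagrange_weights(taps, frac + half - 1)
        start = i - (half - 1)
        out.append(sum(w[k] * table[start + k] for k in range(taps)))
        pos += step
    return out

if __name__ == "__main__":
    # E.g. treat these as samples at an 8 kHz output rate.
    train = impulse_train(period=40, length=8000)
    for taps in (2, 4, 6, 8):
        shifted = resample(train, step=1.0595, taps=taps)  # about +1 semitone
        print(taps, len(shifted), max(shifted))
```

Writing each `shifted` list to a file (for instance with the standard `wave` module) and listening, or looking at its spectrum, is then the comparison suggested above.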

Julius


This archive was generated by hypermail 2b28 : Fri Mar 10 2000 - 07:27:12 EST