Re: [linux-audio-dev] Re: Plug-in API progress?

Subject: Re: [linux-audio-dev] Re: Plug-in API progress?
From: Paul Barton-Davis (pbd_AT_Op.Net)
Date: Thu Sep 23 1999 - 22:26:11 EDT


>...but clustering was just what I had in mind, which would mean that the
>engine, as you defined it, would be distributed over multiple machines.
>
>And I'm not suggesting that the whole cluster should be involved in _low
>latency_ processing - 50-100 ms latency can be perfectly fine in many
>situations, and is certainly a lot nicer than off-line processing from the end
>user POV.
>
>Also, I'm not thinking TCP/IP here. A (Beowulf class) cluster is a pretty
>specialized form of network anyway, and I'd use drivers ported to RTLinux,
>emulating the shared memory style IPC used on "real" supercomputers. That's a
>very big difference, and the _hardware_ isn't really the problem here. People
>are successfully using standard ethernet cards for real time streaming already.

Sorry, this is wrong. I spent several years running those "real"
supercomputers (Sequent, KSRs, nCube, etc.), and it's not true that
there isn't a hardware problem. Real time streaming is a completely
different problem: it's fundamentally bandwidth-related. For
event-driven stuff (and by "event" I refer not to your proposed event
system, but to MIDI and X and timers), the individual message latency
is a fundamental problem. When I worked in the CS department at the
University of Washington, there was a graduate student working on this
exact problem. It's very, very hard with regular networking hardware
to get the latency for a single message low enough to provide a
shared-memory-like environment. There have been some really cool
tricks that have enabled Beowulf to take off, but a 100 ms delay in
response to a MIDI NoteOn is going to make you the laughing stock of
AES :) Beowulf is a really cool and fabulous system, but its success
is entirely in areas where the workload can be divided into reasonably
large portions that don't require very much intermediate
synchronization. Even on the KSR, which had a *much* faster
inter-processor bus than ethernet, people doing heavy numeric
processing whose workloads did not have this characteristic
(i.e. there was a lot of read/write activity on the mythical "shared
memory" that actually translated into invalidations of the local
processor caches) found that their performance sucked. They had to
switch to a NUMA model to get things to really fly, which was hardly
the point of the KSR.

So, I don't think that clusters are viable for real-time audio
generation when you want sub-100 ms event latency. They *are*
fantastic as rendering farms, the way that ILM uses them, for example.
It's easy to imagine some very impressive audio generation taking
place on a cluster, but without any input devices to "disturb" the
computation.

>> GTK is ported to these platforms. But that's beside the point. The UI
>> code *has* to run on the same host if you want low-latency interaction
>> between the interface and the engine. If you don't care about that,
>> then you have to devise a network IPC protocol to relay changes in the
>> UI to the master. This seems silly.
>
>Which would cause most network load and latency; GUI<->X communication, or
>GUI<->engine communication?

That's a good question. I actually don't know. It depends on the
nature of the X stuff. X sucks for some kinds of event streams,
particularly those involving lots of bitmaps going through the server.
But for mouse and key events, it's pretty damn efficient. If the
server already has its pixmaps loaded up, so that knob twiddling
doesn't involve any of the costly stuff, then I wouldn't be surprised
if X communication cost no more than a custom-designed (*and*
debugged!) GUI<->engine protocol.

[ ... the synchronization problem ... ]

>Well, I'm not really thinking about UI only here... What about automation? And
>what about plug-ins that accept weird data like strings and curves? (Video
>folks will like that...) It's still possible to code most "transactions" just
>by accessing the data in safe order, but it quickly starts to get messy.

Strings are just pointers to data: you store the data, then flip the
pointer, which is just as atomic as storing an integer or float.
Likewise for curves.
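
To make that concrete, here's a minimal sketch of the pointer flip in
C. The names are hypothetical and I'm assuming C11 atomics for
clarity; in practice you'd rely on aligned pointer stores being atomic
on the target CPU:

    #include <stdatomic.h>
    #include <stdlib.h>
    #include <string.h>

    /* a plugin parameter holding a string */
    typedef struct {
        _Atomic(char *) text;
    } string_param;

    /* control thread: store the data, then flip the pointer */
    void set_string(string_param *p, const char *s)
    {
        char *copy = strdup(s);
        char *old = atomic_exchange(&p->text, copy);
        free(old);  /* only safe once no reader can still hold it;
                       a real engine would defer this free */
    }

    /* DSP thread: sees either the old string or the new one,
       never a torn mix of the two */
    const char *get_string(string_param *p)
    {
        return atomic_load(&p->text);
    }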

>> >And, what about timing? How do you handle sample accuracy without going to
>> >single sample buffers? Events handle that in a clean and low cost way.
>>
>> No they don't. You can't do sample accuracy without reducing the size
>> of the engine control cycle (the number of samples it generates
>> between checking for a new event in some way) to 1.

>It's not really that simple... (Fortunately!) Sample accuracy doesn't
>mean that every plug-in has to check for a change every single
>sample. It just means that you *can* change *some* properties of
>*some* plug-ins at any single sample position. It can be handled
>something like this:

>process(...**inputs, ...**outputs, ...**events, samples)
>{
>    int current_sample = 0;
>    int current_event = 0;
>    int count;
>    while(current_sample < samples) {
>        /* process until the next event should take effect */
>        count = events[current_event]->time - current_sample;
>        current_sample = events[current_event]->time;
>        while(count--) {
>            process one sample;
>        }
>        /* handle event, advance current_event... */
>        .
>        .
>        .
>    }
>}

How can this work? Let's suppose that there are no events pending. You
just call process(..., 64) to generate the next 64 samples. Someone
sends a MIDI CC message that is supposed to alter how the plugin
works. This is presumably queued in `events', but how is the plugin to
know to look for it? It will be queued up after the terminator event
in the `events' "list" you pass in, so it can't see it during this
pass.

So in this case, you've got a 64-sample event latency. To get this
down to 1, you've got to tell the plugin to generate only a single
sample at a time.

The way that Quasimodo (and SuperCollider and Csound) would handle
this is that the thread handling MIDI input would cause a callback to
run. The callback would fiddle with the parameters of the plugin
(without talking to the plugin, or queuing anything up anywhere), and
if the plugin is running, it will simply use the new value.
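
A sketch of what I mean, with hypothetical names (this is the shape of
the thing, not Quasimodo's actual API):

    /* the parameter block the plugin computes from */
    typedef struct {
        float gain;
    } plugin_t;

    /* runs in the MIDI input thread when a CC arrives:
       no queue, no handshake, just poke the parameter */
    void midi_cc_callback(plugin_t *p, int cc, int value)
    {
        if (cc == 7)                    /* CC 7 = channel volume */
            p->gain = value / 127.0f;
    }

    /* runs in the DSP thread; uses whatever value is there now */
    void process(plugin_t *p, const float *in, float *out, int nframes)
    {
        float g = p->gain;
        for (int i = 0; i < nframes; i++)
            out[i] = in[i] * g;
    }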

Perhaps I'm missing something here, but it seems to me that you're
proposing a polling system with an event latency equal to the number
of samples generated per control cycle (i.e. per call to process()).
This view seems to be reinforced by the following:

>Ok, to put it simply: I build a structured description of what I want the
>plug-in to do, in the form of one or more events in a shared memory buffer. As
>the engine does its event routing for the whole processing net for each turn,
>the events will get processed by the receiving plug-in during the next buffer
>period.

Right, exactly: "during the next buffer period". So your event
latency is bounded by the control cycle size/buffer period.
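
To put numbers on that (assuming 44.1 kHz, since the thread hasn't
fixed a sample rate): a 64-sample control cycle bounds event latency
at 64/44100 ~= 1.45 ms, while a 4096-sample cycle bounds it at
4096/44100 ~= 93 ms, which is already up at the 50-100 ms figure from
the clustering discussion.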

>The only difference between "polling" once for each engine loop and
>just altering the values directly in the DSP code's variables is that
>you get the same real-time deadline for all plug-ins with the
>"polling" system. The resolution (leaving out event time stamps) can
>be one buffer in both cases, but the timing accuracy is independent
>of the latency with the event system.

Directly altering the DSP code's variables doesn't change the
real-time deadline for all plugins. They are running without any
knowledge that their parameters are being played with. They just know
they are supposed to generate X samples and return. Quasimodo doesn't
want plugins to know the real time. They use DSP time, which is
constant during an entire control cycle. That is, the timestamp during
execution of the first plugin is *always guaranteed* to be the same as
that during the execution of the last plugin for any given cycle. It
is extremely rare that a plugin ever needs to know the "real" time.
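
In code, the engine loop looks something like this (my names here, and
a simplification of what Quasimodo actually does):

    /* one control cycle: the DSP timestamp is read once, so every
       plugin in the chain sees exactly the same value */
    typedef void (*process_fn)(void *plugin, unsigned long dsp_time,
                               int nframes);

    void run_cycle(void **plugins, process_fn *procs, int nplugins,
                   unsigned long *dsp_time, int nframes)
    {
        unsigned long now = *dsp_time;  /* frozen for the whole cycle */
        for (int i = 0; i < nplugins; i++)
            procs[i](plugins[i], now, nframes);
        *dsp_time += nframes;           /* advance only between cycles */
    }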

--p

