Subject: Re: [linux-audio-dev] MuCoS, Glame, API Issues
From: Richard Guenther (richard.guenther_AT_student.uni-tuebingen.de)
Date: Fri Mar 10 2000 - 05:09:19 EST
On Wed, 8 Mar 2000, David Olofson wrote:
> On Wed, 08 Mar 2000, Richard Guenther wrote:
> > On Wed, 8 Mar 2000, David Olofson wrote:
> >
> > > On Tue, 07 Mar 2000, Alexander Ehlert wrote:
> > > > I try to summarize the most important features and explain the terms we
> > > > use:
> > > > - a single effect is called filter (ok, I know it's bad...), a filter is
> > > > some kind of plugin that basically sends/receives data through ports
> > > > - ports can be used for any kind of protocol, if you need a new one,
> > > > define one :), currently we've only got a protocol for transferring
> > > > sample buffers. All protocols are just extensions of a basic buffer
> > > > protocol
> > >
> > > * Do you have to skip through linked lists or the like to find ports?
> >
> > No, they're hashed by name. Also they're not allowed to change (i.e.
> > appear/disappear) during the time the plugin is registered.
>
> Ok; that is, the engine can keep direct pointers to ports to use as
> long as the net doesn't change?
Umm, I don't know what you mean by engine. But before launching the
network (i.e. starting processing) speed is not an issue, i.e. a lookup by
name in this "domain" is ok, so there is no need to check for changes (that
can occur at this stage). After launching the network, the network doesn't
change anymore; that is, in the initialization part of the plugin thread
a lookup by name can be done and the resulting pointer cached (as it is
not allowed to change at this point). So, basically yes.
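To illustrate (a minimal sketch with made-up names, not the actual glame
API - lookup by name happens once before launch, only the cached pointer
is used afterwards):

  /* Sketch only: struct port and lookup_port() are stand-ins. */
  #include <stdio.h>
  #include <string.h>

  struct port {
          const char *name;
          int nr_connections;
  };

  /* Stand-in for a filter's (hashed) port table. */
  static struct port ports[] = { { "in", 0 }, { "out", 0 } };

  static struct port *lookup_port(const char *name)
  {
          for (size_t i = 0; i < sizeof(ports)/sizeof(ports[0]); i++)
                  if (strcmp(ports[i].name, name) == 0)
                          return &ports[i];
          return NULL;
  }

  static struct port *out;        /* cached in the plugin's init part */

  int main(void)
  {
          out = lookup_port("out");       /* launch time: lookup by name */
          out->nr_connections++;          /* run time: cached pointer only */
          printf("out connected %d time(s)\n", out->nr_connections);
          return 0;
  }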
> > > * When is a plugin allowed to add/remove a port?
> >
> > Only at registration point (and only add). This may seem a limiting
> > factor, but note that the input/output ports are only hooks for the actual
> > connections and that one port can be connected multiple times - so usually
> > filters have only one input and one output port.
>
> Ok.
>
> > > * Can this be done in a separate thread without stalling the engine
> > > thread until the whole operation has been performed?
> >
> > It's thread safe, as at the time you are allowed to add stuff the filter
> > is not allowed to be registered (i.e. possibly in use). What's this
> > "engine" thread?
>
> In low latency real time systems, the only reliable (and simple
> enough to actually realize) way is to run many plugins together in
> one or a few threads. The most common approach is to have a single
> engine thread that calls plugins one by one in a strictly controlled
> order, as defined by the net description. This means less thread
> switching overhead and no plugin/plugin CPU time interference.
Ok, so it's kind of "software scheduling". Obviously if you have n
processors available in your system you would have n engine threads? Well,
the scheduling problem will be quite complex in this case, considering that
the network is a directed (acyclic?) graph. It's NP-hard, isn't it?
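Something like this, then, for the single-thread case (a sketch only;
names are made up and the schedule is hardcoded instead of computed from
the net):

  /* One engine thread calls the filters in a fixed, precomputed order,
   * once per buffer period. */
  #include <stdio.h>

  typedef void (*process_fn)(void);

  static void src_process(void)  { printf("source\n"); }
  static void fx_process(void)   { printf("effect\n"); }
  static void sink_process(void) { printf("sink\n");   }

  /* Result of an (offline) topological sort of the filter DAG. */
  static process_fn schedule[] = { src_process, fx_process, sink_process };

  int main(void)
  {
          /* The "engine thread": one pass over the schedule per period. */
          for (int period = 0; period < 4; period++)
                  for (size_t i = 0; i < sizeof(schedule)/sizeof(schedule[0]); i++)
                          schedule[i]();
          return 0;
  }

With n engine threads the schedule would have to be partitioned across the
threads, which is where it gets hard.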
> > > > - filters can be combined to filternetworks, every filter runs as a new
> > > > thread(we're using pthreads)
> > >
> > > Although this allows a great deal of flexibility, it's definitely a
> > > no-no for low latency audio in anything but RTOS environments like
> > > QNX or RTLinux. There is too much overhead involved, there is a risk
> > > of your engine being stalled by some kernel operation (even on
> > > lowlatency kernels!), since each switch involves the kernel directly,
> > > there are problems with scheduling multiple RT threads in low latency
> > > settings (no preemptive/timesharing switching among SCHED_FIFO
> > > threads), and some other things...
> >
> > RT threads are bad.
>
> No. They are *required* for certain tasks.
Ok, let's say for glame they're not an issue (right now) :)
> > They're messy in case you get more than one of them.
>
> It's just a different kind of programming. Not always trivial, but
> you can't avoid it and still get the job done, unfortunately.
>
> > As for the latency, on a P100 I get a latency caused by the thread/socket
> > overhead (on a modestly loaded system) of ~0.06ms using 3 threads, ~0.15ms
> > using 10 threads; with 2000 threads the latency (measured by "pinging"
> > through a linear loop of connected filters) goes up to ~10ms - this is not
> > too bad (it's a P100!!). A quick test on a dual PIII-550 shows (it has a
> > load of 1 at the moment) ~5ms using 256 threads (the 3 threads case is
> > ~0.05ms) - so the latency is not strictly CPU bound. An IRIX box with 4
> > CPUs shows ~0.2ms for 3 threads and ~8ms for 80 threads (cannot go up with
> > the # of threads, seems to be limited per user).
>
> We're talking about *peak* latency here. Drop-outs are not acceptable
> in a real time audio system, and the worst case scheduling latency of
> the kernel can appear as frequently as once per kernel call, if you have
> too much bad luck. (Which happens all the time in a real system.)
Well, to prevent dropouts we do buffering. Of course, the maximum buffer
size determines the "peak" latency in the net.
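Just to put example numbers on that (44.1kHz assumed, figures are
illustrative only - the latency added by buffering is simply buffer size
over sample rate):

  #include <stdio.h>

  int main(void)
  {
          const double rate = 44100.0;            /* samples per second */
          const int frames[] = { 64, 256, 1024 }; /* example buffer sizes */

          for (int i = 0; i < 3; i++)
                  printf("%4d frames -> %5.2f ms buffering latency\n",
                         frames[i], frames[i] / rate * 1000.0);
          return 0;
  }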
> > > > - pointers to buffer heads are sent via unix domain sockets. To
> > > > avoid memcpy's we do reference counting on buffers. filters can
> > > > make buffers private, if this is done by two filters a copy of the
> > > > buffer is made. Otherwise zero copy read-modify-write buffer
> > > > modification is possible on private buffers, that means processing
> > > > is done in place. After the buffer is processed it can be queued into
> > > > the next pipe.
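A rough sketch of the scheme described above (names and struct layout are
made up, not the real glame buffer API):

  #include <stdlib.h>
  #include <string.h>

  struct buffer {
          int refcnt;
          size_t size;            /* number of samples */
          float *data;
  };

  static struct buffer *buf_ref(struct buffer *b)
  {
          b->refcnt++;
          return b;
  }

  static void buf_unref(struct buffer *b)
  {
          if (--b->refcnt == 0) {
                  free(b->data);
                  free(b);
          }
  }

  /* Sole holder: process in place. Shared: copy first (this is where the
   * "two filters make it private" copy happens). */
  static struct buffer *buf_make_private(struct buffer *b)
  {
          struct buffer *priv;

          if (b->refcnt == 1)
                  return b;
          priv = malloc(sizeof(*priv));
          priv->refcnt = 1;
          priv->size = b->size;
          priv->data = malloc(b->size * sizeof(float));
          memcpy(priv->data, b->data, b->size * sizeof(float));
          buf_unref(b);
          return priv;
  }

  int main(void)
  {
          struct buffer *b = malloc(sizeof(*b));
          struct buffer *shared;

          b->refcnt = 1;
          b->size = 4;
          b->data = calloc(b->size, sizeof(float));

          shared = buf_ref(b);            /* a second filter holds it too */
          b = buf_make_private(b);        /* refcnt > 1, so this copies */
          b->data[0] = 1.0f;              /* modify the private copy in place */
          buf_unref(b);
          buf_unref(shared);              /* the other filter drops its ref */
          return 0;
  }

With only one holder the refcnt check makes the whole thing a zero-copy
read-modify-write, as described.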
> > >
> > > This sounds like quite some overhead to me... Also, there shouldn't
> > > be any kernel calls (and preferably no function calls at all)
> > > involved with the execution of a single plugin.
> >
> > no function calls???
>
> One function call to run the plugin. No function calls for getting
> data or events. No function calls for transmitting data or events.
>
> It's not really a big difference in the normal case, but obviously,
> it can only work as long as you don't need to ping-pong between two
> or more plugins within the time of a single buffer/call time frame.
>
> > you cannot do this without threads!?
>
> Well, one engine thread is quite handy ;-), but more is just extra
> overhead and complexity. (Especially in the timing aspect.)
>
> > I think
> > you are vastly overestimating the overhead of syscalls, function calls
> > and thread switching time! For the overhead of this see the above
> > latency measures.
>
> It's not the average overhead. It's the fact that you may be stalled
> by some other part of the system (kernel) holding a spinlock - and
> miss your deadline, even though there is unused CPU time.
Of course, but the kernel does not provide hard RT either, so you may get
stalled for too long anyway, whether you are RT or not.
> > > Anyway, sorry for the harsh reply - and for being incomprehensible,
> > > if that's the case. That's just because we've been discussing all
> > > forms of plugin API issues for some months around here, and
> > > personally, I'm very interested in hard real time issues, as that is
> > ^^^^^^^^^^^^^^
> > Linux is not hard real time. It's soft real time.
>
> Standard Linux, yes. With the lowlatency patch (and hopefully 2.4),
> this is no longer the case.
Err? Linux is not hard realtime - even with the lowlatency patch. Hard
realtime is about ensuring a maximum latency for everything. (Of course, if
you set this latency to 1 hour, Linux is hard realtime ;)) Linux with the
lowlatency patch can only guarantee that, for instance, you get <10us latency
with a high probability - this is not hard realtime.
> > And soft is relative here, too.
>
> Soft vs. hard can never be relative. A missed deadline is a missed
> deadline, period. (Ok, allow for hardware failures and the like.
> There is no such thing as a true guarantee in the real world.)
>
> > Audio is not hard real time stuff either
>
> It is indeed. Drop-outs during live concerts may cost you your job
> and reputation. (That's why most live systems are still analog - not
> even dedicated digital systems are reliable enough!)
>
> > - until you want
> > ~1 sample latency from audio input to output. There is buffering!
>
> Buffering has nothing to do with it! The definition of a hard real
> time system is that it can guarantee a correct response within a
> certain maximum time frame. That time frame might well be 2 hours,
> but it's still a hard real time system.
Hehe :) There's the 2 hours... - make it, say, 1ms and Linux will not meet
the criteria for hard realtime.
> > > the main reason I got into this in the first place. (Windows and Mac
> > > systems are incapable of useful real time processing without
> > > dedicated hardware...)
> > >
> > > Your project seems very interesting, though. :-)
> >
> > Thanx. The main concept of glame is to have very simple and generic APIs
> > where writing a special foo filter is easy (just 10-20 lines of code
> > overhead for a usual effect). Threads do allow this. If you don't like
> > threads you can "schedule" manually by using either MIT pthreads (software
> > thread library) or schedule via setjmp/longjmp (ugh!).
>
> The callback model isn't more complicated, it's just a slightly
> different way of thinking. Usually, the inner loops end up
> identical, for performance reasons...
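A callback-style filter would presumably look something like this (a
sketch with made-up names, not glame's or anyone's real API - the engine
fills in the buffers and makes exactly one call per plugin per period):

  #include <stdio.h>

  struct io {
          const float *in;        /* filled in by the engine before the call */
          float *out;             /* written here, picked up by the engine */
          size_t frames;
  };

  /* The engine calls this once per buffer period, in scheduled order.
   * The inner loop is the same as it would be in a threaded filter. */
  static void gain_process(struct io *io, float gain)
  {
          for (size_t i = 0; i < io->frames; i++)
                  io->out[i] = io->in[i] * gain;
  }

  int main(void)
  {
          float in[8] = { 1, 2, 3, 4, 5, 6, 7, 8 }, out[8];
          struct io io = { in, out, 8 };

          gain_process(&io, 0.5f);        /* what the engine thread would do */
          printf("out[0] = %.2f\n", io.out[0]);
          return 0;
  }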