Re: [linux-audio-dev] LAAGA API Proposal


Subject: Re: [linux-audio-dev] LAAGA API Proposal
From: Richard Guenther (rguenth_AT_tat.physik.uni-tuebingen.de)
Date: Thu Jun 14 2001 - 17:19:14 EEST


On Thu, 14 Jun 2001, Paul Davis wrote:

> >Busses dont scale. You use Crossbars. Think of a mix node as of a
> >crossbar.
>
> How does a bus not scale, and what's a crossbar?

A crossbar is a graph - a bus is not. A bus has collisions once there
are more than two members, while a crossbar allows multiple active
connections (as long as they are mutually exclusive). But that's
probably off topic. I don't need a bus - you seem to be advocating
"bus-and-nothing-else".
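To make the distinction concrete, here is a minimal sketch of my own
(the matrix layout and names are assumptions, not GLAME or LAAGA API)
of a crossbar as a connection matrix; a mix node is just this with
summing instead of exclusive routing:

     /* Hedged sketch: a crossbar as an N-in x M-out connection matrix. */
     #define N_IN  8
     #define N_OUT 8

     static int patched[N_IN][N_OUT]; /* 1 if input i is routed to output j */

     void crossbar_run(const float *in[N_IN], float *out[N_OUT], int nframes)
     {
          for (int j = 0; j < N_OUT; j++)
               for (int i = 0; i < N_IN; i++)
                    if (patched[i][j])
                         /* exclusive routing: at most one i per output j;
                            a mix node would sum several inputs instead */
                         for (int f = 0; f < nframes; f++)
                              out[j][f] = in[i][f];
     }

     /* A bus, by contrast, is one shared line: every member sees the same
        signal, so more than two members means collisions. */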

> >From a UI point of view I agree with not exposing the extra mix node
> >to the user. But at the scope of the backend there _is_ such extra node
> >(whether its builtin or not).
>
> Ah. Do you agree with it not being present in the client-side API? Or

No.

> is "not exposing it to the user" just a matter of it not showing up in
> the UI? The only thing I care about here is whether or not *clients*
> have to deal with mixing themselves ...

The clients have to - but they don't need to ask the user, nor show
mixing in the UI. So we disagree here.

> >> If LAAGA is not sample-synchronous, then it doesn't really accomplish
> >> the goal it sets out for.
> >
> >You seem to imply that a async == not sample-synchronous??? Sample-
> >syncronous for me means if I process a stereo stream both streams are
> >"aligned" with sample precision. I dont get your point here - again
> >political, "I-never-did-it-this-way-so-its-bad".
>
> I resent that comment. None of my discussion here is "political", and
> I don't have a "Not Invented Here" mentality. I have strong
> convictions about the presence of real problems with GLAME's approach
> when used in a lowlatency/realtime situation. I view this discussion
> as a way of trying to establish whether i just don't understand
> GLAME's model, whether there are problems in my proposed model, how

Sorry, I didn't want to sound insulting - but we are arguing without
knowing the technical facts (well, both of us think we know them, but
surprisingly we disagree about them).

I don't think there are problems with your proposal if you just aim at
being LADSPA for processes. Of course you'd have the same problems as
LADSPA, plus the ones we're arguing about here. Even if people think
LADSPA is fine within a single process (which I can certainly live with -
you get extra low latency there), I don't think any of the advantages of
the LADSPA approach hold once you use processes (you get even worse
latency than with threads inside one process, as in the GLAME model).
Also, I think that if you are at the point where you are bound by IPC
cost, you have other problems...

> important any particular set of problems are, and so forth. I have
> said several times that I find GLAME's model quite elegant, and I mean
> this as a compliment. However, I continue to see difficulties with the
> way it works when its used in a situation where the "audioio"
> component (as you termed it) is faced with the task:
>
> "there are 64 frames of data and space available on this
> h/w interface. please grab the data, drive the graph, fill the
> space, and get back to me. please do all this within 666usecs"
>
> GLAME (like GStreamer) just doesn't seem to come from this kind of
> model, and at the moment, I still have a hard time seeing how it
> can ensure that it can satisfy such a task in a reasonably
> deterministic fashion. I'm not trying to insist that it can't, just
> trying to understand how a system fundamentally designed to support
> async processing of audio can be guaranteed to work when "forced" to
> operate synchronously.

I have said it many times: if you cannot process the data (i.e. run all
dependent process routines) within 666usecs, neither sync nor async
operation will help you. You may lose another 5usecs or so in async
mode (though I don't see those 5usecs for LAAGA - certainly for
LADSPA vs. GLAME) - but that's peanuts. Also, _if_ you happen to be
able to process a frame within 666usecs, the order of execution
for sync and async is _exactly_ the same (you agreed on this
already) - as you are IO driven.

Can we agree on the above? Ok, then read on.

_If_ there is a latency problem (weak kernel, or whatever) somewhere in
the graph, then with _async._ operation you can even continue to fetch
the next buffer from the hw (just give this app RT priority) and perhaps
catch up later - with the sync model you stall your IO if _any_ part
of the graph stalls (i.e. you don't support pipelining).
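A minimal sketch of what I mean (alloc_buffer() and hw_read() are
made-up names here, queueBuffer() is as used further below - this is
illustration, not the proposed API): the capture side just keeps reading
and queueing, so a short downstream stall does not stall the hardware.

     /* Hedged sketch of the async capture side - alloc_buffer() and
        hw_read() are hypothetical; queueBuffer() never blocks. */
     for (;;) {
          buffer_t *buf = alloc_buffer(64);   /* one 64-frame period */
          hw_read(card, buf, 64);             /* blocks on the hw only */
          queueBuffer(capture_port, buf);     /* hand off downstream */
          /* no wait for the rest of the graph here - if some node
             stalls briefly, buffers pile up and it can catch up later */
     }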

> >See above - you dont support independend processing of connections
> >from within one app.
>
> I don't understand what you mean by this. Can you explain?

If two independent streams (say you have two soundcards and two people
doing karaoke - no syncing requirements) feed into one "intermediate"
app (the karaoke filtering app), then this app will get the independent
streams within one callback. This does not scale.

   HW1 -->--\ /-->-- HW1
             app
   HW2 -->--/ \-->-- HW2

(OK, karaoke is highly artificial, but I'm sure someone has a more
useful example; for this particular one you could just duplicate the
app and be done.)
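One illustrative way to keep the two streams independent inside the
single app (just a sketch using the getBuffer()/queueBuffer() names from
below, not the proposed API) is one worker thread per connection, so
HW1 never waits for HW2:

     /* Hedged sketch - one POSIX thread per independent stream;
        the types and apply_karaoke_filter() are hypothetical. */
     struct stream { connection_t *in; port_t *out; };

     static void *karaoke_worker(void *arg)
     {
          struct stream *s = arg;
          for (;;) {
               buffer_t *b = getBuffer(s->in);  /* this stream only */
               apply_karaoke_filter(b);
               queueBuffer(s->out, b);
          }
          return NULL;
     }

     /* main() would pthread_create() one worker for HW1, one for HW2. */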

> > Also imagine the following:
> >
> > sample app 1 ----- stuff --\
> > sample app 2 ----- stuff ----- mixing, audio out
> > hd input --------- stuff --/
> >
> >all three "stuff" are independend and can be processed in parallel.
>
> i agree with you that being able to do this is a desirable goal. but i
> consider this to be a somewhat unsolved problem when applied to the
> low latency realm. the costs of synchronizing multiple threads as they
> seek to interact with common data structures will sometimes exceed the
> execution time of the operations they need to carry out. when you work
> with 64 frames at a time (and perhaps less in the future if a new bus
> architecture lets us), divide-and-conquer is not always a good strategy.

You have synchronization costs, too. You didn't prove that yours are
lower than mine (nor did I prove the opposite). And divide-and-conquer
not always being a good strategy is not a reason for never using it.

> >> once again, this is a free running system. a unit with an output port
> >> generates output at all times, whether they have connections or
> >> not.
> >
> >Sure.
> >
> >> the delay line produces silence before there is anything
> >> connected to it. ergo, there is no correct order. if you run the input
> >
> >So you have implicit (unsynced, for async. operation) zero input all
> >the time? I dont like this.
>
> why not? its precisely what is happening in the physical world when i
> have a bunch of gear connected up ... (well, i wish it was; that
> analog gear isn't close enough to zero for my taste).

I have to think more about this (will delay still be delay then?)

> if there's no input, and the components are driven by input, how can
> the graph run?

If you don't have input (no source) you don't have data to process and
nothing to output - there is no point in "running" the graph then
(of course it will be ready to run as soon as someone connects some
input).

> if the audioio component in your model is running, its
> capture side is feeding buffers to its connections. every part of the
> graph (except leaf nodes) has to execute in order for the playback
> side of the audioio component to execute correctly and on time, right?

yes.

> therefore, no part of the graph can be skipped just because one of its
> ports is not connected to anything.

yes, if it has inputs - "skipped" is misleading in the case of outputs,
since there the node itself drives the nodes connected to its output.

> my understanding of your model is that if there is no connection, then
> presumably getBuffer() will block

getBuffer() is carried out on a connection - if there is no connection,
you can't getBuffer(). Remember, I have only output ports and connections
(which have a source side and a destination side). For output
(queueBuffer()), you won't block at all.

> and thats the end of the RT characteristic of the graph.

No.

> if getBuffer() doesn't block, then you're
> back to my model, in which every component can only be executed when
> all its input is ready (otherwise you will get audio glitches caused
> by missing certain buffers on each pass through the graph).
>
> did i miss something here?

Yes - sigwait() blocks, too - "and that's the end of the RT
characteristic of the graph".

The difference (at the receiving end) between my model and yours is

     sigwait();
     process buffer A & B
     kill();

v.s.

     A = getBuffer(connectionA);
     B = getBuffer(connectionB);
     process buffer A & B
     queueBuffer(portForA, A);
     queueBuffer(portForB, B);

which, if A and B are independent, can be done (with less latency) as

     select();
     X = getBuffer(readyConnection);
     process buffer X
     queueBuffer(portForReadyConnection, X);

So if you want to reduce IPC here, just introduce an explicit notion
of "dependent connections" - then we can reduce the buffer fetching for
those to

     B = getBuffers(group);
     process B[0], B[1]
     queueBuffer(portA, B[0]);
     queueBuffer(portB, B[1]);

but I don't know if it's worth the extra work - the implementation
would be trivial.
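For illustration only (select() over connections and getBuffers() on a
group are assumptions pieced together from the snippets above, not a
defined API), the receiving loop could then mix both cases:

     /* Hedged sketch: dependent connections fetched as one group,
        independent ones handled as soon as they become ready. */
     for (;;) {
          select();                      /* wait for any ready connection */
          if (group_ready(groupAB)) {    /* A and B declared dependent */
               buffer_t **bufs = getBuffers(groupAB);
               process_pair(bufs[0], bufs[1]);
               queueBuffer(portA, bufs[0]);
               queueBuffer(portB, bufs[1]);
          }
          if (ready(connectionC)) {      /* C is independent */
               buffer_t *b = getBuffer(connectionC);
               process_single(b);
               queueBuffer(portC, b);
          }
     }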

Richard.

--
Richard Guenther <richard.guenther_AT_uni-tuebingen.de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
The GLAME Project: http://www.glame.de/


