RE: [linux-audio-dev] LADMEA revisited (was: LAAGA and supporting programs)


Subject: RE: [linux-audio-dev] LADMEA revisited (was: LAAGA and supporting programs)
From: Richard W.E. Furse (richard_AT_muse.demon.co.uk)
Date: Tue Oct 02 2001 - 22:21:08 EEST


> -----Original Message-----
> From: owner-linux-audio-dev_AT_music.columbia.edu
> [mailto:owner-linux-audio-dev_AT_music.columbia.edu]On Behalf Of Paul Davis
> Sent: 30 September 2001 19:02
> To: linux-audio-dev_AT_music.columbia.edu
> Subject: Re: [linux-audio-dev] LADMEA revisited (was: LAAGA and
> supporting programs)
>
>
> First of all, I'd like to thank Richard for his insightful and very
> useful email. The LADMEA header file didn't make clear to me what his
> concerns were; this email has.
[...]

Oops, sorry. I'm very tight for time (it's one of my working years). The
header file is intended as a condensed, functional form rather than an
explanation. I am a mathematician after all! I'd rather hoped that folks who
had got used to my style from LADSPA might have the persistence to read and
absorb. But then it is 50% longer, I suppose...

> >The essential idea is to extend from approaches like LAAGA/JACK,
> >GStreamer or aRts where there is an all-encompassing audio
> >infrastructure (an "exchange") into which various client applications
> >are slotted. In the LADMEA world, "clients" and "exchanges" communicate
> >through the lightweight LADMEA interface. Clients can load exchanges,
> >exchanges can load clients, some other program can load both (when in
> >library form).
>
> This seems like a noble goal. It suffers, IMHO, from one major
> flaw. There is a level of "abstraction" and "API independence" that it
> is not worth moving beyond.

Hmmm. This again is my fault - abstraction with such things vanishes in the
implementation. I think folks will find that an example or three will make
things very intuitive very quickly (even if just on a copy/paste level). I
ought to try to find some time to finish the SDK.

> Let's just briefly review the goals here: we want streaming media
> (audio, MIDI, video, whatever) to be able to talk to each other. we
> don't care about the internal processing networks that a program
> uses. we care only that program A can route data to program B. unix
> pipes, the canonical method of doing this, suffer because they don't
> have enough bandwidth and involve needless data copying (with
> resulting cache pollution); shared memory, which solves these
> problems, cannot be used without another mechanism to serve as a
> thread wakeup signal. ergo, we need a new mechanism.

Yep, but SHM isn't the only answer. Sockets provide a particularly useful
implementation.

> Designing a system into which JACK, GStreamer and aRts can all fit
> seems great, but I immediately find myself asking "is it worth it?"
> For example: GStreamer is self-avowedly an internal application design
> structure, not an inter-application design structure, and so it really
> doesn't apply to the kinds of problems that are being addressed
> here.

I'm not sure I agree here - I think the GStreamer team has a better grasp of
the real issues than most. And by the way, why can't Ardour be turned
into a GStreamer plugin? The GStreamer team would probably appreciate the
changes required to make that happen.

> aRts and JACK ostensibly exist to solve mostly the same problem
> set. The claim is that aRts does not solve it and cannot solve it
> without a substantial rewrite. Although there is a history of APIs
> written to cover a variety of similar-goaled, but differently implemented
> APIs, it's not something that has tended to interest me very much. It's
> like the situation with pthreads and the wrappers for it: why bother?
> pthreads is a well designed, typically well implemented API and there
> isn't much reason to use a wrapper for it that i can see. but that's
> just me :)

I don't follow this. Whose claim is this? I'm not sure what the pthread
analogue is here.

> >This would mean that a newly written audio synth written using LADMEA
> >could immediately be slotted into LAAGA, GStreamer or aRts (assuming
> >LADMEA support) or use a LADMEA interface to ALSA. Correspondingly, if
> >someone writes a new framework for audio streaming over ATM (perhaps
> >using compression on some channels as bandwidth requires) then this can
> >immediately be used with the client applications such as recorders,
> >players, multitracks etc.
>
> The problem with these claims is that they are equally true of JACK
> all by itself. These goals/abilities don't differentiate LADMEA and
> JACK in any way. This is all true of the remaining claims in the
> paragraph.

Umm, sort of. All is fine as long as JACK is only used with PCM audio,
because it's unlikely (!?) that we'll be using too many different
compression schemes there.

> >1. How can a client transmit a data type to another (potentially remote)
> >client across an exchange that has never heard of the data type?
> >2. How can a client know what existing channels it can use?
> >3. If clients offer variants of data types (e.g. 16bit unsigned
> >little-endian PCM, float l-e PCM, double l-e PCM, MP3), how can the
> >exchange persuade the clients to agree? If they cannot, how can the
> >exchange go about inserting a codec? Note again that the exchange may
> >never have heard of the data types involved.
> >4. How does an exchange know how much bandwidth it is likely to need to
> >communicate data (e.g. over a network)? How latent can data be? How much
> >jitter may be tolerated on a network? Note again...
> >5. How can an exchange know when a client has failed?
> >6. If sources go out of sync (e.g. audio streaming over an ISDN link
> >across the Atlantic drifts in sync relative to local overdubs) then how
> >does the exchange know this is happening and deal with it. (Ok, most
> >exchanges won't, but they could if they wanted.)
> >7. Consider a case where none of the clients in a graph require live
> >operation, e.g. a MIDI player is playing a 10min piece where the MIDI is
> >sent to a soft synth which generates PCM audio that is then passed to a
> >sound file writer. Say this graph can be run live and uses 20% CPU time
> >on a single-CPU system. An intelligent exchange should be able to work
> >out that the graph can instead be run "offline" in 2mins as there is no
> >requirement to synthesise in real time. The same thing applies for
> >subgraphs, and exchanges should be allowed access to enough information
> >to cache output from subgraphs. Again, many exchanges wouldn't wish to
> >support such facilities - but there ought to be a way to make these
> >things options.
> >7. Consider a case where none of the clients in a graph require live
> >operation, e.g. a MIDI player is playing a 10min piece where the MIDI is
> >sent to a soft synth which generates PCM audio that is then passed to a
> >sound file writer. Say this graph can be run live and uses 20%
> CPU time on a
> >single CPU system. An intelligent exchange should be able to
> work out that
> >the graph instead can be run "offline" in 2mins as there is no
> requirement
> >to synthesise in real time. The same thing applies for subgraphs and
> >exchanges should be allowed access to enough information to cache output
> >from subgraphs. Again, many exchanges wouldn't wish to support such
> >facilities - but there ought to be a way to make these things options.
>
> These are all excellent questions. My response to most of them is that
> 90% or more of the clients involved should not and would not care
> about the answers to any of the questions. They are questions for what

Yep, precisely. But the exchange needs some information about what the data
is. As long as you're dealing with PCM audio this isn't too difficult. And
even then, I'm not sure how you'd address all the points above - how would a
clever driver handle them?

> I have termed "drivers" in JACK (which is generally a client, but it
> plays a special role in that it provides access to a genuine h/w
> resource, such as an audio interface or network interface or video
> interface or whatever). The role of a conventional client frees it
> from any concern with almost all of the above questions. Its only role
> is to process/provide data when told/asked for it. I will return to a
> couple that are still an issue for me.
[...]
> The one remaining question that bothers me concerns data types that
> are not audio, especially where the timing requirements differ. We've

Yep, this is why LADMEA is more complex. The "M" is for multimedia. We need
to be able to stream other data types. MP3, MIDI, Csound instrument calls,
video, DFTs, assorted techniques I'm working on (don't ask).

> already seen great difficulty in programs that attempt to sync MIDI
> and audio i/o, because the audio i/o is block-structured and the MIDI
> data is typically event-based (i.e. non-streaming). Unfortunately, as

If streaming means "one second has a fixed data size" then we're in trouble.

> I've commented before, there is no easy way around this. Combining two
> data streams, one that streams and the other that is event based, and
> running them from a single clock, or even in the audio/video case, two
> streaming data sets with different clock rates, will inevitably lead
> to problems. We can try to minimize them as much as possible, but they

Or code an API where this is made explicit so the exchange can handle the
information as its capabilities allow. Yep, there are problems, but mostly
they can be solved. But only if the exchange can be made AWARE of the
problem!

> will never vanish. As for the data types themselves, I believe that
> although the code is not fully implemented for it, JACK includes a
> design to allow the handling of *any* data type by allowing a client
> to provide handlers for it.

Yep, with extensions. But I suspect that once the extensions are there,
you'll end up with something remarkably analogous to LADMEA. But longer. And
more complex ;-)

> >What worries me about LAAGA/JACK is that a very specific approach is
> >taken to exchange implementation that isn't right for some tasks (e.g.
> >bandwidth management, networking, handling of varied data types)
> >although it is very good at some other things (e.g. low latency PCM
> >audio).
>
> I partly see your point here, and I partly disagree. Remember that
> JACK has *absolutely* nothing to do with the scheduling and delivery
> of data types, with the one exception being that it builds in an audio
> type (which could be avoided, in fact, but at a performance cost in
> the by-far-most-common case). Its only role, to use a LADMEA term, is
> as an exchange: a "driver" tells the JACK server "it's time to get your
> clients to handle the data corresponding to N frames of audio", and
> the server does that. The driver may have lied. It may be buffering
> data up the wazoo. It may be routing data over a network that has no
> "PCM clock" like an audio interface. JACK doesn't care: its only job
> is to interconnect and schedule things that need to be scheduled, with
> the "when" and "how much" of scheduling being left to a "driver".

Yep, but if the client provides no information to JACK about the data type,
then the driver has no way to determine what to do with it.

> >What I'd like us to investigate is a way forward in which clients can
> >still make use of LAAGA/JACK when these features are the user's
> >priority, but don't restrict the exchange to this form.
>
> This gets to the heart of my feelings about this. What LADMEA is
> trying to do is to define something at a level "above" that of
> JACK. The functionality we've talked about in JACK to date is not the
> province of LADMEA as I understand it. LADMEA operates on/with
> exchanges (JACK being an example of one such) but doesn't define how
> they work internally as long as they meet the requirements (API) of
> being an exchange. Ergo, the design goals of JACK are in large part
> orthogonal to those of LADMEA. The only areas where they touch are
> LADMEA's attempt to define characteristics of an exchange which from
> JACK's point of view are irrelevant. So fine - JACK can just make
> something up (as long as it's true). The more I think about it, LADMEA
> doesn't really have much to do with JACK at all.

Yep, I agree. Response to follow in another email on this thread.

> >In particular, IMHO remote inter-application communication for audio
> >software is HUGELY important for the future and I think the Linux audio
> >community really needs to address it sooner rather than later. This API
> >should be perfectly adequate for a "rewire"-style exchange. It also
> >should be fine for a world in which a studio is made up of a network of
> >rackmount Linux boxes, MIDI controllers, S/PDIF connections and ADSL
> >connections to other studios.
>
> Can you show me one area where JACK makes this difficult to do? I
> don't see any.

Take your pick: MP3, MIDI, random control data, encodings of spatial
movement, windowed DFT, compressed video, event data.

I don't mean to be arsey (although I probably have been :-/). I realise
you're speaking primarily of PCM audio, and as said before, PCM audio isn't
too bad: its bandwidth requirements aren't hard to work out and it's a
useful common tongue for most audio processing.

And if you're happy to think of LADMEA and JACK as "orthogonal" (my
inclination) then we've not a lot to disagree about...

--Richard



This archive was generated by hypermail 2b28 : Tue Oct 02 2001 - 22:20:40 EEST