Re: [linux-audio-dev] LADMEA revisited (was: LAAGA and supporting programs)


Subject: Re: [linux-audio-dev] LADMEA revisited (was: LAAGA and supporting programs)
From: Paul Davis (pbd_AT_Op.Net)
Date: Sun Sep 30 2001 - 21:01:35 EEST


First of all, I'd like to thank Richard for his insightful and very
useful email. The LADMEA header file didn't make clear to me what his
concerns were; this email has.

>The essential idea is to extend from approaches like LAAGA/JACK, GStreamer
>or aRts where there is an all-encompassing audio infrastructure (an
>"exchange") into which various client applications are slotted. In the
>LADMEA world, "clients" and "exchanges" communicate through the lightweight
>LADMEA interface. Clients can load exchanges, exchanges can load clients,
>some other program can load both (when in library form).

This seems like a noble goal. It suffers, IMHO, from one major
flaw. There is a level of "abstraction" and "API independence" that it
is not worth moving beyond.

Let's briefly review the goals here: we want programs handling streaming
media (audio, MIDI, video, whatever) to be able to talk to each
other. We don't care about the internal processing networks that a
program uses; we care only that program A can route data to program
B. Unix pipes, the canonical method of doing this, suffer because they
don't have enough bandwidth and involve needless data copying (with
resulting cache pollution); shared memory, which solves these
problems, cannot be used without another mechanism to serve as a
thread wakeup signal. Ergo, we need a new mechanism.
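To make the shared-memory-plus-wakeup point concrete, here is a minimal
sketch (my own, not JACK's actual implementation; all names are
hypothetical): sample data lives in a shared buffer so no copying of the
payload is needed, and a pipe carries only a one-byte token whose sole
job is to wake the consumer.

```c
/* Sketch only: in a real system shared_buf would be a shm segment
 * mapped by two processes; here it just stands in for one. */
#include <string.h>
#include <unistd.h>

static float shared_buf[256];   /* stands in for a shared memory segment */

/* producer: fill the shared buffer, then signal readiness via the pipe */
static int produce(int wake_fd, const float *data, int n)
{
    char token = 1;
    memcpy(shared_buf, data, (size_t)n * sizeof(float));
    /* the pipe carries one byte, not the audio data itself */
    return write(wake_fd, &token, 1) == 1 ? 0 : -1;
}

/* consumer: block until the producer signals, then read the buffer directly */
static int consume(int wait_fd, float *out, int n)
{
    char token;
    if (read(wait_fd, &token, 1) != 1)  /* sleeps until a token arrives */
        return -1;
    memcpy(out, shared_buf, (size_t)n * sizeof(float));
    return 0;
}
```

The point of the sketch is the division of labor: bulk data never
crosses the pipe, which exists only as the thread wakeup signal that
bare shared memory lacks.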

Designing a system into which JACK, GStreamer and aRts can all fit
seems great, but I immediately find myself asking "is it worth it?"
For example: GStreamer is self-avowedly an internal application design
structure, not an inter-application design structure, and so it really
doesn't apply to the kinds of problems that are being addressed
here. aRts and JACK ostensibly exist to solve mostly the same problem
set. The claim is that aRts does not solve it and cannot solve it
without a substantial rewrite. Although there is a history of APIs
written to cover a variety of similar-goaled but differently
implemented APIs, it's not something that has tended to interest me
very much. It's like the situation with pthreads and the wrappers for
it: why bother? pthreads is a well designed, typically well
implemented API, and there isn't much reason to use a wrapper for it
that I can see. But that's just me :)

>This would mean that a newly written audio synth written using LADMEA could
>immediately be slotted into LAAGA, GStreamer or aRts (assuming LADMEA
>support) or use a LADMEA interface to ALSA. Correspondingly, if someone
>writes a new framework for audio streaming over ATM (perhaps using
>compression on some channels as bandwidth requires) then this can
>immediately be used with the client applications such as recorders, players,
>multitracks etc.

The problem with these claims is that they are equally true of JACK
all by itself. These goals/abilities don't differentiate LADMEA from
JACK in any way, and the same holds for the remaining claims in the
paragraph.

>1. How can a client transmit a data type to another (potentially remote)
>client across an exchange that has never heard of the data type?
>2. How can a client know what existing channels it can use?
>3. If clients offer variants of data types (e.g. 16bit unsigned
>little-endian PCM, float l-e PCM, double l-e PCM, MP3), how can the exchange
>persuade the clients to agree? If they cannot, how can the exchange go about
>inserting a codec? Note again that the exchange may never have heard of the
>data types involved.
>4. How does an exchange know how much bandwidth it is likely to need to
>communicate data (e.g. over a network)? How latent can data be? How much
>jitter may be tolerated on a network? Note again...
>5. How can an exchange know when a client has failed?
>6. If sources go out of sync (e.g. audio streaming over an ISDN link across
>the Atlantic drifts in sync relative to local overdubs), then how does the
>exchange know this is happening and deal with it? (Ok, most exchanges won't,
>but they could if they wanted.)
>7. Consider a case where none of the clients in a graph require live
>operation, e.g. a MIDI player is playing a 10min piece where the MIDI is
>sent to a soft synth which generates PCM audio that is then passed to a
>sound file writer. Say this graph can be run live and uses 20% CPU time on a
>single CPU system. An intelligent exchange should be able to work out that
>the graph instead can be run "offline" in 2mins as there is no requirement
>to synthesise in real time. The same thing applies for subgraphs and
>exchanges should be allowed access to enough information to cache output
>from subgraphs. Again, many exchanges wouldn't wish to support such
>facilities - but there ought to be a way to make these things options.

These are all excellent questions. My response to most of them is that
90% or more of the clients involved should not and would not care
about the answers to any of them. They are questions for what I have
termed "drivers" in JACK (a driver is generally a client, but it plays
a special role in that it provides access to a genuine h/w resource,
such as an audio interface, network interface, video interface or
whatever). The role of a conventional client frees it from any concern
with almost all of the above questions. Its only role is to
process/provide data when told/asked for it. I will return to a couple
that are still an issue for me.

Remember, when I started trying to implement JACK, my other goal was a
toolkit that would abstract away vast amounts of the task of dealing
with audio. This means that questions such as "how do I handle 16 bit
little endian data when someone sends it to me?" vanish: there is a
single format for all audio. This is the approach adopted in every
other callback-based system that we have seen. It also removes issues
about "which channels can I write to?", because the answer is "any
channel that can be written to". It means that questions about codecs
are irrelevant, because in JACK-space (though not necessarily on the
outer "edges" of drivers), all audio is non-compressed, normalized
floats.
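As an illustration of the "single format" point, here is a sketch of
the kind of conversion that would live only at a driver's outer edge,
turning 16 bit little-endian PCM into normalized floats. The function
name is mine, not part of any JACK API.

```c
#include <stdint.h>

/* Convert 16-bit signed little-endian PCM bytes to floats in [-1.0, 1.0).
 * The bytes are assembled explicitly so the result is correct regardless
 * of the host's byte order. */
static void s16le_to_float(const uint8_t *in, float *out, int nframes)
{
    for (int i = 0; i < nframes; i++) {
        int16_t s = (int16_t)(in[2 * i] | (in[2 * i + 1] << 8));
        out[i] = (float)s / 32768.0f;   /* full-scale negative maps to -1.0 */
    }
}
```

Once a driver has done this at the edge, every client in the graph sees
the same float format and the "which variant of PCM?" negotiation never
arises.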

The one remaining question that bothers me concerns data types that
are not audio, especially where the timing requirements differ. We've
already seen great difficulty in programs that attempt to sync MIDI
and audio i/o, because the audio i/o is block-structured and the MIDI
data is typically event-based (i.e. non-streaming). Unfortunately, as
I've commented before, there is no easy way around this. Combining two
data streams, one that streams and the other that is event based, and
running them from a single clock, or even in the audio/video case, two
streaming data sets with different clock rates, will inevitably lead
to problems. We can try to minimize them as much as possible, but they
will never vanish. As for the data types themselves, I believe that
although the code is not fully implemented for it, JACK includes a
design to allow the handling of *any* data type by allowing a client
to provide handlers for it.
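One common way to reconcile event-based MIDI with block-structured
audio - sketched here with invented names, not any actual JACK code -
is to timestamp events in audio frames and hand each client only the
events that fall within the current block, with timestamps rewritten
as offsets from the block start.

```c
#include <stdint.h>

typedef struct {
    uint64_t frame;    /* absolute timestamp in audio frames */
    uint8_t  status;   /* MIDI status byte */
} midi_event;

/* Copy events landing in [block_start, block_start + nframes) into out[],
 * rewriting each timestamp as an offset within the block. Returns the
 * number of events copied. Assumes ev[] is small; a real system would
 * use a sorted queue rather than a linear scan. */
static int events_for_block(const midi_event *ev, int nev,
                            uint64_t block_start, uint32_t nframes,
                            midi_event *out)
{
    int count = 0;
    for (int i = 0; i < nev; i++) {
        if (ev[i].frame >= block_start &&
            ev[i].frame < block_start + nframes) {
            out[count].frame  = ev[i].frame - block_start;
            out[count].status = ev[i].status;
            count++;
        }
    }
    return count;
}
```

This doesn't make the underlying clock problem vanish - an event can
still only be honored to the granularity the client applies within the
block - but it does let both stream types run from the single audio
clock.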

>What worries me about LAAGA/JACK is that a very specific approach is taken
>to exchange implementation that isn't right for some tasks (e.g. bandwidth
>management, networking, handling of varied data types) although it is very
>good at some other things (e.g. low latency PCM audio).

I partly see your point here, and I partly disagree. Remember that
JACK has *absolutely* nothing to do with the scheduling and delivery
of data types, with the one exception being that it builds in an audio
type (which could be avoided, in fact, but at a performance cost in
the by-far-most-common case). Its only role, to use a LADMEA term, is
as an exchange: a "driver" tells the JACK server "it's time to get your
clients to handle the data corresponding to N frames of audio", and
the server does that. The driver may have lied. It may be buffering
data up the wazoo. It may be routing data over a network that has no
"PCM clock" like an audio interface. JACK doesn't care: its only job
is to interconnect and schedule things that need to be scheduled, with
the "when" and "how much" of scheduling being left to a "driver".
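Boiled down, the pull model described above might look like this - a
sketch with invented names, not JACK's real API. The driver decides
when a cycle happens and how many frames it covers; the exchange's
entire scheduling job is to walk the client list and invoke each
process callback.

```c
#include <stddef.h>

/* a client's process callback: handle nframes of data, 0 on success */
typedef int (*process_cb)(unsigned int nframes, void *arg);

typedef struct client {
    process_cb     process;
    void          *arg;
    struct client *next;
} client;

/* The exchange's whole job for one cycle: the driver has announced
 * nframes; run every client's callback in order. The exchange neither
 * knows nor cares whether the driver's clock is an audio interface,
 * a network, or a lie. */
static int exchange_run_cycle(client *clients, unsigned int nframes)
{
    for (client *c = clients; c != NULL; c = c->next)
        if (c->process(nframes, c->arg) != 0)
            return -1;      /* a failing client aborts the cycle */
    return 0;
}
```

Everything else - buffering, network transport, sample clocks - lives
behind the driver that calls `exchange_run_cycle`.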

> What I'd like us to
>investigate is a way forward in which clients can still make use of
>LAAGA/JACK when these features are the user's priority, but don't restrict
>the exchange to this form.

This gets to the heart of my feelings about this. What LADMEA is
trying to do is to define something at a level "above" that of
JACK. The functionality we've talked about in JACK to date is not the
province of LADMEA as I understand it. LADMEA operates on/with
exchanges (JACK being an example of one) but doesn't define how they
work internally as long as they meet the requirements (API) of being
an exchange. Ergo, the design goals of JACK are in large part
orthogonal to those of LADMEA. The only area where they touch is
LADMEA's attempt to define characteristics of an exchange which, from
JACK's point of view, are irrelevant. So fine - JACK can just make
something up (as long as it's true). The more I think about it, the
less LADMEA really has to do with JACK at all.

>In particular, IMHO remote inter-application
>communication for audio software is HUGELY important for the future and I
>think the Linux audio community really needs to address it sooner rather
>than later. This API should be perfectly adequate for a "rewire"-style
>exchange. It also should be fine for a world in which a studio is made up of
>a network of rackmount Linux boxes, MIDI controllers, SP/DIF connections and
>ADSL connections to other studios.

Can you show me one area where JACK makes this difficult to do? I
don't see any.

--p



This archive was generated by hypermail 2b28 : Sun Sep 30 2001 - 20:58:22 EEST