[linux-audio-dev] high level laaga


Subject: [linux-audio-dev] high level laaga
From: Paul Davis (pbd_AT_Op.Net)
Date: Fri May 11 2001 - 15:34:39 EEST


Having read Kai's excellent summary/overview of the problem space, I
have some high level observations.

-- Problem space -------------------------------------------

We can divide what LAAGA seeks to solve into several distinct
problems:

0) Establishing common formats

  Things get much more tricky, and much more wasteful, if the
  components in a LAAGA setup do not use a single format to represent
  audio data.

1) Abstracting complexity

  Even with alsa-lib to manage things for us, interacting with
  audio hardware is a non-trivial task for all but the simplest
  applications. It is not sensible for every audio application author
  to have to deal with this over and over again. So LAAGA also seeks
  to present a vastly simplified and abstracted interface to
  audio hardware. One possible model is a series of
  mono IEEE 32-bit floating-point data streams running at a fixed
  sample rate (sketched in code after this list). This kind of model
  is easy to support in an environment where all access to the audio
  interface occurs synchronously with the interface's "interrupt"
  state.

2) Establishing connections

  How do we get one application to be *able* to talk to another? In
  the current state of affairs, we have a bunch of applications using
  a couple of APIs that do not include the idea of sharing data in a
  particularly carefully thought-out way (or at all, in the case of
  OSS). They can talk to the hardware with great efficiency, but they
  can't talk to each other.

  Kai's site describes the problems with using IPC for the
  interconnection. So we need something else.

3) Transferring data
 
  Once we are able to provide a method by which applications are
  set up to talk to each other, we need to provide a method by which
  they can actually transfer data back and forth.

  We also need a mechanism, preferably the same one, by which they
  can transfer data to the audio interface managed by the LAAGA server.

4) Temporal synchronization

  All the components in a LAAGA setup should share the same notion
  of the current audio time (see the sketch below).
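
A minimal sketch of that model in C. Everything here is hypothetical
(the laaga_* names are invented for illustration, not an existing
API): a client supplies one callback which the server runs once per
hardware interrupt, and audio time is nothing more than a frame
counter advanced by the server each cycle.

    /* Mono, IEEE 32-bit floating point: the single common format. */
    typedef float sample_t;

    /* Executed by the server once per interrupt ("period"). in/out
       each point to nframes mono samples at the engine's fixed
       sample rate. Must return promptly; it may never block. */
    typedef int (*laaga_process_cb)(const sample_t *in, sample_t *out,
                                    unsigned long nframes, void *arg);

    /* Audio time: frames elapsed since the engine started. Because
       every client runs synchronously with the interface, they all
       observe the same value within a single cycle. */
    typedef unsigned long long laaga_frame_t;
    extern laaga_frame_t laaga_frame_time(void);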

--- the role of ALSA ---------------------------------------

Abramo has been vocal in his support for using ALSA's user-space
alsa-lib API to provide the functionality. His appeals to the notion
of using a single API do have some real persuasiveness (all
appearances on my part to the contrary, it's true).

But let's take a look at the problem space described above.

Audio formats: alsa-lib deliberately and appropriately allows
               its applications to use a vast multiplicity
               of audio formats.

Abstraction: alsa-lib already provides a significant level
               of abstraction over real h/w issues, yet
               the typical real-time audio application
               written using alsa-lib is a fairly complex
               affair, with more than a dozen potential
               parameters to be set before starting
               the audio interface (see the sketch after
               this list), and a multiplicity of
               ways to transfer data to and from it.

Establishing connections: alsa-lib offers one model for this,
               based on the shared memory IPC mechanisms.
               Such a solution cannot scale to setups with
               lots of clients running at low latencies because
               of the overhead of context switching and the
               associated memory/cache performance loss.

Transferring data: alsa-lib offers a variety of functions to
               effect data transfer.

Temporal sync: alsa-lib offers nothing in this area. ALSA has
               focused all its sync code in the sequencer, which
               is not an audio API (though it would not be impossible
               to modify that). Even there, however, the notion
               of time is fairly low resolution, definitely
               not enough to support sample-accurate events.
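
To make the abstraction point concrete, here (sketched against the
current alsa-lib API, whose exact signatures have shifted between
versions) is roughly what a minimal playback setup looks like; error
handling is omitted, the device name and values are arbitrary, and
software parameters, xrun policy and the choice of transfer method
all still remain afterwards:

    #include <alsa/asoundlib.h>

    int open_playback(snd_pcm_t **pcm)
    {
            snd_pcm_hw_params_t *hw;
            unsigned int rate = 44100;
            snd_pcm_uframes_t period = 64;

            snd_pcm_open(pcm, "default", SND_PCM_STREAM_PLAYBACK, 0);

            /* hardware parameters: access type, sample format,
               channel count, rate, period size ... */
            snd_pcm_hw_params_alloca(&hw);
            snd_pcm_hw_params_any(*pcm, hw);
            snd_pcm_hw_params_set_access(*pcm, hw,
                                         SND_PCM_ACCESS_RW_INTERLEAVED);
            snd_pcm_hw_params_set_format(*pcm, hw, SND_PCM_FORMAT_FLOAT_LE);
            snd_pcm_hw_params_set_channels(*pcm, hw, 2);
            snd_pcm_hw_params_set_rate_near(*pcm, hw, &rate, 0);
            snd_pcm_hw_params_set_period_size_near(*pcm, hw, &period, 0);
            return snd_pcm_hw_params(*pcm, hw);
    }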

Before moving on, I want to stress that I think that alsa-lib is a
fantastic piece of work. Truly fantastic.

However, its goals have clearly (and appropriately) been flexibility
and power. In the LAAGA problem space, we have a focus on simplicity
and performance instead. In reviewing what alsa-lib can provide, I
feel confident that the only parts of the current API that might be a
part of a LAAGA-like system are the data transfer calls and perhaps
the trivial model of snd_pcm_open() (whose functionality would be
extremely different in a LAAGA setup).

In particular, alsa-lib does not contain any callback-based model, and
in the majority of discussions about LAAGA, the notion of the server
executing a callback for each plugin has been fairly central. This is
in turn a reflection of the notion that a LAAGA setup is driven
exclusively by the interrupts from the audio interface: other
constraints may exist for certain plugins, but a "legal" LAAGA
plugin may not let them play any role in the timing of the
server's execution.
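
A minimal sketch of that execution model, with invented names: the
server wakes on the interface interrupt and runs each plugin's
callback exactly once per period.

    /* Hypothetical server cycle. The audio interrupt is the only
       clock; a "legal" plugin returns promptly and never blocks. */
    struct plugin {
            int (*process)(float *in, float *out,
                           unsigned long nframes, void *arg);
            void *arg;
            struct plugin *next;
    };

    static void run_cycle(struct plugin *chain, float *in, float *out,
                          unsigned long nframes)
    {
            struct plugin *p;
            for (p = chain; p != 0; p = p->next)
                    p->process(in, out, nframes, p->arg);
    }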

---- so? ---------------------------------------------------

Now, this creates an uncomfortable issue. I think that alsa-lib
is a superb piece of work, but a week or so after Kai brought up the
LAAGA idea (though he credits me with a post that started it all :), I
find myself reflecting on whether alsa-lib is the API that 99% of
audio applications should be using at all. The LAAGA model as mapped
out here and on Kai's site is a model of simplicity, and one that has
worked extremely well in analogous forms for many other
systems. Notable examples include xmms and its plugins, VST (FX
plugins, instruments) and, these days, DirectX for Windows.

I don't know this for sure, but I would guess that a very large
majority of new code written for Windows and MacOS audio is written
around an interface based on synchronous callback execution by a
plugin host. The problem with this on those platforms is mostly that
the actual OS and API implementations there leave much to be desired
in the way of support for low latency and/or reliable operation.

Obviously, this is not true of all audio applications. As far as I
know, most trackers are themselves "plugin" hosts and tend to use
their platform's audio device API directly; the same is true of
most soundfile editors and the like. But code aimed at processing
or generating audio in real time seems to be dominated by plugins, not
applications.

As I have explained above, the current design of alsa-lib doesn't
contain the elements I think are important for a plugin-oriented
system. That doesn't mean it can't be changed to include them, but
the addition would be a part of alsa-lib that, for
processing/generating code, makes the rest of alsa-lib completely
irrelevant. Such code would never use any of the existing API except
possibly for some of the transfer functions, and probably not even
them, since there is a distinct bias in plugin systems towards
allowing direct memory access to input and output buffers (I'm not
talking about mmap, just the LADSPA audio port idea).
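
For reference, the LADSPA audio port idea looks like this from the
host's side: the host hands the plugin raw pointers into its own
buffers, and run() reads and writes them in place. A minimal sketch
(error handling omitted; the port indices are assumptions, since a
real host reads them from the descriptor's port metadata):

    #include <dlfcn.h>
    #include <ladspa.h>

    void run_once(const char *libpath, LADSPA_Data *in, LADSPA_Data *out,
                  unsigned long nframes, unsigned long rate)
    {
            void *lib = dlopen(libpath, RTLD_NOW);
            LADSPA_Descriptor_Function df =
                    (LADSPA_Descriptor_Function) dlsym(lib, "ladspa_descriptor");
            const LADSPA_Descriptor *d = df(0);  /* first plugin in library */
            LADSPA_Handle h = d->instantiate(d, rate);

            d->connect_port(h, 0, in);   /* assume port 0 = audio input  */
            d->connect_port(h, 1, out);  /* assume port 1 = audio output */
            if (d->activate)
                    d->activate(h);

            d->run(h, nframes);          /* works directly on the buffers */

            if (d->deactivate)
                    d->deactivate(h);
            d->cleanup(h);
            dlclose(lib);
    }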

--- the future? ---------------------------------------------------------

So, I see a future with two distinct aspects to it. One contains a set
of applications that continue to expect to control their own audio
interface. These would include "servers" like a LAAGA, LADSPA or VST
host, and applications that operate in a non-synchronous way (almost
all soundfile editors are good examples), do not have particular
real-time and/or low latency requirements, and/or contain legacy code.

The other consists of plugins much like the ever-expanding set of VST
and DirectX plugins, none of which are concerned with the details of
audio formats, hardware or software parameters; they just get loaded
into a server of some kind, process some amount of float-based audio
on a regular basis, and that's about it.

I'm not sure that this division is a good thing, and that's why
Abramo's appeal to a single API has some force for me. But I'm also at
a loss to see how to come up with a single API that can satisfy all
the goals and be seductive enough to convince all/most Linux audio
developers to use it.

--p

