Subject: Re: [linux-audio-dev] LADMEA revisited (was: LAAGA and supporting programs)
From: Paul Davis (pbd_AT_Op.Net)
Date: Wed Oct 03 2001 - 01:12:23 EEST
>Yep, but SHM isn't the only answer. Sockets provide a particularly useful
>implementation.
Sockets still force a data copy and they are bandwidth limited as well
AFAIK. They also don't offer RT characteristics. Try to write a lot
of data to a socket, and the kernel will do a bunch of
non-deterministic memory allocation to accommodate you.
>I'm not sure I agree here - I think the GStreamer team has a better grasp of
>the real issues here than most. And by the way, why can't Ardour be turned
>into a GStreamer plugin? The GStreamer team would probably appreciated the
>changes required to make that happen.
I've talked with both the GStreamer people and also with someone
involved in the prehistory of GStreamer. GStreamer doesn't have the
same design goals as JACK. Its design goals were even more different
when it started (it was mostly a way to buffer jitter in a media stream).
The challenges of making it into an RT-friendly, audio h/w friendly
framework that worked across application boundaries amount, in my
mind, to a complete reimplementation (and perhaps more).
>> aRts and JACK ostensibly exist to solve mostly the same problem
>> set. The claim is that aRts does not solve it and cannot solve it
>> without a substantial rewrite. [ ... ]
>
>I don't follow this. Whose claim is this? I'm not sure what the pthread
>analogue is here.
That's the claim that has emerged in discussions here on LAD. The point
about aRts is that KDE has already adopted it with the claim that it
can be used as a "sound server" to do the kinds of things we are
talking about. Its author certainly intended it to do that, and it
does. It just fails fairly badly in the multichannel, low latency
case, and for reasons that are not superficial and simple to fix.
The pthreads analogy: people say "why not use a thread API that isn't
pthreads-specific, so you can use it on Win32, MacOS, BeOS ..." My
answer is that from my perspective, P.1003 *IS* the multiplatform,
portable thread API, and I choose to use it rather than another
wrapper (such as g_thread). The fact that certain OSes have implemented
it incorrectly, or not at all, isn't of much concern to me when it's
such a good, well defined, well documented, and above all well
designed API.
Likewise, I'm not interested in any proposals for an API that "wraps"
JACK, aRts, GStreamer, GLAME, etc. etc. If there is an API out there
that does the right thing, is well designed, well implemented, well
documented, then I feel one should use it in preference to a wrapper.
I'm not implying that LADMEA is "just a wrapper". Just that if that's
the motivation for it, I'm not buying. I think, however, I understand
other motivations, so ...
>> These are all excellent questions. My response to most of them is that
>> 90% or more of the clients involved should not and would not care
>> about the answers to any of the questions. They are questions for what
>
>Yep, precisely. But the exchange needs some information about what the data
>is. As long as you're dealing with PCM audio this isn't too difficult. And
>even then, I'm not sure how you'd address all the points above. How would a
>clever driver handle the points above?
Let me go through them one by one:
>>1. How can a client transmit a data type to another
>>(potentially remote) client across an exchange that has never heard
>>of the data type?
the exchange doesn't have to know about the data type. only the
clients do. one or both clients provide the "extension" (to use your
term) to handle the data. the code is executed when the clients
exchange data.
>> >2. How can a client know what existing channels it can use?
jack_get_port_list (...)
>> 3. If clients offer variants of data types (e.g. 16bit unsigned
>>little-endian PCM, float l-e PCM, double l-e PCM, MP3), how can the
>>exchange persuade the clients to agree? If they cannot, how can the
>>exchange go about inserting a codec? Note again that the exchange
>>may never have heard of the data types involved.
there is no negotiation. a client advertises its ports with fixed
types. if some other client can't handle it, they don't connect.
in your terms, the "exchange" is defined by the clients that use the
data, with the exception of PCM audio which is optimized for the
common case.
>> >4. How does a exchange know how much bandwidth it is likely to need to
>> >communicate data (e.g. over a network)? How latent can data be? How much
>> >jitter may be tolerated on a network? Note again...
None of these questions are meaningful for a client. It is told only
to produce/consume data corresponding to some interval. It has no
idea where its data goes, unless it's a driver, in which case it's
connected directly to the h/w resource, and can answer these questions
for itself.
>> >5. How can an exchange know when a client has failed?
if its process() callback takes too long, the client is removed from
the graph. the graph will fail to meet its deadline on that
particular execution cycle, but then it returns to normal.
>> 6. If sources go out of sync (e.g. audio streaming over an ISDN
>>link across the Atlantic drifts in sync relative to local overdubs)
>>then how does the exchange know this is happening and deal with
>>it. (Ok, most exchanges won't, but they could if they wanted.)
you can't handle jitter without buffering. if there is a client
suffering from significant jitter, then the JACK server needs to be
clocked by a driver that will allow the clients sufficient buffering
to hide the jitter. in general, clients cannot go out of sync in the
way you partly imply: they always produce and/or consume a fixed
amount of data per process() callback invocation.
>> 7. Consider a case where none of the clients in a graph require
>>live operation, e.g. a MIDI player is playing a 10min piece where
>>the MIDI is sent to a soft synth which generates PCM audio that is
>>then passed to a sound file writer. Say this graph can be run live
>>and uses 20% CPU time on a single CPU system. An intelligent
>>exchange should be able to work out that the graph instead can be
>>run "offline" in 2mins as there is no requirement to synthesise in
>>real time. The same thing applies for subgraphs and exchanges should
>>be allowed access to enough information to cache output from
>>subgraphs. Again, many exchanges wouldn't wish to support such
>>facilities - but there ought to be a way to make these things
>>options.
just clock the server from a driver that isn't wired to an audio
interface, but instead does file i/o. it can call process() as rapidly
as it can pass off the data to some file i/o method (either write(2) or
a buffer with a thread on the other side of it).
----------
more clear?
>Yep, this is why LADMEA is more complex. The "M" is for multimedia. We need
>to be able to stream other data types. MP3, MIDI, Csound instrument calls,
>video, DFTs, assorted techniques I'm working on (don't ask).
if client A wants to receive data type D from client B, then client A
and B must necessarily agree on what type D implies. for this reason
alone, i believe that data handling needs to be provided by clients,
not the server, except for types that are judged to be sufficiently
common case that building them into the "exchange" is a sensible
decision.
>> already seen great difficulty in programs that attempt to sync MIDI
>> and audio i/o, because the audio i/o is block-structured and the MIDI
>> data is typically event-based (i.e. non-streaming). Unfortunately, as
>
>If streaming means "one second has a fixed data size" then we're in trouble.
No, it means that if I call:
your_client->process (some_interval)
then your_client will produce/consume the data corresponding to the
interval. that could be more or less data than a PCM audio stream
would be, but it *has* to be the data that corresponds to the interval.
>Yep, but if the client provides no information to JACK about the data type,
>then the driver has no clue to determine what to do with it.
you must be misunderstanding something.
drivers are clients too. they possess ports, input and/or output, with
specific types. the only ports that can be connected to their ports
are ones whose type matches. an alsa i/o driver can only receive PCM
audio data, for example. it does not have any ports that receive, say,
video data.
remember, drivers typically have two roles:
* one of them acts as the "tick" for the server
* they all have the option of being clients and thus
producing/consuming data via ports.
>> Can you show me one area where JACK makes this difficult to do? I
>> don't see any.
>
>Take your pick: MP3, MIDI, random control data, encodings of spatial
>movement, windowed DFT, compressed video, event data.
Other than issues with shm allocated by the server and the client
having to live at the same address in both address spaces, I don't see that
JACK has any problems handling any of these, other than the ones that
arise from the interval size potentially being too large for some of
the data types to be delivered "on time" (e.g. MIDI driven by audio).
--p
This archive was generated by hypermail 2b28 : Wed Oct 03 2001 - 01:08:56 EEST