Subject: Re: [linux-audio-dev] PTAF link and comments
From: Laurent de Soras [Ohm Force] (laurent@ohmforce.com)
Date: Thu Feb 06 2003 - 13:58:20 EET
Late reply: I was quite busy, and writing in English takes me
time. I have commented on quotes from various people here. However,
I haven't read the recent posts yet; I'm a bit overwhelmed by
the reply avalanche. :)
> If you have only one plugin per binary, it doesn't matter much, but
> single file plugin packs are really rather common and useful. It
> allows plugins to share code efficiently, and thus reduces cache
> thrashing. It also allows plugins to share "live" data in various
> ways.
Yes, the factory feature was also designed with this in mind.
> I'm questioning whether
> having a simpler query based system may be easier. I don't like the idea
> that you have to instantiate to query, though.
Many plug-in standards require instantiation before fetching
properties. PTAF must be able to make good wrappers for existing
plug-ins; the industry requires it.
> Yeah, I've thought about that too. Rather hairy stuff. The only easy
> way to avoid it is to strictly define what the host should tell the
> plugin before asking anything, and then just leave it at that.
We can also add a watchdog somewhere to limit the recursion.
> [...the "sequencer" thread...]
As suggested, we can call it the "host" or "main" thread. In my
opinion, generic hosts have several "virtual" threads:
- Audio: handles audio processing and related sequencing
- Main/Background: handles generic operations, state transitions,
as well as background disk streaming, data preparation for the audio
thread, etc.
- GUI thread (at least on Win and Mac OSs): can theoretically
be merged with the main thread, but it's not recommended (Mac
programmers have the bad habit of polling the mouse within a
blocking loop while the user holds a click).
> You generally can't have multiple
> toolkits running in the same process, even if you try running them in
> different threads.
Ahhh I feared that.
> Well, you might still be able to comment on an idea: Splitting up the
> GUI in two parts; one that runs in the same process as the DSP
> plugin, and one that runs in another process, possibly on another
> machine.
Yeah, I had this in mind when I was talking about launching the
GUI from the plug-in.
> You would need a feature that allows control outputs to be marked as
> "active" or "passive". This allows hosts to control how things are
> handled.
> If I see a knob that is turning via some old automation, I
> should be able to grab it in the UI and take over. It is not clear to me,
> however, whether the host should remember that and leave control with the
> highest prio controller, or just make it temporary. This needs more
> thought. It may not be the domain of the API at all, but the host.
But the token system is just about the same concept, isn't it?
For a given parameter, there is at most one active client at a
time; the others are passive and receive only change notifications.
Arbitration is done by the host, which is allowed to take a token
away from one client and give it to another client requesting it.
For example, it makes sense to give the automation system lower
priority than the user acting via the GUI, so the latter can
steal its token just by requesting it. When the user releases
the knob, the GUI gives the token back to the host, making it
available for any use (the host can implicitly hand it on to the
automation system).
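A minimal sketch of the token scheme described above, to make the
arbitration rule concrete. The class name, the method names and the
numeric priorities are assumptions made for illustration; none of
them come from the PTAF draft.

```python
class TokenArbiter:
    """Host-side arbiter: at most one active client per parameter.

    Hypothetical helper, not part of any published API.
    """

    def __init__(self):
        self._owner = {}     # parameter id -> client currently holding the token
        self._priority = {}  # client -> priority (higher value wins)

    def register(self, client, priority):
        self._priority[client] = priority

    def request(self, param, client):
        """Grant the token if it is free or held by a lower-priority client."""
        holder = self._owner.get(param)
        if holder is None or self._priority[client] > self._priority[holder]:
            self._owner[param] = client
            return True
        return holder == client

    def release(self, param, client, fallback=None):
        """Give the token back to the host, which may hand it on implicitly."""
        if self._owner.get(param) == client:
            if fallback is None:
                del self._owner[param]
            else:
                self._owner[param] = fallback

    def owner(self, param):
        return self._owner.get(param)


arbiter = TokenArbiter()
arbiter.register("automation", priority=1)
arbiter.register("gui", priority=2)

arbiter.request("cutoff", "automation")  # automation drives the knob
arbiter.request("cutoff", "gui")         # user grabs the knob: GUI steals the token
active_while_held = arbiter.owner("cutoff")
arbiter.release("cutoff", "gui", fallback="automation")  # knob released
active_after_release = arbiter.owner("cutoff")
```

Passive clients would only receive change notifications; that part
is left out of the sketch.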
> I actually think having to try to *call* a function to find out
> whether it's supported or not is rather nasty...
Good point.
Anyway, I would like to minimize the number of functions.
Functions are handy when the calling conventions
of both sides are compatible. If the language you use
doesn't support this convention natively, all functions
have to be wrapped, which is a real pain.
Moreover, there are still issues with existing plug-ins that
wrongly report whether they support a given opcode (ask
the Plogue people about this issue :).
> I think this separation is a mistake. Of course, we can't have an API
> for complex monolith synths scale perfectly to modular synth units,
> but I don't see why one should explicitly prevent some scaling into
> that range. Not a major design goal, though; just something to keep
> in mind.
The separation between coarse-grained and fine-grained plug-ins
was just a design goal for the API. Fine-grained plug-ins would
raise more issues, such as the need for fast sample-by-sample
processing (for feedback loops).
> I do believe a single "in-place capable" flag would be a rather nice
> thing to have, though, as it's something you get for free with many
> FX plugins, and because it's something hosts can make use of rather
> easily if they want to. Not an incredible gain, maybe, but the
> feature costs next to nothing.
Seems OK to me.
> How about defining buffer alignment as "what works good for whatever
> extensions that are available on this hardware"...?
Good, but it requires the host to be updated when new hardware
is released. We can add a host property indicating the current
alignment so the plug-in can check it before continuing.
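As a rough illustration of that check, assuming the host exposes its
guaranteed buffer alignment as a plain integer number of bytes (the
property itself and both function names are hypothetical):

```python
def is_aligned(address, alignment):
    """True if address is a multiple of alignment (a positive power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return address & (alignment - 1) == 0


def can_use_simd_path(host_alignment, required_alignment):
    """Plug-in side check: the host guarantee must cover what the plug-in needs."""
    return (host_alignment >= required_alignment
            and host_alignment % required_alignment == 0)
```

A plug-in that needs 16-byte buffers would call
can_use_simd_path(host_alignment, 16) once at setup and fall back to
unaligned code, or refuse to continue, when the answer is False.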
> you'll get a lot of arguments about that on this list - linux people tend
> to have a 486 or pentium stashed somewhere. :)
Professional musicians and studios have recent hardware; at
least that is what survey results show us. Recent audio software
produced by the industry also tends to be hungry, requiring
fast hardware. I don't say it's good, it's just a fact, and
the industry is the main target for PTAF.
If you plan to make a standard, it's better to build it to
remain in use for years. After all, MIDI is still here and will
probably last another 20 years. OK, things are a bit different
with pure software protocols, but trading obvious programming
safety for a small performance boost is pointless to me
(sadly, the programmer's brain doesn't follow Moore's law).
> Agreed - but they also attach value to efficient software.
Yes, but in my experience with customer support, the displeasure
caused by unsafe software is one or two orders of magnitude
greater than the pleasure brought by efficiency.
> That's one way... I thought a lot about that for MAIA, but I'm not
> sure I like it. Using events becomes somewhat pointless when you
> change the rules and use another callback. Function calls are cleaner
> and easier for everyone for this kind of stuff.
The advantage of events is that you can have several of them at
once, giving the plug-in the possibility to optimise operations.
> We've just spoken of calling the process() function with a
> duration of 0 samples. Events are then processed immediately,
> and no second API is needed.
Dunno if this factorisation is really good. Different goals and
different call states: it should be a different opcode/function.
> Anyway, I'm nervous about the (supposedly real time) connection event,
> as it's not obvious that any plugin can easily handle connections in
> a real time safe manner.
Don't mix up audio configuration (which is not real-time)
and connections. In most cases I see, a connection or disconnection
is reflected in the plug-in by changing a flag or a variable,
which makes the processing slightly different. For example, take a
plug-in with a 2-in/2-out configuration. When only one input or
output channel is connected, the plug-in can make the
processing mono, or (de)activate VU meters, etc.
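A small sketch of that idea, assuming a trivial 2-in/2-out gain
plug-in. The class and method names are made up for the example; the
point is only that a (dis)connection touches a flag and allocates
nothing.

```python
class StereoGain:
    """2-in/2-out gain that falls back to mono when one input is unconnected."""

    def __init__(self, gain):
        self.gain = gain
        self.input_connected = [True, True]

    def set_input_connected(self, channel, connected):
        # Connection change: only a flag is updated, so this stays cheap.
        self.input_connected[channel] = connected

    def process(self, left, right):
        live = [buf for buf, on in zip((left, right), self.input_connected) if on]
        if len(live) == 1:
            # Mono path: process the connected input once, copy it to both outputs.
            mono = [x * self.gain for x in live[0]]
            return mono, list(mono)
        if not live:
            silence = [0.0] * len(left)
            return silence, list(silence)
        return [x * self.gain for x in left], [x * self.gain for x in right]
```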
> This is what I do in Audiality, and it's probably the way it'll be
> done in XAP:
> 1) This violates the basic principles of timestamped
> events to some extent, as you effectively have a
> single event with two timestamps, each with an
> explicit action to perform.
Right. Actually your solution is fine, efficient and simple :)
> narrowminded. What's wrong with 1.0/octave?
Nothing. I even prefer that.
> This is where it gets hairy. We prefer to think of scale tuning as
> something that is generally done by specialized plugins, or by
> whatever sends the events (host/sequencer) *before* the synths. That
> way, every synth doesn't have to implement scale support to be
> usable.
That's why Base Pitch is a float. But because MIDI is still there,
existing plug-ins will be happy to round Base Pitch (after scaling,
if it is expressed in 1/oct) to get the note number.
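A sketch of that rounding for a 1.0/octave scale. The thread does not
fix a reference point, so mapping pitch 0.0 to MIDI note 60 is purely
an assumption here, as is the function name.

```python
import math

NOTES_PER_OCTAVE = 12
REFERENCE_NOTE = 60  # assumption: Base Pitch 0.0 corresponds to MIDI note 60


def base_pitch_to_note(base_pitch):
    """Scale a 1.0/octave Base Pitch to semitones, then round to a note number.

    Returns (note, detune_cents), where detune_cents is what rounding discarded.
    """
    semitones = base_pitch * NOTES_PER_OCTAVE
    nearest = math.floor(semitones + 0.5)
    note = REFERENCE_NOTE + nearest
    detune_cents = (semitones - nearest) * 100.0
    return note, detune_cents
```

A legacy plug-in would keep only the note number; a plug-in aware of
fractional pitch could also use the remainder.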
> As it is with VST, most plugins are essentially useless to people that
> don't use 12tET, unless multiple channels + pitch bend is a
> sufficient work-around.
VST also delivers a fine-tuning value with each Note On.
> And a continuous pitch controller *could* just use a fixed value for
> Base Pitch and use Transpose as a linear pitch control, right?
Absolutely.
> The only part I don't like is assuming that Base Pitch isn't
> linear pitch, but effectively note pitch, mapped in unknown ways.
However, many (hardware/software) synths can be tuned this way. It
is a convenient way to play exotic scales using a traditional
keyboard, piano roll or other editors / performance devices.
> Right, but MIDI is integers, and the range defines the resolution.
> With floats, why have 2.0 if you can have 1.0...?
To make 1.0 the middle, default position.
> ick - if there is a balanced (or even unbalanced) spectrum of values, center
> it at 0, please :)
No. That makes sense for true bipolar data, but here values at or
close to 0 have their own meaning for parameters like velocity.
It's not exactly like volume.
> Anyway, what I'm suggesting is basically that this event should use
> the same format as the running musical time "counter" that any tempo
> or beat sync plugin would maintain. It seems easier to just count
> ticks, and then convert to beats, bars or whatever when you need
> those units.
But how do you know which bar the plug-in is on? Tempo and
time signature can change in the course of the song, and it
is not obvious how to deduce the song position from just an
absolute time reference or a number of ticks or beats.
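To illustrate the problem: the same beat count lands on a different
bar depending on the time signature history, so a plug-in cannot
recover the bar from a linear counter alone. The function below is a
sketch; its name, the (start_beat, beats_per_bar) list format and the
assumption that every signature change falls on a bar boundary are
all choices made for this example.

```python
def bar_and_beat(position_beats, signature_changes):
    """Convert a linear beat count to (bar index, beat within bar), both 0-based.

    signature_changes: non-empty list of (start_beat, beats_per_bar), sorted
    by start_beat, first entry starting at beat 0.
    """
    bar = 0
    for i, (start, per_bar) in enumerate(signature_changes):
        last = i + 1 == len(signature_changes)
        end = float("inf") if last else signature_changes[i + 1][0]
        if position_beats < end:
            bars_in, beat = divmod(position_beats - start, per_bar)
            return bar + int(bars_in), beat
        # Whole section lies before the position: count its bars and move on.
        bar += int((end - start) // per_bar)


# Same linear position, two different signature histories.
all_four_four = bar_and_beat(14, [(0, 4)])
three_then_four = bar_and_beat(14, [(0, 3), (6, 4)])
```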
> Yes. And bars and beats is just *one* valid unit for musical time.
> There's also SMPTE, hours/minutes/seconds, HDR audio time and other
> stuff. Why bars and beats of all these, rather than just some linear,
> single value representation?
> Could be seconds or ticks, but the latter make more sense as it's
> locked to the song even if the tempo is changed.
For the reasons mentioned above, both the time in seconds and a
musical representation are needed. SMPTE etc. can be deduced from
the absolute time in seconds or samples.
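For instance, a non-drop-frame SMPTE timecode follows directly from
the absolute time. This is a sketch assuming an integer frame rate;
drop-frame rates such as 29.97 fps need extra handling that is not
shown, and the function name is hypothetical.

```python
def seconds_to_smpte(seconds, fps=25):
    """Return (hours, minutes, seconds, frames) for a non-drop-frame timecode."""
    total_frames = int(seconds * fps)  # truncate to the frame in progress
    frames = total_frames % fps
    total_seconds = total_frames // fps
    return (total_seconds // 3600,
            (total_seconds // 60) % 60,
            total_seconds % 60,
            frames)
```

Starting from a sample count, the same function applies after
dividing by the sample rate.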
> Accuracy is *not* infinite. In fact, you'll have fractions with more
> decimals than float or double can handle even in lots of hard
> quantized music, and the tick subdivisions used by most sequencers
> won't produce "clean" float values for much more than a few values.
But is that so important? Float or int, we manipulate numbers
with finite precision. And if someone is happy with tick
scales that are multiples of 3, someone else will complain about
multiples of 19 not being supported, etc.
As for the drifting question, drift is negligible with double, and
positions can easily be resync'd if required. Ohm Force's plug-ins
use doubles for the positions and steps of the LFO phases, and this
works fine without resync for hours, on any fraction of a beat.
If you need exact subdivisions, you can still convert the
floating point value into the nearest integer tick.
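A quick way to see both points, as a sketch: accumulate a step of one
third of a beat (which has no exact binary representation) in double
precision, measure the drift, then snap to the nearest tick. The
960 ticks per beat resolution and the function names are assumptions
for the example.

```python
def advance(position, step, count):
    """Accumulate a position the way a phase counter would, one step at a time."""
    for _ in range(count):
        position += step
    return position


def nearest_tick(position_beats, ticks_per_beat=960):
    """Convert a floating point beat position to the nearest integer tick."""
    return round(position_beats * ticks_per_beat)


STEP = 1.0 / 3.0   # a third of a beat: not exactly representable in binary
STEPS = 300_000    # 100,000 beats, roughly 14 hours at 120 BPM

position = advance(0.0, STEP, STEPS)
drift = position - STEPS / 3.0   # accumulated error, in beats
tick = nearest_tick(position)
```

Python floats are IEEE doubles, so the accumulated error here is the
same kind a C or C++ double accumulator would show.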
> The host would of course know the range of each control, so the only
> *real* issue is that natural control values mean more work for hosts.
No, it would just store and manipulate normalized values.
Natural values would be needed only for parameter connections
between plug-ins, if the user chooses to connect natural values.
> What's wrong with text and/or raw data controls?
They often require a variable amount of memory, making
them difficult to manage in the real-time thread.
However there is a need for this kind of stuff for sure.
I was thinking more of encapsulating them in specific
events.
> 1) none - host can autogenerate from hints
> 2) layout - plugin provides XML or something suggesting its UI, host draws
> it
> 3) graphics+layout - plugin provides XML or something as well as graphics -
> host is responsible for animating according to plugin spec
> 4) total - plugin provides binary UI code (possibly bytecode or lib calls)
I agree with 1) and 4). In my experience, when a plug-in
has 4), it also has 1), but that's not *always* the case
because developers don't feel it necessary to implement the
hints correctly (dev time matters). More elaborate UI
descriptions are likely not to be supported by either host
or plug-in, making them useless. A good solution would
be to use portable UI libs, as someone suggested here,
but everyone knows that's the Holy Grail.
> The spec should not dictate ABI, either. The ABI is an artifact of the
> platform. If my platform uses 32 registers to pass C function arguments, it
> is already binary incompatible with your PC :)
The only way to guarantee real portability across platforms
and programming languages is to describe the API at the byte
level. This includes the calling convention. IMHO we can assume
that every target system has a stack, and can pass parameters
on it.
> I don't know if it is useful to spec UTF8
Well, there are many different languages in the world, and the ASCII
character set is good for US people but doesn't make sense for
Japanese, Korean, Indian, Chinese, Arabic, etc. It is also
limiting for many European languages (loss of accents or
character decorations).
If you want to generate only ASCII in your program, just write
only ASCII; it is valid UTF-8 as is. However, if you have
to read strings, it's better to be aware that they are UTF-8,
because incoming strings may contain multi-byte characters.
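A short sketch of both sides of that remark: ASCII text encodes to
the same bytes in UTF-8, while a reader has to respect multi-byte
sequences, for example when truncating a label to a byte budget. The
helper names are hypothetical and the input is assumed to be valid
UTF-8.

```python
def byte_and_char_counts(raw):
    """Return (number of bytes, number of characters) of a UTF-8 byte string."""
    return len(raw), len(raw.decode("utf-8"))


def truncate_utf8(raw, max_bytes):
    """Cut a UTF-8 byte string to at most max_bytes without splitting a character."""
    if len(raw) <= max_bytes:
        return raw
    cut = max_bytes
    # Continuation bytes look like 10xxxxxx; back up to the start of the character.
    while cut > 0 and (raw[cut] & 0xC0) == 0x80:
        cut -= 1
    return raw[:cut]


ascii_label = "Cutoff"
accented_label = "Fr\u00e9quence".encode("utf-8")  # contains one 2-byte character

same_bytes = ascii_label.encode("ascii") == ascii_label.encode("utf-8")
counts = byte_and_char_counts(accented_label)
```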
> I don't like having to call normalize functions for every
> translation, though. Can the translation overhead be minimized?
It's not a big overhead if done efficiently. VST has worked this
way for years, and people don't complain about this aspect at
all. The host just needs to store and pass normalized parameters,
so the overhead is not there. It even makes things much simpler
because data is always a float in [0;1], which wouldn't be the
case with natural or multi-type parameter handling.
On the plug-in side, the overhead depends only on the conversion
formula, and it's possible to make it very fast (segmented
polynomial mapping, for example).
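One possible reading of a segmented mapping, as a sketch: the
plug-in precomputes a small table and converts a normalized value
with first-degree polynomial segments instead of calling a power
function each time. The 20 Hz to 20 kHz range, the 16 segments and
the names are assumptions for the example, not something specified
in this thread.

```python
SEGMENTS = 16
F_MIN, F_MAX = 20.0, 20000.0

# Table of the exact exponential mapping at the segment boundaries.
TABLE = [F_MIN * (F_MAX / F_MIN) ** (i / SEGMENTS) for i in range(SEGMENTS + 1)]


def normalized_to_hz(x):
    """Map a normalized parameter in [0, 1] to Hz using linear segments."""
    x = min(max(x, 0.0), 1.0)
    scaled = x * SEGMENTS
    index = min(int(scaled), SEGMENTS - 1)
    frac = scaled - index
    return TABLE[index] + frac * (TABLE[index + 1] - TABLE[index])
```

With these settings the result stays within a few percent of the
exact exponential curve; more segments or higher-degree segments
would tighten that at a small cost.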
-- Laurent
==================================+========================
Laurent de Soras | Ohm Force
DSP developer & Software designer | Digital Audio Software
mailto:laurent@ohmforce.com | http://www.ohmforce.com
==================================+========================
This archive was generated by hypermail 2b28 : Thu Feb 06 2003 - 13:59:58 EET