Re: [linux-audio-dev] Plugin APIs (again)


Subject: Re: [linux-audio-dev] Plugin APIs (again)
From: David Olofson (david_AT_olofson.net)
Date: Thu Dec 05 2002 - 01:35:18 EET


On Wednesday 04 December 2002 22.33, Tim Hockin wrote:
> > > I disagree with that - this is a waste of DSP cycles processing
> > > to be sent nowhere.
> >
> > So, why would you ask the plugin to set up outputs that you won't
> > connect, and then force the plugin to have another conditional to
> > check whether the output is connected or not?
>
> This confuses me. A plugin says it can handle 1-6 channels. The
> host only connects 2 channels. The plugin loops for i = 0 to i =
> me->nchannels. There isn't any checking.

If you have a "group" of channels, and just want to skip one in the
middle, that won't work.
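
That is, with a plain 0..nchannels loop you can only drop ports off the
*end* of the group; dropping one in the middle means a test per port,
or an extra indirection. A rough sketch in C (invented names, not from
any real API):

    /* Sketch: a plugin that just loops over a contiguous range of
     * connected channels.
     */
    #include <stddef.h>

    typedef struct {
        int     nchannels;  /* channels the host connected */
        float **outs;       /* one output buffer per channel */
    } plugin_t;

    /* The easy case: the host connects the first N channels. */
    static void process_contiguous(plugin_t *p, size_t frames)
    {
        for (int c = 0; c < p->nchannels; ++c)
            for (size_t s = 0; s < frames; ++s)
                p->outs[c][s] = 0.0f;   /* real rendering here */
    }

    /* Skipping a channel in the middle of the group means a test
     * per port (or an index table) - the plain loop won't do.
     */
    static void process_sparse(plugin_t *p, size_t frames)
    {
        for (int c = 0; c < p->nchannels; ++c) {
            if (!p->outs[c])            /* disconnected port */
                continue;
            for (size_t s = 0; s < frames; ++s)
                p->outs[c][s] = 0.0f;   /* real rendering here */
        }
    }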

> If the plugin says it can
> handle 2-6 channels and the host only connects 1, it is an error.
> Connect at least the minimum, up to the maximum. In typing this,
> I've seen that discontiguous connections do, in fact, require
> conditionals.

Yes, that's exactly what I'm thinking about.

> Maybe it is safe to say you have to connect ports in
> order?

Safe, but it wouldn't be sufficient. Consider my mono->5.1 example.

> > I would propose that the pre-instantiation host/plugin
> > "negotiations" including:
> >
> > * A way for the host to tell the plugin how many ports of
> > each type it wants for a particular instance of the plugin.
>
> This is exactly what I'm talking about with the connect methods.
> Before we go into PLAY mode, we ask for a certain number of
> channels.
>
> > * A way for the host to *ask* the plugin to disable certain
> > ports if possible, so they can be left disconnected.
>
> hmm, this is interesting, but now we're adding the conditional

Well, the point is that the conditional doesn't have to end up in the
inner loop of the plugin. The host could throw in a silent or garbage
buffer, if the plugin coder decides it's too hairy to implement the
plugin in a different way.

Then again, the plugin could use a private array of buffer pointers,
and throw in silence/garbage buffers itself. The buffers should still
be supplied by the host, though, to reduce memory use and cache
thrashing. (This is hopefully just a "rare" special case, but
anyway...)
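
To be concrete, something like this (just a sketch, with invented
names; a real API would look different):

    /* Sketch: the plugin keeps a private array of buffer pointers and
     * substitutes a host supplied dump buffer for any disconnected
     * output, so the inner DSP loop stays free of conditionals.
     */
    #include <stddef.h>

    typedef struct {
        int     nouts;
        float **host_outs;  /* from the host; may contain NULLs */
        float  *garbage;    /* host supplied silence/dump buffer */
        float **outs;       /* private, always valid pointers */
    } plugin_t;

    static void process(plugin_t *p, size_t frames)
    {
        /* Once per block: patch up the pointer array. */
        for (int i = 0; i < p->nouts; ++i)
            p->outs[i] = p->host_outs[i] ? p->host_outs[i] : p->garbage;

        /* The DSP loop never checks anything. */
        for (int i = 0; i < p->nouts; ++i)
            for (size_t s = 0; s < frames; ++s)
                p->outs[i][s] = 0.0f;   /* real DSP here */
    }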

> > plugin with two 1D, contiguous arrays (although possibly with
> > some ports disabled, if the plugin supports it); one for inputs
> > and one for outputs. That will simplify the low level/DSP code,
> > and I think
>
> Yes, I've come around to this. The question in my mind is now
> about disabling (or just not connecting) some ports.
>
> > Now, if the plugin didn't support DisableSingle on the output
> > ports of type Out;5.1, you'd have to accept getting all 6 outs,
> > and just route the bass and center channels to "/dev/null". It
> > should be easy enough for the host, and it could simplify and/or
> > speed up the average case (all outputs used, assumed) of the
> > plugin a bit, since there's no need for conditionals in the inner
> > loop, mixing one buffer for each output at a time, or having 63
> > (!) different versions of the mixing loop.
>
> ok, I see now. If the plugin supports disabling, the host can use
> it. If the plugin is faster to assume all ports connected, it does
> that instead.

Yes, that's the idea. And for the host, these are decisions made when
building the net, so it doesn't matter performance wise. It's code
that needs to be there, indeed - but it's simple and generic enough
that it could go in the host SDK. (Like that state changing function
of Audiality, for example; it comes with the plugin API.)

> I think I rather like that.

Good - then it might not be total nonsense. :-)

> > think it's a bad idea to *require* that plugins support it.
>
> This is key, again, you've convinced me.
>
> > strongly prefer working with individual mono waveforms, each on a
> > voice of their own, as this offers much more flexibility. (And
> > it's also a helluva' lot easier to implement a sampler that way!
> > :-)
>
> just so we're clear, 'voice' in your terminology == 'channel' in
> mine?

Well... If your definition of channel is like in the (classic)
tracker days, yes. What I call a voice is what plays a single
waveform in a synth or sampler. Depending on the design, it may only
have waveform and pitch controls - or it may include filters,
envelope generators, LFOs, distortion, panning and whatnot.

In fact, a voice could theoretically even combine multiple waveforms,
but that borders on something I'd call a "voice structure" - and that
should probably be published as multiple voices in a voice oriented
API.

In Audiality, however, a 'channel' doesn't have a fixed relation to
audio processing. It's basically like a channel in MIDI speak, and
when dealing with notes, you're really dealing with *notes* - not
voices. (Remember the CoffeeBrewer patch? ;-)

Anyway, my original comment was really about synth/sampler
programming, where I prefer to construct stereo sounds from multiple
mono samples (each sample on its own voice), as opposed to working
with voices that are capable of playing stereo waveforms.

That said, Audiality supports stereo voices. Don't know if I'll keep
that feature, though. It's a performance hack for sound effects in
games, mostly, and at some point, I'll probably have to sacrifice
some low end scalability for the high end.

> > ...provided there is a guarantee that there is a buffer for the
> > port. Or you'll segfault unless you check every port before
> > messing with it. :-)
>
> Do we need to provide a buffer for ports that are disabled?

Yes, if the plugin says it can't deal with disconnected ports. If it
says it can, it's supposed to check the pointers at some point during
the process() call - preferably once, before the event/DSP loop.

As to ports that are outside the number of ports requested by the
host; well those are outside the loops, and simply don't exist.

[...]
> > > { "left(4):mono(4)" }, { "right(4)" },
> >
> > Does this mean the plugin is supposed to understand that you want
> > a "mono mix" if you only connect the left output?
>
> If the host connects this pad to a mono effect, it knows that the
> 'left' channel is also named 'mono'. I do not expect the plugin to
> mono-ize a stereo sample

Ok.

> (though it can if it feels clever).

This makes me nervous. :-)

Ok; it's rather handy that some synths automatically transform the
left line output into a mono output if the right output is not
connected (the JV-1080 does that, IIRC) - but when it comes to
software, it's not like you need to drag in a mixer and cables to
implement that outside the machine. I'd really rather not have
plugins do all sorts of "smart" things without explicitly being asked
to.

[...]
> > > * note_on returns an int voice-id
> > > * that voice-id is used by the host for note_off() or
> > > note_ctrl()
> >
> > That's the way I do it in Audiality - but it doesn't mix well
> > with timestamped events, not even within the context of the RT
> > engine core.
>
> how so - it seems if you want to send a voice-specific event, you'd
> need this

No, there are other ways. All you really need is a unique ID for each
voice, for addressing per-voice.

The simple and obvious way is to just get the voice number from the
voice allocator. The problem with that is that you need to keep track
of whether or not that voice still belongs to you before trying to
talk to it. In Audiality, I do that by allowing patch plugins (the
units that drive one or more voices based on input from a 'channel')
to "mark" voices they allocate with an ID - which is usually just the
channel number.
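
In code, that marking business is roughly this (a sketch, not the
actual Audiality source):

    /* Sketch: voices tagged with an owner ID when allocated, so a
     * patch plugin can find "its" voices again later.
     */
    #define NVOICES  64
    #define TAG_FREE (-1)

    typedef struct {
        int   tag;      /* usually the channel number */
        float pitch;
        /* ...oscillator, envelopes, etc... */
    } voice_t;

    static voice_t voices[NVOICES];

    static void voices_init(void)
    {
        for (int i = 0; i < NVOICES; ++i)
            voices[i].tag = TAG_FREE;
    }

    static int voice_alloc(int tag)
    {
        for (int i = 0; i < NVOICES; ++i)
            if (voices[i].tag == TAG_FREE) {
                voices[i].tag = tag;
                return i;
            }
        return -1;      /* nothing free; steal or give up */
    }

    /* A channel control change hits every voice we still own. */
    static void channel_set_pitch(int channel, float pitch)
    {
        for (int i = 0; i < NVOICES; ++i)
            if (voices[i].tag == channel)
                voices[i].pitch = pitch;
    }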

That's not perfect, though: If you change the patch on that channel
while you have old notes hanging, you either have to kill those notes
first (JV-1080 style - not good), or you have to keep the old patch
around, so it can control its voices until they die. The latter is
what I'm trying to do in Audiality, but it requires that patches can
recognize their own voices *even* if there are other patches still
working on the same channel.

I could mark voices with *both* channel and patch numbers, but I have
a feeling that would only work until I discover *another* way that
patches could lose track of their voices. A cleaner and more generic
solution is needed. (Especially for a similar system intended for use
in a public plugin API!)

...And then there's still this roundtrip issue, of course.

So, what I'm considering for Audiality is 'Virtual Voice Handles'.
When a patch is initialized, it asks the host for a number of these
(contiguous range), which will then function as the "virtual voice
reserve" for the patch. When you want to allocate a voice, you send a
VVH with the "request", and just assume that you got one. From then
on, when you do *anything* with a voice, you reference it through the
VVH, just as if it was a raw voice index.

If you don't get a voice, or it's stolen, it's no big deal. The synth
will just ignore anything you say about that VVH. If you ask
(roundtrip...), the host will tell you whether or not you have a
voice - but the point is you don't *have* to wait for the reply.
Worst thing that can happen is that you talk to an object that no
longer exists, and (for a change!) that's totally ok, since there's
no chance of anything else listening to that VVH.

Besides, a bonus with a VVH system is that the synth may implement
voice "unstealing" and other fancy stuff. If a VVH loses it's voice,
it can just get another voice later on. The synth would indeed have
to keep track of phase and stuff for all active virtual voices, but
that's most certainly a lot cheaper than actually *mixing* them.
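
To make that a bit more concrete, here's roughly what I have in mind.
Entirely hypothetical, of course - none of it is implemented yet:

    /* Sketch: Virtual Voice Handles. The synth maps each VVH to a real
     * voice index, or to -1 if allocation failed or the voice was
     * stolen. Callers never have to check; events addressed to a dead
     * VVH are simply ignored. voice_alloc()/voice_control() stand in
     * for the real voice layer.
     */
    #define MAX_VVH 256

    extern int  voice_alloc(void);                  /* real allocator */
    extern void voice_control(int v, int ctrl, float value);

    static int vvh_table[MAX_VVH];                  /* VVH -> voice or -1 */
    static int vvh_next = 0;

    /* Hand a patch a contiguous range of handles at init time. */
    static int vvh_reserve(int count)
    {
        if (vvh_next + count > MAX_VVH)
            return -1;
        int first = vvh_next;
        vvh_next += count;
        for (int i = 0; i < count; ++i)
            vvh_table[first + i] = -1;
        return first;
    }

    /* "Give me a voice on this handle" - may silently fail. */
    static void vvh_note_on(int vvh)
    {
        vvh_table[vvh] = voice_alloc();             /* -1 if none free */
    }

    /* Later control changes just vanish if the voice is gone. */
    static void vvh_control(int vvh, int ctrl, float value)
    {
        if (vvh_table[vvh] >= 0)
            voice_control(vvh_table[vvh], ctrl, value);
    }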

[...]
> > Besides, VSTi has it. DXi has it. I bet TDM has it. I'm sure all
> > major digital audio editing systems (s/w or h/w) have it. Sample
> > accurate timing. I guess there is a reason. (Or: It's not just
> > me! :-)
>
> yeah, VSTi also has MIDI - need I say more?

Well, yes - and VST 2.0 also added quite a few API features that
overlap with VST 1.0 features, to confuse developers. The current
version is far from clean and simple when you look into the details.

I'm not saying that VST is the perfect API (in fact, I don't like it
very much at all), but looking at the *feature set*, I think it's a
very good model for what is needed for serious audio synthesis and
processing these days.

Also note that VST 3.0 seems to be in development, which means we
shouldn't consider VST 2.0 the perfect do-it-all design. It has
flaws, and it lacks features that a significant number of users want
or need. (I would be very interested in a complete list, BTW! :-)

> I'm becoming
> convinced, though.

Well, if you could find Benno (author of EVO; the Linux
direct-from-disk sampler), he could probably demonstrate the
performance advantages of timestamped event systems as well. :-)

As to Audiality, switching to timestamped events didn't slow anything
down, but it did provide sample accurate timing. As a result, it also
decouples timing accuracy from system buffer size, which means that
envelopes, fades and stuff sound *exactly* the same regardless of
engine settings.
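
The basic mechanism is really simple: every event carries a timestamp
in sample frames, and process() splits the block at those timestamps.
A sketch (invented names):

    /* Sketch: timestamped events. Control changes land on the exact
     * frame no matter what the engine buffer size is.
     */
    #include <stddef.h>

    typedef struct event {
        unsigned      frame;    /* offset into this block */
        int           ctrl;
        float         value;
        struct event *next;     /* queue sorted by frame */
    } event_t;

    typedef struct {
        float    gain;
        event_t *events;
    } plugin_t;

    static void run_dsp(plugin_t *p, float *out, unsigned from, unsigned to)
    {
        for (unsigned s = from; s < to; ++s)
            out[s] *= p->gain;
    }

    static void process(plugin_t *p, float *out, unsigned frames)
    {
        unsigned pos = 0;
        for (event_t *e = p->events; e; e = e->next) {
            run_dsp(p, out, pos, e->frame); /* audio up to the event */
            if (e->ctrl == 0)               /* say, 0 == "gain" */
                p->gain = e->value;         /* applied exactly here */
            pos = e->frame;
        }
        run_dsp(p, out, pos, frames);       /* rest of the block */
    }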

BTW, another hint as to why sample accurate timing is critical: the
envelope generators drive their voices through the event system. This
simply does not work without consistent and accurate timing - and my
experiences with some h/w synths have convinced me that sample
accurate timing is the *minimum* for serious sound programming. If
you don't have it, you'll have to mess with destructive waveform
editing instead. (Consider attacks of percussion instruments and the
like. You can't even program an analog style bass drum on a synth
with flaky control timing, let alone higher pitched and faster
sounds.)

> > What kind of knobs need to be ints? And what range/resolution
> > should they have...? You don't have to decide if you use floats.
>
> They should have the same range as floats - whatever their control
> struct dictates.

Of course - but as a designer, how do you decide on a "suitable"
resolution for what is effectively a fixed point control? (Unless
it's *really* an integer control, of course.)

> > > I'd assume a violin modeller would have a BOWSPEED control.
> > > The note_on() would tell it what the eventual pitch would be.
> > > The plugin would use BOWSPEED to model the attack.
> >
> > Then how do you control pitch continously? ;-)
>
> with a per-voice pitchbend

Yeah - but why not just ditch note pitch and bend, and go for
per-voice continuous pitch? There's no need for sending the arbitrary
parameter "pitch" with every note_on event, especially since some
instruments may not care about it at all.
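
That is, pitch would just be another continuous per-voice control, set
before the voice starts and changed whenever you like afterwards. A
sketch (names made up):

    /* Sketch: pitch as an ordinary continuous per-voice control,
     * rather than a note_on(pitch) argument plus a separate bend
     * control. voice_control()/voice_start() stand in for the real
     * voice layer.
     */
    enum { VC_PITCH, VC_VELOCITY, VC_BOWSPEED /* , ... */ };

    extern void voice_control(int voice, int ctrl, float value);
    extern void voice_start(int voice);

    static void play_note(int voice, float pitch, float velocity)
    {
        /* Pitch is nothing special: set it before starting the
         * voice, and change it whenever you like afterwards.
         */
        voice_control(voice, VC_PITCH, pitch);
        voice_control(voice, VC_VELOCITY, velocity);
        voice_start(voice);
    }

    static void glide_to(int voice, float new_pitch)
    {
        voice_control(voice, VC_PITCH, new_pitch);  /* continuous */
    }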

> > Some controls may not be possible to change in real time context
> > - but I still think it makes sense to use the control API for
> > things like that.
>
> I don't know if I like the idea of controls being flagged RT vs
> NONRT, but maybe it is necessary.

I'm afraid it is. If you look at delays, reverbs and things like
that, you basically have three options:

        1) realloc() the buffers when certain parameters
           change, or

        2) decide on an absolute maximum buffer size, and always
           allocate that during instantiation, or

        3) realloc() the buffers when a NONRT "max_delay" or
           similar parameter is changed.

1 is out, since it can't work in real time systems. (Well, not
without a real time memory manager, at least.) 2 "works", but is very
restrictive, and not very friendly.

3 works and is relatively clean and simple - but it requires an
"extra" interface for setting NONRT parameters. I would definitely
prefer this to be basically the same as the normal control interface,
as the only significant difference is in which context the control
changes are executed.
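
For a delay, option 3 would look something like this (sketch only;
invented names):

    /* Sketch: a NONRT "max_delay" control that realloc()s the buffer
     * outside real time context, plus an RT "delay" control that only
     * moves the tap within the already allocated buffer.
     */
    #include <stdlib.h>

    typedef struct {
        float   *buffer;        /* starts out NULL */
        unsigned buffer_frames; /* allocated size */
        unsigned delay_frames;  /* current delay; RT controllable */
    } delay_t;

    /* NONRT control: only ever called from non real time context. */
    static int delay_set_max(delay_t *d, unsigned max_frames)
    {
        float *nb = realloc(d->buffer, max_frames * sizeof *nb);
        if (!nb)
            return -1;          /* old buffer still valid */
        d->buffer = nb;
        d->buffer_frames = max_frames;
        if (d->delay_frames > max_frames)
            d->delay_frames = max_frames;
        return 0;
    }

    /* RT control: safe from process(); no allocation, just clamping. */
    static void delay_set_time(delay_t *d, unsigned frames)
    {
        if (frames > d->buffer_frames)
            frames = d->buffer_frames;
        d->delay_frames = frames;
    }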

> Or maybe it's not, and a user
> who changes a sample in real time can expect a glitch.

So you're not supposed to be able to implement delays that can take
modulation of the delay parameters, and still cope with arbitrary
delay lengths? (Just an example; this has been discussed before, so I
guess some people have more - and real - examples.)

> > An Algorithmically Generated Waveform script...?
>
> But what I don't get is: who loads the data into the control?

Nor do I. Haven't decided yet. :-)

> a) host will call deserialize() with a string or other standard
> format
> b) plugin will load it from a file, in which case host
> passes the filename to the control
> c) host loads a chunk of arbitrary data which it read from the
> plugin before saving/restoring - in which case how did it get there
> in the first place? (see a or b)

The problem you mention in c) is always present, actually, and this
has been discussed before: It's about presets and defaults.

Do plugins have their own hardcoded defaults, or is the default just
another preset? And where are presets stored?

I'd say that the most sensible thing is that the host worries about
presets. Plugins already have an interface for getting/setting
controls, so why should they have another? Implement it in the hosts
- or even once and for all in the host SDK - and be done with it.

As to where the elusive "arbitrary data" goes, I've said I'm leaning
towards external files via paths in string controls marked as
FILE_<something>. I still think that makes sense, especially
considering that these files of "arbitrary data" might be room
impulse responses for convolution, or even bigger things.
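
Roughly what I mean - completely hypothetical; none of these names are
from a real API:

    /* Sketch: "arbitrary data" handled as a file path in a string
     * control, with a hint telling the host what kind of file it is.
     */
    #define CTRL_HINT_FILE_AUDIO   0x0001  /* path to an audio file */
    #define CTRL_HINT_FILE_SCRIPT  0x0002  /* path to a script */

    typedef struct {
        const char *name;
        int         hints;
    } string_control_desc_t;

    static const string_control_desc_t convolver_controls[] = {
        { "impulse_response", CTRL_HINT_FILE_AUDIO },
        { "post_script",      CTRL_HINT_FILE_SCRIPT },
    };

    /* The host sees the hint, pops up a file selector, and hands the
     * chosen path back as the control's string value; the plugin
     * loads the data itself, in NONRT context.
     */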

> > Well, then I guess you'll need the "raw data block" type after
> > all, since advanced synth plugins will have a lot of input data
> > that cannot be expressed as one or more "normal" controls in any
> > sane way.
>
> Such as?

Impulse responses, compressed audio data, raw audio data, scripts,...

> Where does this data come from in the first place?

From wherever we decide to keep defaults and presets. I would suggest
that plugins should have "safe defaults" built-in (to prevent them
from crashing if the default preset is missing, at least), and that
presets are stored on disk, in some sort of database managed by the
host SDK or something. When you install a plugin, its presets would
be added to this database.

> > Just as with callback models, that depends entirely on the API
> > and the plugin implementation. AFAIK, DXi has "ramp events". The
> > Audiality synth has linear ramp events for output/send levels.
>
> So does Apple Audio Units. I am starting to like the idea..

Well, I had my doubts at first, but it seems to work pretty well with
just "supposedly linear ramping". The big advantage (apart from not
having to send one change event per sample, or using arbitrary
control filters in all plugins) is that plugins can implement the
ramping in any way they like internally. That is, if it takes heavy
calculations to transform the control value into the internal
coefficients you need for the DSP code, you may interpolate the
coefficients instead. How is up to you - the host just expects you to
do something that sounds reasonably linear to the user.
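
For example, something like this (a sketch with invented names) - the
heavy mapping is done once per ramp event, and the derived coefficient
is what actually gets interpolated:

    /* Sketch: a linear ramp event. The host says "reach target T over
     * D frames"; how the plugin gets there is its own business.
     */
    typedef struct {
        float    target;        /* control value to reach */
        unsigned duration;      /* frames to get there; > 0 assumed */
    } ramp_event_t;

    typedef struct {
        float    value;         /* user visible control value */
        float    coeff;         /* derived DSP coefficient */
        float    coeff_step;    /* per frame increment */
        unsigned frames_left;
    } ramped_control_t;

    extern float expensive_mapping(float value);    /* e.g. Hz -> coeff */

    static void ramp_start(ramped_control_t *c, const ramp_event_t *e)
    {
        float target_coeff = expensive_mapping(e->target);
        c->coeff_step  = (target_coeff - c->coeff) / (float)e->duration;
        c->frames_left = e->duration;
        c->value       = e->target;
    }

    static void ramp_run(ramped_control_t *c, unsigned frames)
    {
        while (frames-- && c->frames_left) {
            c->coeff += c->coeff_step;
            --c->frames_left;
        }
    }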

> > > Audiality, but if we're designing the same thing, why aren't we
> > > working on the same project?
> >
> > Well, that's the problem with Free/Open Source in general, I
> > think. The ones who care want to roll their own, and the ones
> > that don't care... well, they don't care, unless someone throws
> > something nice and ready to use at them.
> >
> > As to Audiality, that basically came to be "by accident". It
> > started
>
> Interesting how it came about, but why are you helping me turn my
> API into yours, instead of letting me work on yours? Just curious.
> I do like to roll my own, but I don't want to waste time..

Well, part of the answer to that is that I'm still interested in
ideas - but of course, having a working implementation of some of my
ideas probably won't hurt the discussion!

It seems that I got stuck in this thread instead of releasing the
code... :-)

I have some space trouble on the site. I'll try to deal with it and
put the whole package on-line tonight.

//David Olofson - Programmer, Composer, Open Source Advocate

.- Coming soon from VaporWare Inc...------------------------.
| The Return of Audiality! Real, working software. Really! |
| Real time and off-line synthesis, scripting, MIDI, LGPL...|
`-----------------------------------> (Public Release RSN) -'
.- M A I A -------------------------------------------------.
| The Multimedia Application Integration Architecture |
`----------------------------> http://www.linuxdj.com/maia -'
   --- http://olofson.net --- http://www.reologica.se ---


