Re: [linux-audio-dev] Catching up with XAP

Subject: Re: [linux-audio-dev] Catching up with XAP
From: David Olofson (david_AT_olofson.net)
Date: Wed Jan 15 2003 - 14:07:30 EET


On Wednesday 15 January 2003 10.42, Tim Hockin wrote:
> > [Lost touch with the list, so I'm trying to catch up here... I
> > did notice that gardena.net is gone - but I forgot that I was
> > using david_AT_gardena.net for this list! *heh*]
>
> Woops! Welcome back!

Well, thanks. :-)

[...]
> > The easiest way is to just make one event the "trigger", but I'm
> > not sure it's the right thing to do. What if you have more than
> > one control of this sort, and the "trigger" is actually a product
> > of both? Maybe just assume that synths will use the standardized
>
> The trigger is a virtual control which really just says whether the
> voice is on or not. You set up all your init-latched controls in
> the init window, THEN you set the voice on.
>
> It is conceptually simple, similar to what people know and it fits
> well enough. And I can't find any problems with it technically.

The only problem I have with it is that it's completely irrelevant to
continuous control synths - but they can just ignore it, or not have
the control at all.

> > > And the NOTE/VOICE starter is a voice-control, so any
> > > Instrument MUST have that.
> >
> > This is very "anti modular synth". NOTE/VOICE/GATE is a control
> > type hint. I see no reason to imply that it can only be used for
> > a certain kind of controls, since it's really just a "name" used
> > by users and/or hosts to match ins and outs.
>
> This is not at all what I see as intuitive. VOICE is a separate
> control used ONLY for voice control. Instruments have it. Effects
> do not.

There's this distinct FX vs instrument separation again. What is the
actual motivation for enforcing that these are kept totally separate?

I don't see the separation as very intuitive at all. The only
differences are that voices are (sort of) dynamically allocated, and
that they have an extra dimension of addressing - and that applies
*only* to polyphonic synths. For mono synths, a Channel is equivalent
to a Voice for all practical purposes.

> > About VVID management:
> > Since mono synths won't need VVIDs, host shouldn't have to
> > allocate any for them. (That would be a waste of resources.)
> > The last case also indicates a handy shortcut you can take
> > if you *know* that VVIDs won't be considered. Thus, I'd
> > suggest that plugins can indicate that they won't use VVIDs.
>
> This is a possible optimization. I'll add it to my notes. It may
> really not be worth it at all.

It's also totally optional. If you don't care to check the hint, just
always use real VVIDs with Voice Controls, and never connect Channel
Control outs to Voice Control ins, and everything will work fine.
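
(Purely as illustration - none of these names are settled API - the
host side check could be as simple as this:

	// Hypothetical hint: plugin says it ignores VVIDs entirely
	if (plugin->hints & HINT_NO_VVIDS)
		vvid = DUMMY_VVID;		// anything; receiver won't look
	else
		vvid = host_alloc_vvid();	// real, tracked VVID

One branch, taken once per context, so supporting the hint costs the
host next to nothing.)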

[...]
> > What might be confusing things is that I don't consider "voice"
> > and "context" equivalent - and VVIDs refer to *contexts* rather
> > than voices. There will generally be either zero or one voice
> > connected to a context, but the same context may be used to play
> > several notes.
>
> I disagree - a VVID refers to a voice at some point in time. A
> context can not be re-used. Once a voice is stopped and the
> release has ended, that VVID has expired.

Why? Is there a good reason why a synth must not be allowed to
function like the good old SID envelope generator, which can be
switched on and off as desired?

Also, remember that there is nothing binding two notes at the same
pitch together with our protocol, since (unlike MIDI) VVID != pitch.
This means that a synth cannot reliably handle a new note starting
before the release phase of a previous note at the same pitch has
ended. It'll just
have to allocate a new voice, completely independent of the old
voice, and that's generally *not* what you want if you're trying to
emulate real instruments.

For example, if you're playing the piano with the sustain pedal down,
hitting the same key repeatedly doesn't really add new strings for
that note, does it...?

With MIDI, this is obvious, since VVID == note pitch. It's not that
easy with our protocol, and I don't think it's a good idea to turn a
vital feature like this into something that synths will have to
implement through arbitrary hacks, based on the PITCH control. (Hacks
that may not work at all, unless the synth is aware of which scale
you're using, BTW.)
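
To illustrate, in the same pseudocode style as further down in this
mail (the control names are just examples):

	// Two notes on the same "string", sharing one VVID/context:
	send(ALLOC_VVID, my_vvid)
	send(CONTROL, PITCH, my_vvid, <pitch>)
	send(CONTROL, VOICE, my_vvid, 1)	// first note on
	...
	send(CONTROL, VOICE, my_vvid, 0)	// note off; release starts
	...
	// Before the release has died out:
	send(CONTROL, VOICE, my_vvid, 1)	// retrigger the *same* context

The synth can then decide to reuse the old voice (piano style), while
a sender that doesn't care just grabs a fresh VVID for every note.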

> > > No. It means I want the sound on this voice to stop. It implies
> > > the above, too. After a VOICE_OFF, no more events will be sent
> > > for this VVID.
> >
> > That just won't work. You don't want continuous pitch and stuff to
> > work except when the note is on?
>
> More or less, yes! If you want sound, you should tell the synth
> that by allocating a VVID for it, and turning it on.

And when you enter the release phase? I have yet to see a MIDI synth
where voices stop responding to pitch bend and other controls after
NoteOff, and although we're talking about *voice* controls here, I
think the same logic applies entirely.

Synths *have* to be able to receive control changes for as long as a
voice could possibly be producing sound, or there is a serious
usability issue.
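
That is, something like this must remain legal (pseudocode as
elsewhere in this mail):

	send(CONTROL, VOICE, my_vvid, 0)	// enter release phase
	// The voice may still be sounding, so this must still work:
	send(CONTROL, PITCH, my_vvid, <new pitch>)	// bend during release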

[...]
> > Another example that demonstrates why this distinction matters
> > would be a polyphonic synth with automatic glissando. (Something
> > you can
> >
> > Starting a new note on a VVID when a previous note is still in
> > the release phase would cause a glissando, while if the VVID has
> > no playing voice, one would be activated and started as needed to
> > play a new note. The sender can't reliably know which action will
> > be taken for each new note, so it really *has* to be left to the
> > synth to decide. And for this, the lifetime of VVIDs/contexts
> > needs to span zero or more notes, with no upper limit.
>
> I don't follow you at all - a new note is a new note.

Sure - but where does it belong, logically? The controller or user
might know, but the synth generally doesn't. I'm just suggesting that
senders be able to provide useful information when it's there.

> If your
> instrument has a glissando control, use it. It does the right
> thing.

How? It's obvious for monophonic synths, but then, so many other
things are. Polyphonic synths are more complicated, and I'm rather
certain that the player and/or controller knows better which note
should slide to which when you switch from one chord to another.
Anything else will result in "random glissandos in all directions",
since the synth just doesn't have enough information.

> Each new note gets a new VVID.
>
> Reusing a VVID seems insane to me. It just doesn't jive with
> anything I can comprehend as approaching reality.

MIDI sequencers are reusing "IDs" all the time, since they just don't
have a choice, the way the MIDI protocol is designed. Now, all of a
sudden, this should no longer be *possible*, at all...?

Either way, considering the polyphonic glissando example, VVIDs
provide a dimension of addressing that is not available in MIDI, and
that seems generally useful. Why throw it away for no technical (or
IMHO, logical) reason?
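
A sketch of what that addressing buys you (pseudocode; the pairing of
VVIDs to chord notes is of course up to the sender):

	// Chord change C/E/G -> D/F/A, telling the synth exactly which
	// note goes where by reusing the VVIDs of the old chord:
	send(CONTROL, PITCH, vvid_1, <D>)
	send(CONTROL, VOICE, vvid_1, 1)		// "new note" on the C context
	send(CONTROL, PITCH, vvid_2, <F>)
	send(CONTROL, VOICE, vvid_2, 1)		// ...on the E context
	send(CONTROL, PITCH, vvid_3, <A>)
	send(CONTROL, VOICE, vvid_3, 1)		// ...on the G context

A synth with automatic glissando can now slide each old note to its
new pitch, instead of guessing at random pairings.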

> > > The reason
> > > that VVID_ALLOC is needed at voice_start is because the host
> > > might never have sent a VOICE_OFF. Or maybe we can make it
> > > simpler:
> >
> > If the host/sender doesn't send VOICE_OFF when needed, it's
> > broken, just like a MIDI sequencer that forgets to stop playing
> > notes when you hit the stop button.
>
> Stop button is different than not sending a note-off. Stop should
> automatically send a note-off to any VVIDs. Or perhaps more
> accurately, it should send a stop-all sound event.

Whatever. A sender should still never leave hanging notes, whatever
it's doing, or whatever protocol is used.

Anyway, the real point is that you may need to talk to the voice
during the release phase; not just until you decide to switch to the
release phase.

[...]
> I'm proposing a very simple model for VVID and voice management.
> One that I think is easy to understand, explain, document, and
> implement.

Sure, that's the goal here.

> It jives with reality and with what users of
> soft-studios expect.

I disagree. I think most people expect controls to respond during the
full duration of a note; not just during an arbitrary part of it.
This is the way MIDI CCs work, and I would think most MIDI synths
handle Poly Pressure that way as well, even though most controllers
cannot generate PP events after a note is stopped, for mechanical
reasons. (It's a bit hard to press a key after releasing it, without
causing a new NoteOn. :-)

> Every active voice is represented by one VVID and vice-versa.
> There are two lifecycles for a voice.

I don't see a good reason to special-case this - and also, there is
no reason to do so as long as a VVID remains valid for as long as you
*need* it, rather than until the "formal end of the note".

> 1) The piano-rolled note:
> a) host sends a VOICE(vvid, VOICE_ON) event
> - synth allocates a voice (real or virtual) or fails
> - synth begins processing the voice
> b) time elapses as per the sequencer
> - host may send multiple voice events for 'vvid'
> c) host sends a VOICE(vvid, VOICE_OFF)
> - synth puts voice in release phase and detaches from 'vvid'
> - host will not send any more events for 'vvid'
> - host may now re-use 'vvid'

c) is what I have a problem with here. Why is the VOICE control
becoming so special again, implying destructive things about the VVID
passed with it and stuff?

> 2) The step-sequenced note:
> a) host sends a VOICE(vvid, VOICE_ON) event
> - synth allocates a voice (real or virtual) or fails
> - synth begins processing the voice
> b) host sends a VOICE(vvid, VOICE_DETACH) event
> - synth handles the voice as normal, but detaches from 'vvid'
> - host will not send any more events for 'vvid'
> - host may now re-use 'vvid'

This completely eliminates the use of voice controls together with
step sequencers. What's the logical reasoning behind that?

My other major problem with this is that it makes step sequencers and
their synths a special class, in that they're using a different
protocol for "voice off". It *might* still make sense to require that
all synths implement a special "unarticulated note off", but I'm not
sure... Sounds like a different discussion, in some way, and the
scary part is that it's still a special case that means senders will
have to treat synths differently.

Given that I generally program my drum kits to respond to NoteOff
anyway, I'm not very motivated to accept step sequencers as something
special enough to motivate special cases in the API, but I can see
why it would be handy for step sequencers not having to worry about
note durations. (The "zero duration note" hack typically used by MIDI
sequencers won't work if the synths/patches use "voice off" to stop
sounds quicker than normal.)
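
To be explicit about that last point (pseudocode as elsewhere):

	// The typical step sequencer "zero duration note" hack:
	send(CONTROL, VOICE, my_vvid, 1)	// note on...
	send(CONTROL, VOICE, my_vvid, 0)	// ...and off, same timestamp

Fine for patches that ignore "voice off" - but it chokes any patch
that uses "voice off" to damp the sound early.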

> These are very straight forward and handle all cases I can think
> up. The actual voice allocation is left to the synth. A
> mono-synth will always use the same physical voice. A poly-synth
> will normally allocate a voice from its pool. A poly-synth under
> voice pressure can either steal a real voice for 'vvid' (and swap
> out the old VVID to a virtual voice), or allocate a virtual voice
> for 'vvid', or fail altogether. A sampler which is playing short
> notes (my favorite hihat example) can EOL a voice when the sample
> is done playing (and ignore further events for the VVID).
>
> It's cute. I like it a lot.

It's just that it can't do things that people expect from every MIDI
synth, just because VVID allocation is integrated with the VOICE
control.

[...]
> > A synth is a state machine, and the events are just what provides
> > it with data and - directly or indirectly - triggers state
> > changes.
>
> And I am advocating that voice on/off state changes be EXPLICITLY
> handled via a VOICE control,

Sure, but how do you suggest we force this upon continuous control
synths, without breaking them entirely?

> as well as init and release-latched
> controls be EXPLICITLY handled.

Explicitly telling a synth how to do something that's basically for
the synth author to decide seems a lot more confusing to me than just
not assuming anything about it at all.

> Yeah, it makes for some extra events. I think that the benefit of
> clarity in the model is worth it. We can also optimize the extra
> events away in the case they are not needed.

But when are these extra events needed at all? I still don't see what
information they bring, and what any synth would use it for.

> > As to 1, that's what we're really talking about here. When do you
> > start and stop tracking voice controls?
>
> And how do you identify control events that are intended to be
> init-latched from continuous events?

I'm not sure what you mean, exactly. On the protocol level, it's just
a matter of having the values in place before the "trigger" condition
occurs (normally "note on" or "note off" directly caused by the VOICE
control).

Controllers and other senders will have to know which controls are
init-latched and what triggers the latching of them. There's no way
to avoid that, and it should be covered by the API. The VOICE control
can be standardized, and then we can have hints for "voice on"
latched controls and "voice off" latched controls.
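
As a sketch of what I mean by such hints (nothing here is settled;
the names are pure illustration):

	// Hypothetical control metadata, as seen by the host:
	//	VELOCITY: latched at "voice on"  (VOICE 0 -> 1)
	//	DAMPING:  latched at "voice off" (VOICE 1 -> 0)
	//	PITCH:    continuous; never latched
	// A sender reads the hints and makes sure latched values are
	// in place *before* the VOICE event that triggers the latching.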

> > Simple: When you get the first control for a "new" VVID, start
> > tracking. When you know there will be no more data for that VVID,
> > or that you just don't care anymore (voice and/or context
> > stolen), stop tracking.
>
> Exactly what I want, but I want it to be more explicit

Sure - that's why I'm suggesting explicit VVID allocation and
detachment events.

> > * Context allocation:
> > // Prepare the synth to receive events for 'my_vvid'
> > send(ALLOC_VVID, my_vvid)
> > // (Control tracking starts here.)
>
> yes - only I am calling it voice allocation - the host is
> allocating a voice in the synth (real or not) and will eventually
> turn it on. I'd bet 99.999% of the time the ALLOC_VVID and
> VOICE_ON are on the same timestamp.

Quite possible, unless it's legal to use VOICE as a continuous
control. If it isn't, continuous control synths simply won't have a use
for VOICE control input, but will rely entirely on the values of
other controls.

Also, as soon as you want to indicate that a new note is to be played
on the same string as a previous note, or directly take over and
"slide" the voice of a previous note to a new note, you'll need a way
of expressing that. I can't think of a more obvious way of doing that
than just using the same VVID.

> > {
> > * Starting a note:
> > // Set up any latched controls here
> > send(CONTROL, <whatever>, my_vvid, <value>)
> > ...
> > // (Synth updates control values.)
> >
> > // Start the note!
> > send(CONTROL, VOICE, my_vvid, 1)
> > // (Synth latches "on" controls and (re)starts
> > // voice. If control tracking is not done by
> > // real voices, this is when a real voice would
> > // be allocated.)
>
> This jives EXACTLY with what I have been saying, though I
> characterized it as:
>
> VOICE_INIT(vvid) -> synth gets a virtual voice, start init-latch window
> VOICE_SETs       -> init-latched events
> VOICE_ON(vvid)   -> synth (optionally) makes it a real voice (end init-window)

Well, then that conflict is resolved - provided synths are not
*required* to take VOICE_ON if they care only about "real" controls.
:-)

> > * Stopping a note:
> > send(CONTROL, <whatever>, my_vvid, <value>)
> > ...
> > // (Synth updates control values.)
> >
> > // Stop the note!
> > send(CONTROL, VOICE, my_vvid, 0)
> > // (Synth latches "off" controls and enters the
> > // release phase.)
>
> Except how does the synth know that the controls you send are meant
> to be release-latched?

It's hardcoded, or programmed into the patch, depending on synth
implementation. I can't see how this could ever be something that the
sender can decide at run time. There's no need to "tell" the synth
something it already knows, and cannot change.

For example, your average synth will have a VELOCITY and a DAMPING
(or something) control pair, corresponding to MIDI NoteOn and NoteOff
velocity, respectively. You could basically set both right after
allocating a voice/VVID, as the only requirement is that the right
values are in place when they should be latched.
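
That is, in the same style as the examples above:

	send(ALLOC_VVID, my_vvid)
	send(CONTROL, VELOCITY, my_vvid, <value>)	// latched at "voice on"
	send(CONTROL, DAMPING, my_vvid, <value>)	// latched at "voice off"
	send(CONTROL, VOICE, my_vvid, 1)	// latches VELOCITY
	...
	send(CONTROL, VOICE, my_vvid, 0)	// latches DAMPING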

[...]
> > * Context deallocation:
> > // Tell the synth we won't talk any more about 'my_vvid'
> > send(DETACH_VVID, my_vvid)
> > // (Control tracking stops here.)
>
> THIS is what I disagree with. I think VOICE_OFF implicitly does
> this. What does it mean to send controls after a voice is stopped?

It means the voice doesn't hang at a fixed pitch, with the filter
wide open and whatnot, just because you decided to start the release
cycle.

> The ONLY things I can see this for are mono-synths (who can purely
> IGNORE vvid or flag themselves as non-VVID)

I've often found myself missing the advantages monophonic patches
have WRT note->note interaction, when using polyphonic patches. I
actually think the ability to eliminate this distinction is a major
feature that comes with VVIDs. If you can reuse VVIDs, a poly synth
effectively becomes N monosynths playing the same patch - if you want
it to. If not, just reassign a VVID for each new note.
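
For example (pseudocode; the six-string mapping is just an
illustration):

	// A 6 voice poly patch driven as six mono "strings",
	// one fixed VVID per string:
	send(ALLOC_VVID, string_vvid[0])
	...
	send(ALLOC_VVID, string_vvid[5])
	// Every new note on string N reuses string_vvid[N], so
	// note->note interaction (legato, retrig, slides) works per
	// string, exactly as on a mono synth.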

> and MIDI where you want
> one VVID for each note (so send a VOICE_OFF before you alloc the
> VVID again).

That's just a shortcut, and not really a motivation to be able to
reuse VVIDs.

> > This still contains a logic flaw, though. Continuous control
> > synths won't necessarily trigger on the VOICE control changes.
> > Does it make sense to assume that they'll latch latched controls
> > at VOICE control changes anyway? It seems illogical to me, but I
> > can see why it might seem to make sense in some cases...
>
> It makes *enough* sense that the consistency pays off, IM(ns)HO.

Yes, and more importantly, this simplifies the handling of latched
voice controls quite a bit.

Further, is there *really* any sense in using latched controls with
continuous control synths? Considering that such controls are usually
for velocity mapping and the like, the cases where it would be of any
use at all in a continuous control synth are probably very few, if
there are any at all.

That is, continuous control synths can just ignore the VOICE control,
and everything will just work as expected anyway. (Just connect your
VOICE output to the VELOCITY input of the synth, and it'll play, at
least.)
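
Conceptually (pseudocode; connect() is just illustration):

	// Sender's VOICE output drives the synth's VELOCITY input:
	connect(sender:VOICE_OUT -> synth:VELOCITY_IN)
	// "Voice on" arrives as a nonzero velocity, "voice off" as
	// zero - the synth never needs to know about VOICE at all.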

> Welcome back! As I indicated, I am moving this week, so my
> response times may be laggy. I am also trying to shape up some
> (admittedly SIMPLE) docs on the few subjects we've reached
> agreement on so far.

Yeah, I "heard" - I'm looking at your post right now. (No risk I'm
going to comment on it or anything! ;-)

//David Olofson - Programmer, Composer, Open Source Advocate

.- The Return of Audiality! --------------------------------.
| Free/Open Source Audio Engine for use in Games or Studio. |
| RT and off-line synth. Scripting. Sample accurate timing. |
`---------------------------> http://olofson.net/audiality -'
   --- http://olofson.net --- http://www.reologica.se ---

