Re: [linux-audio-dev] Catching up with XAP


Subject: Re: [linux-audio-dev] Catching up with XAP
From: David Olofson (david_AT_olofson.net)
Date: Thu Jan 16 2003 - 16:17:55 EET


On Thursday 16 January 2003 01.14, Tim Hockin wrote:
[...]
> > The only problem I have with it is that it's completely
> > irrelevant to continuous control synths - but they can just ignore
> > it, or not have the control at all.
>
> Does continuous control mean continuous sound?

No. A synth has to be able to shut up, I think. If the VOICE control
is compulsory, it might make sense for continuous control synths to
use it as a gate, but "normal operation" of such synths would make
use of continuous controls only.

> > > This is not at all what I see as intuitive. VOICE is a
> > > separate control used ONLY for voice control. Instruments have
> > > it. Effects do not.
> >
> > There's this distinct FX vs instrument separation again. What is
> > the actual motivation for enforcing that these are kept totally
> > separate?
>
> They are not totally separate, but VOICE control is something
> unique to Instruments.

Yeah... Something that has VOICE controls is a synth?

But what about continuous control synths? If VOICE isn't continuous,
it's of no use to such synths, unless we require that they implement
it in *some* way, whether it makes sense or not.

Anyway, I don't see why this matters, really, as a plugin is a synth
only because it fits some definition - not because it uses a
different API. We could call anything that makes use of a particular
API feature a synth, but it doesn't seem like the VOICE control, or
VVIDs, could be that feature.

> > > > What might be confusing things is that I don't consider
> > > > "voice" and "context" equivalent - and VVIDs refer to
> > > > *contexts* rather
> > >
> > > I disagree - a VVID refers to a voice at some point in time. A
> > > context can not be re-used. Once a voice is stopped and the
> > > release has ended, that VVID has expired.
> >
> > Why? Is there a good reason why a synth must not be allowed to
> > function like the good old SID envelope generator, which can be
> > switched on and off as desired?
>
> They still can - it's just the responsibility of the synth to know
> this, and not the sequencer or user.

I'm not talking about responsibility, but the ability to say when you
want to make use of such features, if they exist. You can't do that
without either VVIDs, or some explicit extra feature.

> > Also, remember that there is nothing binding two notes at the
> > same pitch together with our protocol, since (unlike MIDI) VVID
> > != pitch. This means that a synth cannot reliably handle a new
> > note starting before the release phase of a previous pitch has
> > ended. It'll just have to allocate a new voice, completely
> > independent of the old voice, and that's generally *not* what you
> > want if you're trying to emulate real instruments.
>
> ack! So now you WANT old MIDI-isms? For new instruments (which do
> not want to feel like their MIDI brethren) this is EXACTLY what we
> want. For instruments which are supposed to behave like old MIDI
> synths, that is the responsibility of the synth to handle, NOT the
> API or sequencer or user.

MIDI doesn't get this right; it only works for mono synths, and only
for notes at the same pitch for poly synths. In MIDI, note->note
relations are implicit and "useful" only by luck, basically.

What I'm suggesting is not this same brokenness, but a way to
*explicitly* say if you intend notes to be related or not.

> > For example, if you're playing the piano with the sustain pedal
> > down, hitting the same key repeatedly doesn't really add new
> > strings for that note, does it...?
>
> But should we force that on the API?

It's not forced, really. You can say that one note is somehow related
to a previous note, or that it's a completely new note - regardless
of arbitrary controls, such as PITCH.

> No, we should force that on
> the synth.

You *could* do it for this particular case by looking at PITCH, but
that works only when the note->note relation is implied by "same
PITCH", and breaks down if you try to use adaptive scales (like 12t
with "dynamic" temperament) or anything like that.

[...]
> > > If your
> > > instrument has a glissando control, use it. It does the right
> > > thing.
> >
> > How? It's obvious for monophonic synths, but then, so many other
> > things are. Polyphonic synths are more complicated, and I'm
> > rather certain that the player and/or controller knows better
> > which note should slide to which when you switch from one chord to
> > another. Anything else will result in "random glissandos in all"
> > directions", since the synth just doesn't have enough
> > information.
>
> Unless I am musically mistaken, a glissando is not a slide. If you
> want to do chord slides, you should program it as such.

Yes, you're right. I'm thinking about portamento. (Glissando means
you mark each scale tone or semitone, depending on instrument. It's
never a note->note *slide*, right?)

> > > Reusing a VVID seems insane to me. It just doesn't jive with
> > > anything I can comprehend as approaching reality.
> >
> > MIDI sequencers are reusing "IDs" all the time, since they just
> > don't have a choice, the way the MIDI protocol is designed. Now,
> > all of a sudden, this should no longer be *possible*, at all...?
>
> I don't see what it ACHIEVES besides complexity. This is twice in
> one email you tout MIDI brokenness as a feature we need to have.
> You're scaring me!

Again, MIDI doesn't do this right. I'm just using it as an example,
since *some* MIDI synths try to make the best of this "accidental"
feature of MIDI.

Either way, I don't see what this breaks. The ability to reuse VVIDs,
and thereby express note->note relations *adds* functionality without
breaking anything. If you just never play more than one note per
VVID, everything works exactly as you're proposing it always should.

> > Either way, considering the polyphonic glissando example, VVIDs
> > provide a dimension of addressing that is not available in MIDI,
> > and that seems generally useful. Why throw it away for no
> > technical (or IMHO, logical) reason?
>
> I had a sleep on it, and I am contemplating adjusting my thought
> processes, but your glissando example is not helping :)

Make it portamento, and I think it's more obvious what I mean.

[...]
> > I disagree. I think most people expect controls to respond during
> > the full duration of a note; not just during an arbitrary part of
> > it. This is the way MIDI CCs work, and I would think most MIDI
> > synths handle Poly Pressure that way as well, even though most
> > controllers cannot generate PP events after a note is stopped,
> > for mechanical reasons. (It's a bit hard to press a key after
> > releasing it, without causing a new NoteOn. :-)
>
> I think my proposal FULLY maps to MIDI.

No, because MIDI *can* handle Poly Pressure after NoteOff. It's just
that most controllers can't do that for physical reasons, and it's
entirely possible that most synths don't implement it.

> That said, I can accept
> that you want to send voice controls during the release phase.

Fine. BTW, it might also be interesting to point out that as we have
*real* Voice Controls, there are many cases where we can do things in
one Channel that require multiple Channels with MIDI. Pitch bend
would be the most common example - and that's a normal CC, which is
not restricted to notes in any way.
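
Just to illustrate what that buys us - a rough sketch in C, where all
the names (xap_event, XAP_EV_VCONTROL, XAP_PITCH) are made up, since
the real event layout isn't settled: a Voice Control event carries a
VVID, so you can bend one note and leave another alone, on the same
Channel.

        /* Hypothetical event struct - sketch only. */
        enum { XAP_EV_VCONTROL };
        enum { XAP_PITCH };

        typedef struct xap_event
        {
                unsigned when;  /* timestamp (frames into block) */
                int type;       /* XAP_EV_VCONTROL */
                unsigned vvid;  /* target voice context */
                int index;      /* control index; XAP_PITCH here */
                float value;    /* new control value */
        } xap_event;

        /* Bend only the note on VVID 1; a note on VVID 2 is not
         * affected. On a single MIDI Channel, pitch bend would hit
         * both notes at once.
         */
        xap_event bend = { 0, XAP_EV_VCONTROL, 1, XAP_PITCH, 61.5f };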

> > > 2) The step-sequenced note:
> >
> > This completely eliminates the use of voice controls together
> > with step sequencers. What's the logical reasoning behind that?
>
> That is how step-sequencers work, no? You turn a note on and go
> about your business. The only control events are init events.

Well, it doesn't make sense to me that percussion synths and the like
should not be allowed to implement continuous voice controls. Maybe
you never want to do that with a step sequencer - but then, just
don't! You don't need special API support for that.

> > > It's cute. I like it a lot.
> >
> > It's just that it can't do things that people expect from every
> > MIDI synth, just because VVID allocation is integrated with the
> > VOICE control.
>
> I don't buy this. MIDI pitch bend is a channel control, it would
> still work.

And with MIDI, you have to use multiple Channels to control notes
independently. I've heard countless complaints about how useless MIDI
is in this regard, and I've cursed at it myself. Should we do it the
same way again...?

> Anything you can do to a MIDI synth you can do in this
> model.

Short of controlling Poly Pressure after VOICE_OFF - but I'm not sure
if that's really a feature of MIDI, or just a side effect. (Just like
this multiple NoteOn thing - and synths handle it differently.)

> It leaves exact emulation of MIDI brokenness to the synth,
> where it belongs.

I don't see what the idea that one note has a relation to a previous
note has to do with MIDI brokenness, really. MIDI *is* indeed broken
in this regard - and that's because it does *not* fully implement
what I'm suggesting.

> > Quite possible, unless it's legal to use VOICE as a continuous
> > control. If it isn't, continuous control synths simply won't have
> > a use for VOICE control input, but will rely entirely on the
> > values of other controls.
>
> Voice is not a continuous control - it says "Go" and "Stop". That
> is all.

Ok. Then continuous control synths will have to ignore it.

> I see four main classes of instruments:
>
> 1) always-on - this includes noise generators (record-player emu)
> and monitors (line-in is always on). The instruments would not
> have a VOICE control. They are just always on (maybe they have a
> mute button - not an explicit part of the API). They do not need
> VVIDs at all. They do not have voice controls.

These don't really look like synths to me, unless the definition is
simply "something that generates audio" - and in that case, a synth
must not take audio input, or it's not a synth! :-) Oh well...

> 2) continuous control - this includes things like a violin, which
> receives streams of parameters. This gets a VVID for each new
> voice. If you want glissando, you would tell the violin synth that
> fact, and it would handle it.

But how about portamento?

Well, in this particular case, going the MIDI way (one Channel per
string) is probably a good idea, since the strings sound different
anyway. However, this does not necessarily apply to a synth with
glissando.

And BTW, no, you can't really play a polyphonic synth with portamento
on a keyboard without some "smart" transformer in between. (Many
synths do this, but the result is kind of random.) If that looks like
a reason to dismiss note->note relations as useless, keep in mind
that there *are* other controllers than keyboards, and that lots of
people create music by other means than recording real time into a
sequencer.

> 3) synths - they probably latch some controls at init, and have
> some continuous controls. They get a VVID for each new voice.
>
> 4) drum-machines - this is really just a special case of #3.
> Synths with short holds are the same. They get a new VVID for
> every voice.

Speaking of which, I assume that drum machines, by your definition,
generally ignore note duration. What happens if you drive a different
kind of synth from a step sequencer, which is meant to drive drum
machines?

> In all cases except #1, the sequencer loads up all the
> init-parameters and says 'Go'.
>
> > Also, as soon as you want to indicate that a new note is to be
> > played on the same string as a previous note, or directly take
> > over and "slide" the voice of a previous note to a new note,
> > you'll need a way of expressing that. I can't think of a more
> > obvious way of doing that than just using the same VVID.
>
> I'd rather see us codify a slide model. What I'm not grasping is
> HOW does the host know whether to re-use a VVID or not?

It knows when the user somehow indicates that notes belong to the
same context. It's similar to using multiple Channels in that
respect, but it avoids "manual" allocation, and it allows the synth
to handle both chained and independent notes with the same patch, on
the same Channel.

> It also
> means that synths need to handle both cases.

Yes, but how big a difference does it make, really? If you really
don't want to do *anything* special with chained notes, just do this
when you get NOTE(1):

        * Allocate a new voice.
        * Copy the context of the current voice to the new voice.
        * Detach the current voice from the VVID.
        * Attach the new voice to the VVID.

        * (Proceed as usual.)
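
Something like this, that is (untested sketch; Voice and the helper
functions are made up):

        #include <string.h>

        #define VOICE_CONTROLS 8        /* arbitrary */

        typedef struct Voice
        {
                float controls[VOICE_CONTROLS];
                /* ...envelopes, oscillator phase etc... */
        } Voice;

        Voice *allocate_voice(void);    /* hypothetical */
        void detach_voice(Voice *v);    /* hypothetical */

        void note_on_reused_vvid(Voice **vvid_table, unsigned vvid)
        {
                Voice *old = vvid_table[vvid];
                Voice *nv = allocate_voice();

                /* Copy the old context to the new voice. */
                memcpy(nv->controls, old->controls,
                                sizeof(nv->controls));

                /* Detach the old voice; it just finishes its
                 * release on its own and dies.
                 */
                detach_voice(old);

                /* Attach the new voice; proceed as usual. */
                vvid_table[vvid] = nv;
        }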

> > I've often found myself missing the advantages monophonic
> > patches have WRT note->note interaction, when using polyphonic
> > patches. I actually think the ability to eliminate this
> > distinction is a major feature that comes with VVIDs. If you can
> > reuse VVIDs, a poly synth effectively becomes N monosynths
> > playing the same patch - if you want it to. If not, just reassign
> > a VVID for each new note.
>
> But how does a synth request this behavior?

It doesn't. You just tell it when notes are related by reusing the
VVID. The synth may or may not handle notes differently, but the
point is that the sender can tell the synth what's intended, and the
synth *can* do the right thing if it cares.
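
In sender terms (send_vcontrol() being a made-up wrapper around the
event system), expressing "note B continues from note A" is just:

        enum { XAP_PITCH, XAP_VOICE }; /* hypothetical indices */

        void send_vcontrol(unsigned when, unsigned vvid,
                        int index, float value);        /* sketch */

        void play_related_pair(unsigned vvid,
                        unsigned t0, unsigned t1)
        {
                send_vcontrol(t0, vvid, XAP_PITCH, 60.0f);
                send_vcontrol(t0, vvid, XAP_VOICE, 1.0f); /* note A */

                /* Same VVID again ==> the notes are related. The
                 * synth may slide, retrigger or whatever - but now
                 * it *knows* what's intended.
                 */
                send_vcontrol(t1, vvid, XAP_PITCH, 67.0f);
                send_vcontrol(t1, vvid, XAP_VOICE, 1.0f); /* note B */
        }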

> I don't see why this
> is advantageous, or how it could be anything but confusing.

I think the way poly synths try to guess what you mean when switching
from one chord to another is confusing, and effectively useless for
anything but a few minutes of fun. And the "real" way of doing it,
with multiple Channels is just a major PITA, as well as a waste of
resources. (The latter is no major issue though - if you have
unlimited Channels...)

> > Further, is there *really* any sense in using latched controls
> > with continuous control synths? Considering that such controls are
> > usually for velocity mapping and the like, the cases where it
> > would be of any use at all in a continuous control synth are
> > probably very few, if there are any at all.
>
> Would you please explain to me what you mean by a continuous
> control synth? Maybe I am just Not Getting It? I had been assuming
> that a continuous control synth was simply a synth that had all
> its voice controls dynamically updateable. Am I missing
> something?

That's what I mean, basically. It's just that a *fully* continuous
control synth won't even care about the VOICE switch, since it is the
(other) voice controls that determine when sound is to be produced.

As to init controls, consider the way you would control such a synth
in real time. A traditional keyboard is certainly more or less
useless for this. A tablet would work, though (rather well too -
tried it :-), so I'm using that as an example. What a tablet gives
you is just a stream of continuous control data. There is no start and
end; just control data. (Yep, tablet/tool distance is continuous as
well, although it generally doesn't have many bits of resolution or
much of a range.) Controlling a synth with that means you have to
base synthesis entirely on this control data, and there isn't even
anywhere to get any "note timing" or init control values. Those
concepts just don't apply, so if the synth relies on them, you'll
have to fake them somehow.
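
For the sake of argument, here's roughly what the core of such a
synth could look like - output is a pure function of the control
stream, and there's no gate anywhere. (pitch_to_hz() and the fields
are made up, of course.)

        #include <math.h>

        typedef struct
        {
                float pressure; /* tablet pressure; 0..1 */
                float pitch;    /* continuous pitch */
                float phase;    /* oscillator phase; 0..1 */
        } CCVoice;

        float pitch_to_hz(float pitch); /* hypothetical */

        /* One output sample. "Silence" is just pressure == 0;
         * no VOICE switch is ever consulted.
         */
        float cc_synth_run(CCVoice *v, float srate)
        {
                v->phase += pitch_to_hz(v->pitch) / srate;
                v->phase -= floorf(v->phase);
                return v->pressure * sinf(6.2831853f * v->phase);
        }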

> *** From: Steve Harris <S.W.Harris_AT_ecs.soton.ac.uk>
>
> > > This is not at all what I see as intuitive. VOICE is a
> > > separate control used ONLY for voice control. Instruments have
> > > it. Effects do not.
> >
> > I would say only polyphonic instruments have VOICE control.
> > Modular synths are not polyphonic (at the module level).
>
> Mono-synths need a VOICE, too. Modular synth modules might not.
> They are effectively ON all the time, just silent because of some
> other control. This works for modular-style synths, because they
> are essentially pure oscillators, right? Mono-synths can still
> have init-latched values, and releases etc. Does it hurt anything
> to give modular synths a VOICE control, if it buys consistency at
> the higher levels? Maybe - I can be swayed on that. But a modular
> synth is NOT the same as just any mono-synth.

Good point.

Dunno if VOICE could *hurt*, really, but I think it's essential that
synths are allowed to ignore it when it's irrelevant.

> *** From: David Olofson <david_AT_olofson.net>
>
> > If you're *real* lazy, you can just treat new notes while the
> > voice is playing by restarting the envelopes. Given that you
> > handle pitch in the usual way (ie as continuous pitch - like MIDI
> > pitch + pitch bend, basically), you get primitive legato when
> > reusing VVIDs. Pretty much a free bonus feature.
>
> Re-using VVIDs for slides is the first thing I have heard that
> makes me think it might be sane. It is sane to me because the user
> decides he wants a slide, not because the synth wants to feel like
> its MIDI counterpart.

And this is *exactly* what I'm thinking about.

Forget about the MIDI stuff :-) It was just a failed attempt at
pointing out that MIDI synths actually *do* have this feature to some
extent, and that it's useful even there, despite the brokenness.

> > The common logic here is that the current state of the context
> > effectively becomes parameters for the new note - and this
> > applies to both mono and poly synths.
>
> OK - this I'll buy IN THE CONTEXT OF A SLIDE.

There are more uses, though. Just a few ideas for what synths might
do:

        * New note: Start playing a waveform.
          Chained: Slide pitch, restart env, keep playing

        * New note: Start playing a waveform.
          Always: Track the phase of the waveform.
          Chained: Start playing at the current phase.

        * New note: "Smooth start"; slow attack etc.
          Chained: "Running start"; various transition FX.

        * New note: Latch pitch as "base_pitch" and play note.
          Chained: Play note, ring modulating with base_pitch.

Note that the synth has access to the *full* context of the previous
note, including information that the sender can't have any idea
about. (Such as waveform phase or state of random number generators.)
Even if the sender had all this information, implementing and
controlling this kind of stuff by sending extra control data to the
synth would be a nightmare.
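
The phase tracking example, spelled out (with made-up voice fields;
"chained" meaning the VVID was reused):

        struct voice
        {
                float phase;    /* current waveform phase */
                float pitch;
                /* ...envelopes etc... */
        };

        void restart_envelope(struct voice *v);        /* sketch */

        void start_note(struct voice *v, float pitch, int chained)
        {
                if(!chained)
                        v->phase = 0.0f;        /* clean start */
                /* Chained: keep the phase the old note had
                 * reached, so the waveform continues seamlessly
                 * across the transition.
                 */
                v->pitch = pitch;
                restart_envelope(v);
        }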

That said, you *could* have this special version of ALLOC_VVID that
takes an old VVID as an argument or something like that, but I
*really* don't see how that would be any better than just reusing the
old VVID. It would have worked for MIDI, sort of, as a "VVID" there
is just a 7-bit note pitch, but a VVID is more like an actual object,
and there's no need for a kludge like this anyway.

> *** From: David Olofson <david_AT_olofson.net>
>
> > [...polyphony, glisando, reusing VVIDs etc...]
> >
> > > One thing you can't express with (polyphonic) MIDI is the
> > > following: Suppose you have a note A playing, and you want to
> > > start another one (B) 'in the context of A'. What this means is
> > > that B will
> > > probably (but not necessarily) terminate A, and that the attack
> > > phase of B will be influenced by the fact that it 'replaces' A.
> > > This may be a glissando, but it could be something much more
> > > complicated than that.
>
> This is actually a good explanation, and one that makes sense to me
> (Sorry David :) I'm still not sure how it would be used, or what
> makes it valuable.

See above.

> *** From: David Olofson <david_AT_olofson.net>
>
> > > > I don't follow you at all - a new note is a new note. If
> > > > your instrument has a glissando control, use it. It does the
> > > > right thing. Each new note gets a new VVID.
> > >
> > > I agree with Tim about this.
> >
> > What I'm not getting is how this is supposed to work with
> > polyphonic synths. Or isn't it, just because it doesn't really
> > work with MIDI synths...?
>
> I don't follow you at all - what DOESN'T work?

Well, think portamento, and it might be less confusing... :-)

What doesn't work is the synth figuring out which notes should slide
where. Without explicit note->note relations (expressed as reused
VVIDs, or by other means), the synth just sees two chords, and is
supposed to figure out the relation between them. Not only
is there no single correct relation (the user might want something
different than the "first choice" based on traditional theory); the
notes of each chord will generally not arrive at the same time.
(Unless you're doing 100% quantization, of course.)

> > Why enforce that notes on polyphonic synths are completely
> > unrelated, when they're not on monophonic synths, and don't have
> > to be on polyphonic synths either? The *only* reason why MIDI
> > can't do this is that it abuses note pitch for VVID, and we don't
> > have that restriction - unless we for some reason decide it's
> > just not allowed to make use of this fact.
>
> This does not parse well. The whole point of polyphonic synths (in
> my mind) is that every note is completely unrelated to every other
> note.

I disagree. It's not even completely true for common things such as
pianos. The relation that a note depends on the previous note of the
same pitch, or even on the full history of notes at that pitch, is
really rather common. Relations between notes at different pitches
exist in synths as well, but they aren't all that common, in part
because MIDI and keyboard controllers cannot handle them, so synths
have to guess what's intended.

> You can hit middle-C 10 times and have it play ten voices of
> middle-C.

Yes you can with some synths, but it's nearly always incorrect if
it's supposed to be a simulation of any real instrument.

> To say that a mono-synth has related voices is inane.

Oh, the single voice is not related to itself...?

What happens if you start a new note while the previous one is still
in the release phase?

 
> It has one voice. When you want a new voice, you have to stop the
> old one.

Of course. But how do you tell a mono synth to do that? Special "I
want you to handle the next note differently" event, or what?

> Now, I am coming around that it is not so insane to want to
> re-trigger a voice, using the existing state of controls. HOWEVER,
> I see this more as a shortcut to actually sending the control
> values for a new voice.

It's more than that. You can't send the synth information that you
don't have (oscillator phase, for example), and the very idea of
telling the synth things about a previous context that belongs to the
*synth* seems totally backwards. Besides, you generally don't even
care about the details. Which parts of the old context the synth
makes use of depends on the synth and patch, and that's more private
stuff that senders cannot and shouldn't mess with.

As I suggested above, you could tell the synth to use a previous
context by sending a control referring to the VVID of a previous note,
but what's the point in doing it that way?

> I can't see where it is actually useful
> beyond what I have described (and the bits I have agreed to
> change).
>
>
> Summary:
>
> This is where I am on this topic now:
>
> Two basic kinds of instruments: Always-On and Voiced.
>
> Always-On instruments (e.g. a line-in monitor) do not have a VOICE
> control. The host must always gather data from these for every
> processing cycle. They do not deal with VVIDs at all.
>
> Voiced instruments (e.g. a mono-synth or a poly-synth) have a VOICE
> control. They can only make sound in response to a VOICE_ON event.
> They deal with VVIDs.

If you can explain where to connect VOICE in a full continuous control
synth, and why a synth is not allowed to produce sound without VOICE
== 1, I have no problems with this.

> A typical voice cycle:

I was thinking that the idea was that "init" and "exit" should be
matched - so where did VVID_ALLOCATE go? :-)

No, it's not required, but without it, you'll need to check if the
VVID is initialized whenever you get a Voice Control. With
VVID_ALLOCATE, there's not even a need to be able to tell whether
VVIDs are initialized or not, as you'll never actually operate on one
that isn't. (Unless you have a broken sender, that is. And a broken
sender could live in the same address space as you... :-)

I'm in favor of eliminating redundant events, of course, but I have a
feeling that senders sending an extra event before (most) notes hurts
less than every poly synth double checking VVIDs all the time.

(And implicit VVID initialization was actually my idea, IIRC. I just
can't decide, can I!? ;-)
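
That is, the test that VVID_ALLOCATE gets rid of would sit in the
innermost event path, something like this (made-up names again):

        typedef struct Voice Voice;     /* opaque voice */
        Voice *allocate_voice(void);    /* hypothetical */
        void set_control(Voice *v, int index, float value);

        void voice_control(Voice **vvid_table, unsigned vvid,
                        int index, float value)
        {
                /* Without VVID_ALLOCATE, *every* voice event has
                 * to check, and maybe implicitly allocate:
                 */
                if(!vvid_table[vvid])
                        vvid_table[vvid] = allocate_voice();

                set_control(vvid_table[vvid], index, value);
        }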

> * send a series of 0 or more control events for VVID 'v'
> - the new voice is allocated and controls tracked
> * VOICE_ON for VVID 'v'
> - the voice can now make noise

What prevents it from doing so? Is this a compulsory gate control, as
well as a state control?

> * send a series of 0 or more control events for VVID 'v'
> - the controls are actuated
> * VOICE_OFF for VVID 'v'
> - the voice is sent to the release phase

A full continuous control synth does not necessarily have states at
all. What to do here?

> * send a series of 0 or more control events for VVID 'v'
> - the controls are tracked
> - the voice can end

Yes.

> * VVID_RELEASE for VVID 'v'

But it's not really the end of the world if you play another note
before releasing the VVID, is it?
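
BTW, as I read your cycle, the VVID states would be something like
this (names are mine, obviously):

        typedef enum
        {
                VVID_FREE,      /* no voice context attached */
                VVID_INIT,      /* controls tracked; no sound yet */
                VVID_PLAYING,   /* VOICE_ON seen; controls actuated */
                VVID_RELEASED   /* VOICE_OFF seen; release phase.
                                 * Controls still tracked until
                                 * VVID_RELEASE - or until the VVID
                                 * is reused for a chained note.
                                 */
        } vvid_state;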

> The best way to discuss this then is a proper voice-state diagram,
> and I hate ASCII drawing :)
>
> I'm not convinced that re-using a context is useful at all.

I'm 100% sure it's useful enough that alternative solutions will be
required if it's not supported. (Multiple Channels, explicit "reuse
this VVID" events, or whatever.)

> But I
> don't see why it should be excluded. I think it adds complexity to
> synths - they now have to handle this.

Yes, but it doesn't have to be many lines of code. (See above.)

Haven't tried it with real code yet, but I'm suspecting that some
synths might even do the right thing without explicitly handling
this. (Depends on how voice stealing is implemented.)

> Look for another email later regarding latched events, which will
> possibly modify all this a bit.
>
> Sorry if this isn't the most coherent email - I lost half of it in
> a stupid mistake, and the rest has been written all throughout the
> day.

Well, at least you're not being totally incomprehensible most of the
time, like me. ;-)

//David Olofson - Programmer, Composer, Open Source Advocate

.- The Return of Audiality! --------------------------------.
| Free/Open Source Audio Engine for use in Games or Studio. |
| RT and off-line synth. Scripting. Sample accurate timing. |
`---------------------------> http://olofson.net/audiality -'
   --- http://olofson.net --- http://www.reologica.se ---


