Subject: Re: [linux-audio-user] SAOL/MP4 question
From: John Lazzaro (lazzaro_AT_CS.Berkeley.EDU)
Date: Thu Jul 04 2002 - 03:44:55 EEST
> I didn't realize that it's similar
> to the old mod file format, but several steps beyond that as well.
When I'm giving talks to codec people, I tell the story like this:
Let's say you have a ProTools set up in front of you, and you
just did the final mixdown. All of your source materials are
sitting there on the hard drive: MIDI files, the algorithms the
soft-synth and effects plug-ins use, the mixdown fader data,
raw per-track audio for the vocals and other elements that came
in via a microphone.
If all you have as a file format are things like MP3 and its
successors (AC-3, MP3 NBC, MPEG 4 AAC, etc), can you use all
of those "work materials" to make a compressed version of the
final mixdown, that is any smaller than just running the MP3
encoder on the stereo master tracks?
The answer is "no" -- but it's a frustrating no. All of those
work format files solve the hardest problems in audio compression.
The "audio scene" is separated for you for free, into little
tracks. Any sound, effect, or composition that can be expressed
in a compact form (via an algorithm) is completely modeled and
parameterized for you, sitting right there in the MIDI file or
in the algorithms the reverb runs.
And yet, you can't really use any of it to make the MP3 smaller.
Basically, a big button on the ProTools screen, that dumps out a
Structured Audio representation of the mix, is the killer compression
app. The "microphone" tracks would be handled by special-purpose
SAOL programs that do single-source compression. Everything else is
just an algorithm. Algorithms that are naturally data heavy (like
samples) you'd want to convert to a physical model, or else do
sample-specific compression on the fly.
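
To make the size argument concrete, here is a toy sketch in Python. It is
not SAOL and not how a real Structured Audio encoder works: the score
layout, the decaying-sine "instrument", and the names render,
structured_size and pcm_size are all made up for illustration. The sender
ships only the note list (plus the algorithm, whose size is not counted
here), and the receiver runs the algorithm to rebuild the audio. The
comparison is against raw 16-bit PCM, not against an MP3.

```python
import math
import struct

SAMPLE_RATE = 44100  # assumed CD-quality sample rate

# Hypothetical "score": (start_sec, duration_sec, frequency_hz, amplitude)
SCORE = [
    (0.0, 1.0, 440.0, 0.5),
    (1.0, 1.0, 554.37, 0.5),
    (2.0, 2.0, 659.25, 0.5),
]

def render(score, sample_rate=SAMPLE_RATE):
    """Receiver side: run the algorithm (a sine with a linear decay) over the score."""
    total_sec = max(start + dur for start, dur, _, _ in score)
    out = [0.0] * int(round(total_sec * sample_rate))
    for start, dur, freq, amp in score:
        first = int(round(start * sample_rate))
        n = int(round(dur * sample_rate))
        for i in range(n):
            env = 1.0 - i / n  # linear fade from 1 down to 0
            out[first + i] += amp * env * math.sin(2.0 * math.pi * freq * i / sample_rate)
    return out

def structured_size(score):
    """Bytes needed to send the score as four 32-bit floats per note."""
    flat = [value for note in score for value in note]
    return len(struct.pack("<%df" % len(flat), *flat))

def pcm_size(samples):
    """Bytes needed to send the rendered audio as 16-bit mono PCM."""
    return 2 * len(samples)

samples = render(SCORE)
print(structured_size(SCORE), "bytes of score vs", pcm_size(samples), "bytes of PCM")
```

The same score always renders to the same samples on a given machine,
which is the toy version of "the receiver hears what the sender hears".
The microphone tracks have no such compact description, which is why they
still need a conventional single-source coder.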
Because Structured Audio is normative, and an ISO standard, as
long as the decoder is compliant with the standard, and is
careful to avoid some of the non-normative rough edges in the
standard (like a few of the non-normative opcodes, issues with
machines that aren't using IEEE floating point, etc), the receiver
will really hear what the sender hears.
Of course, a tremendous amount of rethinking and work would need
to happen to make ProTools do this, but note there isn't very
much rocket science involved, if any -- it's the sort of work that,
say, all of the engineers who wrote the MIDI SysEx commands for
every MIDI box over the last 20 years did. A huge body of work if
you think about it as a single programming project, but every
product did its own little bit, and as a result a patch editor
can edit 20 years of rack-mounts.
-------------------------------------------------------------------------
John Lazzaro -- Research Specialist -- CS Division -- EECS -- UC Berkeley
lazzaro [at] cs [dot] berkeley [dot] edu www.cs.berkeley.edu/~lazzaro
-------------------------------------------------------------------------