Re: [linux-audio-user] SAOL/MP4 question


Subject: Re: [linux-audio-user] SAOL/MP4 question
From: John Lazzaro (lazzaro_AT_CS.Berkeley.EDU)
Date: Thu Jul 04 2002 - 03:44:55 EEST


> I didn't realize that its similar
> to the old mod file format, but several steps beyond that as well.

When I'm giving talks to codec people, I tell the story like this:

   Let's say you have a ProTools setup in front of you, and you
   just did the final mixdown. All of your source materials are
   sitting there on the hard drive: MIDI files, the algorithms the
   soft-synth and effects plug-ins use, the mixdown fader data,
   raw per-track audio for the vocals and other elements that came
   in via a microphone.

   If all you have as a file format are things like MP3 and its
   successors (AC-3, MP3 NBC, MPEG 4 AAC, etc), can you use all
   of those "work materials" to make a compressed version of the
   final mixdown, that is any smaller than just running the MP3
   encoder on the stereo master tracks?

   The answer is "no" -- but it's a frustrating no. All of those
   work format files solve the hardest problems in audio compression.
   The "audio scene" is separated for you for free, into little
   tracks. Any sound, effect, or composition that can be expressed
   in a compact form (via an algorithm) is completely modeled and
   parameterized for you, sitting right there in the MIDI file or
   in the algorithms the reverb runs.

   And yet, you can't really use any of it to make the MP3 smaller.

Basically, a big button on the ProTools screen, that dumps out a
Structured Audio representation of the mix, is the killer compression
app. The "microphone" tracks would be handled by special-purpose
SAOL programs that do single-source compression. Everything else is
just an algorithm. Algorithms that are naturally data heavy (like
samples) you'd want to convert to a physical model, or else do
sample-specific compression on the fly.
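
To put rough numbers on the "everything else is just an algorithm"
point, here is a minimal Python sketch. It is not SAOL and not part of
any MPEG-4 toolchain: the three-note SCORE, the plain sine oscillator,
and the render() and score_bytes() helpers are all made up for
illustration. It compares the size of a tiny score against the PCM
audio that score renders to, and it deliberately ignores the (fixed)
cost of shipping the synthesis algorithm itself.

```python
import math
import struct

SAMPLE_RATE = 44100  # samples per second, 16-bit mono PCM (assumed)

# Hypothetical "structured" description of a short passage:
# (start_sec, duration_sec, frequency_hz, amplitude)
SCORE = [
    (0.0, 1.0, 440.0, 0.5),
    (1.0, 1.0, 554.37, 0.5),
    (2.0, 2.0, 659.25, 0.5),
]

def render(score, sample_rate=SAMPLE_RATE):
    """Render the score to 16-bit mono PCM bytes with a plain sine oscillator."""
    total = max(start + dur for start, dur, _, _ in score)
    n = int(round(total * sample_rate))
    buf = [0.0] * n
    for start, dur, freq, amp in score:
        first = int(round(start * sample_rate))
        count = int(round(dur * sample_rate))
        for i in range(count):
            if first + i < n:
                buf[first + i] += amp * math.sin(2 * math.pi * freq * i / sample_rate)
    # Clamp to [-1, 1] before converting to signed 16-bit integers.
    clipped = [max(-1.0, min(1.0, s)) for s in buf]
    return struct.pack("<%dh" % n, *(int(s * 32767) for s in clipped))

def score_bytes(score):
    """Size of the score if each event is stored as four 32-bit floats."""
    flat = [value for event in score for value in event]
    return len(struct.pack("<%df" % len(flat), *flat))

pcm = render(SCORE)
structured_size = score_bytes(SCORE)
pcm_size = len(pcm)
print(structured_size, "bytes of score vs", pcm_size, "bytes of PCM")
```

The rendered audio is thousands of times larger than the score that
produced it, and the sender and receiver get bit-identical output only
because both run the same rendering code, which is the role a normative
decoder plays in Structured Audio.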

Because Structured Audio is normative, and an ISO standard, the
receiver will really hear what the sender hears, as long as the
decoder is compliant with the standard and is careful to avoid
some of the rough edges in the standard (like a few of the
non-normative opcodes, issues with machines that aren't using
IEEE floating point, etc).

Of course, a tremendous amount of rethinking and work would need
to happen to make ProTools do this, but note there isn't very
much rocket science involved, if any -- it's the sort of work that,
say, all of the engineers who wrote the MIDI SysEx commands for
every MIDI box over the last 20 years did. A huge body of work if
you think about it as a single programming project, but every
product did its own little bit, and as a result a patch editor
can edit 20 years of rack-mounts.

-------------------------------------------------------------------------
John Lazzaro -- Research Specialist -- CS Division -- EECS -- UC Berkeley
lazzaro [at] cs [dot] berkeley [dot] edu www.cs.berkeley.edu/~lazzaro
-------------------------------------------------------------------------


