[linux-audio-dev] for the LAD archives ...


Subject: [linux-audio-dev] for the LAD archives ...
From: John Lazzaro (lazzaro_AT_CS.Berkeley.EDU)
Date: Tue Jun 13 2000 - 01:30:16 EEST


Hi everyone,

        Jörn Nettingsmeier requested that I post this to the
list, so that he can link it into the LAD website via the
archives. It's an email message I sent to him in response to
questions he had about Structured Audio and sfront.

                                                        --jl

--------------------------------------------------------------
--------------------------------------------------------------

[Jörn Nettingsmeier]

> could you please answer the list of questions below ?

[john lazzaro]

sure, here are the answers:

> * what is the latest version ?

sfront 0.61, released 05/28/00

> * any new features/design changes ?

Here's the list of feature/updates from the
short version of the sfront 0.61 announcement:

--

Announcing sfront version 0.61 5/28/00, a program that compiles MPEG 4 Structured Audio (MP4-SA) bitstreams into efficient C programs that generate audio when executed.

The major addition to 0.61 is Slib, a SAOL library to simplify low-level programming. Slib is described in a new chapter of the MP4-SA book.

In addition, sfront 0.61 includes tableread() optimizations, as well as fixes for bugs that involve: the dynamic instr command, the outbus command, the buzz opcode and wavetable generator, specialop expressions, opcode table initializations that use opcode parameters, send statements that target the output_bus, and user-defined opcode rate warnings.

--

The freshmeat page for sfront has similar blurbs for the major releases of sfront, see:

http://freshmeat.net/appindex/1999/09/07/936725079.html

and click on each announcement. More detailed change logs are also available on the sfront website, at:

http://www.cs.berkeley.edu/~lazzaro/sa/sfman/user/ref/index.html#change

(keep scrolling down to see earlier and earlier change logs ...).

> * is there a stable and end-user ready version ?

Yes, there's a group of composers & musicians that use it, both as a real-time and streaming decoder under Linux (the use most relevant to LAD) and as a batch processing tool. Here are some download stats from the Berkeley website to give a sense of the size of the userbase, although the "sfront binaries" column here should be discounted by about 20-30% to account for multiple GETs of the file, spiders, etc.

           sfront   main home   book home    all page
         binaries        page        page       views

mar99          66         103           0         103
apr99          17          79           0          79
may99          20         127          14         264
jun99          16          92           8         145
jul99          20         257          73         952
aug99         104         495         195        2459
sep99         686        1186         299        4644
oct99         171         459         174        2169
nov99         777        1256         334        5112
dec99         480         959         292        4236
jan00         976        1630         480        7205
feb00        1123        1918         560        8748
mar00         796        1209         438        6396
apr00         681        1376         425        5793
            -----       -----        ----       -----
total        5933       11146        3292       48305
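As a quick worked example of that 20-30% discount, here is the arithmetic applied to the "sfront binaries" total from the table. The helper name `discounted` is made up for this sketch, and the result is only a rough estimate, not a measured user count.

```python
# Total "sfront binaries" GETs for mar99 through apr00, taken from the table.
total_binary_gets = 5933

def discounted(count, fraction):
    """Reduce count by the given fraction and round to a whole download."""
    return round(count * (1 - fraction))

low = discounted(total_binary_gets, 0.30)   # heavier discount (30%)
high = discounted(total_binary_gets, 0.20)  # lighter discount (20%)
print(f"roughly {low} to {high} downloads after discounting")
```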

> * did you incorporate any concepts discussed on LAD ?

Not yet, but I hope to incorporate some of the ideas Paul and Benno brought up in the thread over the weekend.

> * what is the latest URL ? (some are already updated, just to make
> sure...)

http://www.cs.berkeley.edu/~lazzaro/sa/

> * do you wish to be listed with email address ? if yes, which ? (so
> far, no addresses are given)

lazzaro_AT_cs.berkeley.edu

> i will use this information to overhaul the links page.
> if you like, you can write your own project description, which will
> be quoted, provided it does not get much longer than 4 lines.

How's this:

Sfront compiles MPEG 4 Structured Audio (MP4-SA) bitstreams into efficient C programs that generate audio when executed. MP4-SA is a standard for normative algorithmic sound that combines an audio signal processing language (SAOL) with score languages (SASL, and the legacy MIDI File Format). Under Linux, sfront supports real-time, low-latency audio input/output and MIDI input from soundcards. The website includes an online book about MP4-SA.

Feel free to edit if it's too long ...

> the things i do not understand:

> is mp4 actually a MPEG standard or is this just a pun with mp3 ?
> if yes, is it based on mp3 ? my guess is not, i understand it's just
> the method of generating sound data on the fly, kind of
> resynthesis... correct ?

Structured Audio is an authentic part of MPEG 4 -- basically, think of MPEG 4 as a tree, with the top layer like this

MPEG 4

  1. Video (all sorts of video codecs)
  2. Audio (all sorts of audio codecs)
  3. Systems (combining video and audio into "bits on a wire").

There's a big international body that gets together every few months, and little by little works out the standards for all three parts. As parts get "done", they end up as ISO/IEC standards. The audio sub-tree looks like this (this is a bit out of date, a few other codecs now exist too):

MPEG 4

2. Audio

A. Main (AAC). This is the successor to MPEG-1/MPEG-2 Layer III audio, i.e. MP3. Like MP3, the idea is that it takes any sort of high-quality audio, compresses it psychoacoustically, and creates a bitstream of data. It's usually called by the name of one of its primary branches, AAC (Advanced Audio Coding).

B. Parametric Coding
C. CELP
D. Time/Frequency Coding

These three are like AAC, in that it's audio data in, compressed audio data out, but the codecs are specialized for certain kinds of data (for example, CELP is a speech codec) or are specialized to produce really low-bit-rate audio without sounding "all that bad" (unlike AAC, which is optimized for "really good sounding audio, as small as we know how to do").

F. Text-To-Speech

This is a "synthetic" part -- this means the input isn't an uncompressed audio data stream, but something else. In this case, the input to the encoder is text, along with prosody (how the voice is supposed to change pitch), and this gets encoded into the bitstream so that the decoder can convert it into synthetic speech.

E. Structured Audio

This is what sfront handles. Structured Audio is also synthetic, in that the input to the encoder isn't uncompressed audio data. Instead, it's a computer program in a general-purpose language (SAOL) that has lots of semantic features to make audio processing easy to express, as well as a library. A Structured Audio "encoder" merely takes the program, along with "data" that you can think of as a score language (two are supported, MIDI and a more powerful language SASL) or as waveform data, and encodes the program and data into a binary format. The "decoder" executes the program.
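To make the program-plus-score idea concrete, here is a toy sketch in Python. It is not SAOL or SASL and has nothing to do with how sfront works internally: the names `sine_instr`, `score`, and `decode`, the 8000 Hz sample rate, and the two-note score are all invented for illustration. The "program" is a function that makes sound from parameters, the "score" is a list of events, and the "decoder" just runs the program on each event.

```python
import math

SAMPLE_RATE = 8000  # samples per second; an arbitrary choice for this sketch

def sine_instr(freq_hz, dur_s, amp=0.5):
    """The 'program': turn one note's parameters into a list of samples."""
    n = int(dur_s * SAMPLE_RATE)
    return [amp * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]

# The 'score': one (start time in s, frequency in Hz, duration in s) per note.
score = [(0.0, 440.0, 0.5), (0.5, 660.0, 0.5)]

def decode(score, instr):
    """The 'decoder': run the instrument for each event and mix the results."""
    end = max(start + dur for start, _, dur in score)
    out = [0.0] * int(end * SAMPLE_RATE)
    for start, freq, dur in score:
        offset = int(start * SAMPLE_RATE)
        for i, sample in enumerate(instr(freq, dur)):
            if offset + i < len(out):  # guard against float rounding at the end
                out[offset + i] += sample
    return out

audio = decode(score, sine_instr)
print(len(audio), "samples rendered from", len(score), "score events")
```

The thing to notice is that what gets shipped is the small program and the short score, not the rendered samples.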

Sfront can do both encoding and decoding. The decoder can take as input not only any SASL or MIDI score data packed into the MP4 file, but "real-time" input from microphones and MIDI controllers (or SASL controllers, if such things ever exist). Thus the link with real-time LAD stuff. Part of the idea of supporting real-time modes is that you can "record" a live playing session as the SASL or MIDI data inputted into the decoder. You can then pack up that data along with the SAOL program, and voila! -- potentially very high compression ratios -- much higher than taking the audio and compressing it with MPEG 4 AAC or MP3.
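A rough back-of-the-envelope sketch of why the ratios can get so high. Every number here is an assumption chosen for illustration, not a measurement: three minutes of 44.1 kHz 16-bit stereo audio, a perceptual coder running at 128 kbit/s, and a purely hypothetical 50,000-byte bundle of SAOL program plus recorded score data. The function names are made up as well.

```python
def pcm_bytes(seconds, sample_rate=44100, channels=2, bytes_per_sample=2):
    """Size of uncompressed PCM audio in bytes."""
    return seconds * sample_rate * channels * bytes_per_sample

def coded_bytes(seconds, bits_per_second):
    """Size of a constant-bitrate coded stream in bytes."""
    return seconds * bits_per_second // 8

seconds = 180                                  # assumed: a three-minute performance
raw = pcm_bytes(seconds)                       # uncompressed reference size
perceptual = coded_bytes(seconds, 128_000)     # assumed: 128 kbit/s coder
structured = 50_000                            # HYPOTHETICAL program + score size

print(f"perceptual coder: {raw / perceptual:.1f} : 1")
print(f"program + score:  {raw / structured:.1f} : 1")
```

The real ratio depends entirely on how large the program and the recorded score turn out to be, so treat the last figure as a placeholder.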

Structured Audio has made it to "FDIS" status in the standardization process -- that's Final Draft International Standard (to be specific, Structured Audio is ISO/IEC 14496-3 section 5). What's left to do is Corrigenda (i.e. bugfixes) and Compliance, i.e. how to verify that an SA decoder is "normatively compliant". The point of compliance is that two different decoders that are both compliant will produce audio that essentially "sounds the same", so SA really can be thought of as a compression scheme, just like a natural codec such as MPEG 4 AAC.

BTW, I'm not an MPEG "member", I didn't have anything to do with the official standards process -- Eric Scheirer of the MIT Media Lab is the main author of SA.

> should i list sfront or the mp4 method ?

One thing you could imagine doing is under the "Miscellaneous Resources" do:

MPEG-4 Structured Audio: Developer Tools. An online book about the audio signal processing language that is a part of the MPEG 4 standard.

And then under "Miscellaneous Software by LAD Contributors" have this blurb or a shortened version:

Sfront compiles MPEG 4 Structured Audio (MP4-SA) bitstreams into efficient C programs that generate audio when executed. MP4-SA is a standard for normative algorithmic sound that combines an audio signal processing language (SAOL) with score languages (SASL, and the legacy MIDI File Format). Under Linux, sfront supports real-time, low-latency audio input/output and MIDI input from soundcards.

Probably both should use the same link,

http://www.cs.berkeley.edu/~lazzaro/sa/

since this page serves as the entrance page for both.

> all the best, and thanks in advance for any enlightenment !

Happy to help -- basically, it's really hard to explain Structured Audio simply, because it's so many things to so many people. For example, one thing you can do with Structured Audio is write your own MP3 decoder _in_ Structured Audio, and pack the data for a song in the SASL score. Or for that matter, any other decoding technique that might be better than MP3, which can be a customized algorithm for the song. There are bunches of apps like this, each of which seemingly needs its own explanation of Structured Audio to make sense (to take another example, the interactive part of SA means you can use it to do long-distance audiology testing).

--jl

-------------------------------------------------------------------------
John Lazzaro -- Research Specialist -- CS Division -- EECS -- UC Berkeley
lazzaro [at] cs [dot] berkeley [dot] edu     www.cs.berkeley.edu/~lazzaro
-------------------------------------------------------------------------


