linux-audio-dev: Re: [LAD] Replaygain for video?

From: Philipp Überbacher <hollunder@email-addr-hidden>
Date: Sat Dec 10 2011 - 22:42:07 EET

Excerpts from Adrian Knoth's message of 2011-12-10 15:01:54 +0100:
> On Sat, Dec 10, 2011 at 01:10:38PM +0100, Philipp Überbacher wrote:
>
> > > 2. Players like VLC have a normalize function. I don't know if it's
> > > one-pass (on the fly, more like automated gain control) or two-pass,
> > > but it basically solves your problem of audio levels at playback
> > > time.
> >
> > Counter argument:
> > If normalization would be sufficient, there wouldn't have been a
> > need for replaygain in the first place. Explanation on
> > http://www.replaygain.org/.
>
> Huh? I cannot find anything on this website that says "normalization is
> not sufficient":
>
> "Although music is encoded to a digital format with a clearly defined
> maximum peak amplitude, and although most recordings are normalized
> to utilize this peak amplitude, not all recordings sound equally loud.
> This is because once this peak amplitude is reached, perceived loudness
> can be further increased through signal-processing techniques such as
> dynamic range compression and equalization"

The page tells you that peak normalization is not sufficient. That's
nothing new: the highest peak has basically no relation to the
perceived loudness of a piece of audio.
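
To illustrate the point, here is a minimal sketch with two made-up
signals that share the same peak but differ greatly in average level.
RMS is used here only as a crude stand-in for loudness; it is not a
perceptual measure, and the signals are purely illustrative.

```python
import math

def peak(samples):
    """Highest absolute sample value."""
    return max(abs(s) for s in samples)

def rms(samples):
    """Root mean square level, a rough proxy for signal energy."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

n = 1000
# Full-scale square wave: every sample sits at the peak.
square = [1.0 if (i // 50) % 2 == 0 else -1.0 for i in range(n)]
# Very quiet signal with a single full-scale click in the middle.
click = [0.01] * n
click[500] = 1.0

print("peaks:", peak(square), peak(click))
print("rms:  ", rms(square), rms(click))
```

Both signals are already "peak normalized", yet the square wave carries
far more energy than the click signal.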

> I have to admit I'm not too familiar with the details of replaygain, but
> I'm not aware that it actually does compression or equalization. In
> contrast, they explicitly state:
>
> "The player reads the corresponding gain metadata value from the file
> and scales the audio data as appropriate. Scaling the audio data simply
> means multiplying each sample value by a constant value."
>
> That's gain, nothing more, nothing less.

Exactly, no equalisation, no compression, that's why purists like it.
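
As a small sketch of what "multiplying each sample value by a constant"
means in practice, assuming the gain is given in dB and the samples are
floats in the range -1.0 to 1.0 (the function names are hypothetical,
not taken from any player):

```python
def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)

def apply_gain(samples, gain_db):
    """Scale every sample by the same constant factor."""
    factor = db_to_linear(gain_db)
    return [s * factor for s in samples]

# Attenuating by 6 dB roughly halves the amplitude.
print(apply_gain([0.5, -0.5, 0.25], -6.0))
```

The waveform keeps its shape; only its overall level changes.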

> According to the wiki, they store four values:
>
> 1. Peak track amplitude
> 2. Peak album amplitude
> 3. Track replay gain
> 4. Album replay gain
>
> We can directly forget about the album values, since we're talking
> videos here.

I agree, album mode makes little sense unless we have music video albums
or similar.

> Let's assume we're brave and normalize to 0dBFS, then this is obviously
> the resulting peak track amplitude.
>
> Asking for additional replay gain would simply cause distortion.
>
> Long story short: only the peak track amplitude could be a useful
> information if you don't want to apply automatic gain control or
> read the entire file every time you play it just to determine the
> correct gain for normalization.

Wrong. In short, this is how (track) replaygain works:
1) Calculate the track's loudness. Loudness implies psychoacoustics, so
the algorithm takes human perception into account.
2) Compare the measured value to a reference (pink noise at -14 dB)
and store the difference as metadata.
3) The audio player reads the metadata and adjusts its output level
accordingly.

The result: levels of 'loud' songs get lowered to the reference level,
levels of 'soft' songs get raised to the reference level. No
compression, no EQ, no need to twist the volume knob all the time.
Again, the reference level is below 0 dBFS, so loud songs get attenuated.
Peak normalization, in contrast, just tries to make everything as loud as
possible and doesn't take human perception into account at all.
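
The three steps can be sketched roughly as follows. This is only an
illustration: the loudness measurement is a plain RMS level, which is an
assumption made to keep the example short and does not model human
perception the way the real replaygain analysis does. The function names
and metadata keys are hypothetical.

```python
import math

REFERENCE_DB = -14.0  # reference level as given in the text above

def loudness_db(samples):
    """Step 1 stand-in: RMS level in dB (no psychoacoustic weighting)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(max(rms, 1e-12))

def scan_track(samples):
    """Step 2: compare to the reference and return the values to store."""
    return {
        "track_gain_db": REFERENCE_DB - loudness_db(samples),
        "track_peak": max(abs(s) for s in samples),
    }

def play(samples, metadata):
    """Step 3: the player scales the samples by the stored gain."""
    factor = 10.0 ** (metadata["track_gain_db"] / 20.0)
    return [s * factor for s in samples]

loud = [0.9, -0.9] * 500    # made-up 'loud' track
soft = [0.01, -0.01] * 500  # made-up 'soft' track

for name, track in (("loud", loud), ("soft", soft)):
    meta = scan_track(track)
    out = play(track, meta)
    print(name, round(meta["track_gain_db"], 2), round(loudness_db(out), 2))
```

The loud track gets a negative gain and the soft one a positive gain,
and both end up at the reference level. The scan only has to run once,
since the player just reads the stored value.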

> > > There are special tools. If nothing helps, vlc and mplayer can do this.
> > > mplayer with -dumpvideo and -dumpaudio, vlc with the transcode commands.
> > Right, this should do. Is there an API for that?
>
> Sure, libavformat from ffmpeg or libav, whatever you prefer.

Ok, thanks.

> > > > After that, the scanning process should work as with any audio file.
> > > > Afterwards the calculated replaygain values have to be added to the
> > > > metadata of the file.
> > >
> > > Provided that the audio track supports meta data. Depending on the
> > > container, you can embed almost everything from pcm to ogg, mp3, mp4/aac
> > > and so on.
> > >
> > > Depending on the container, it might be possible to add the meta data to
> > > the muxed file.
> > Ah, I didn't think of adding metadata to the audio itself, another
> > possibility. However, adding it to the container would probably be more
> > universal.
>
> It's probably not that simple. If the container doesn't provide the
> possibility to add sane metadata, you'd be lost. Likewise, there might
> be multiple audio streams (stereo, multichannel, different languages).
> You'd need a way to relate to those substreams from within your global
> metadata.

This should be possible, shouldn't it? The player needs to be able to
identify those anyway.
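
One way to picture it, as a purely hypothetical layout and not any
container's actual metadata scheme: key the gain values by the stream
index the player already uses to select an audio stream. The indices
and gain values below are made up for illustration.

```python
def gain_for_stream(container_tags, stream_index, default_db=0.0):
    """Look up the stored gain for one audio stream, or fall back to no change."""
    entry = container_tags.get(stream_index)
    if entry is None:
        return default_db
    return entry.get("track_gain_db", default_db)

# Hypothetical container-level metadata: stream index -> gain values.
container_tags = {
    1: {"track_gain_db": -6.5},  # e.g. a stereo track
    2: {"track_gain_db": 2.0},   # e.g. a second language
}

print(gain_for_stream(container_tags, 1))
print(gain_for_stream(container_tags, 3))  # unscanned stream: no adjustment
```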

> It could be easier to work on the audio streams directly, but this would
> require re-muxing the file, causing subtle problems like A-V sync or the
> necessity to write arbitrary containers (MPEG, MPEG-TS, MP4, ogv...).
> ffmpeg/libav might help.
>
> Either way, it's complex. ;)

Yes, especially with all those containers and formats out there.
However, I'd start with one where it's reasonably easy to do :)

> [jackd connection handling]
> > Alex Stone, I guess :)
>
> Exactly.
>
> > If you have a good idea, please tell me. I'll have to find a team on
> > monday and it might help to have some good proposals.
>
> I'm still looking for somebody to rewrite hdspmixer. ;) While we're at
> it, how about an HTML based approach: you fire up your browser, either
> have a matrix mixer for everything or can select individual output
> buses, and then have the input/playback faders for this destination
> only. (I have a couple of details, if you want to go this route)

I don't have such a mixer. The HTML approach is probably outside the
scope of this class, since it's a beginner C/C++ class.

> Other idea: a P1722 streamer, no idea if Christoph Kuhr is still working
> on that.

I guess my teacher might like it, but I know basically nothing about
network stuff at this point.

> Or ask Paul if he needs some help with jack3. ;)
>
>
> Cheers

Haven't heard of that one yet and I doubt we need another implementation
:)

Thanks for your help,
regards,
Philipp

_______________________________________________
Linux-audio-dev mailing list
Linux-audio-dev@lists.linuxaudio.org
http://lists.linuxaudio.org/listinfo/linux-audio-dev
Received on Sun Dec 11 00:15:02 2011
