[gst-devel] mad decoder plugin : only 16bit audio, and no dither

Sun Jun 27 05:47:07 CEST 2004

Hi people,

In my quest for a high-quality audio player, I came across a few
interesting programs that use gstreamer for the actual decoding and for
piping the audio to the soundcard.
It seems like a good way to centralize a big part of the coding effort.

But then I investigated inside gstreamer a bit, to see for sure how the
audio data was handled. I found that:

1. 16-bit raw audio data seems to be the norm.

That wouldn't be surprising in video apps, which can get short on bandwidth
and may not care too much about sound quality. But I expect my audio
apps to handle samples with extra bits, so that they can mix
several channels into one, apply FX, etc. without adding degradation
at each and every step.
For instance, the JACK audio connection kit works with 32-bit precision
at every step. That keeps things simple, and allows it to deliver optimal
precision to the soundcard.
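
To illustrate the headroom argument, here is a minimal sketch in C. The
function names (mix_s16_to_s32, mix_s16_clip) are made up for this mail and
are not gstreamer or JACK API. It only shows that summing two 16-bit
streams needs a wider type if you don't want to clip or rescale at every
step:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: mix two 16-bit streams into a 32-bit buffer.
 * The sum of two int16_t values always fits in an int32_t, so nothing
 * is lost and later stages (gain, FX) still have headroom. */
static void mix_s16_to_s32(const int16_t *a, const int16_t *b,
                           int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int32_t)a[i] + (int32_t)b[i];
}

/* Hypothetical helper: the same mix forced back into 16 bits.
 * Anything outside [-32768, 32767] has to be clipped, which is
 * exactly the per-step degradation described above. */
static int16_t mix_s16_clip(int16_t a, int16_t b)
{
    int32_t s = (int32_t)a + (int32_t)b;
    if (s > INT16_MAX)
        s = INT16_MAX;
    if (s < INT16_MIN)
        s = INT16_MIN;
    return (int16_t)s;
}
```

With two loud samples such as 30000 and 10000, the 32-bit mix keeps the
true sum while the 16-bit version has to saturate.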

Looking at the mail archives, gstreamer does provide a float audio type
(maybe it's not as straightforward for app developers to use as the int
type).
Anyway, that's those developers' responsibility, not gstreamer's.

But then gstreamer's mad decoder plugin also forces 16-bit output.
Why is that?
The mad decoder library takes great care to decode the mp3 bitstream
with 32-bit precision; it's a waste not to use it.

2. no dither
Looking at the mad decoder plugin, the 32-bit samples are simply
rounded to 16-bit ints.
That's simply a mistake, from a signal theory point of view.
Bit depth should only be reduced with proper dither applied
(to make the quantization linear, on average).
The scale function in gstmad.c should include a dither step (just as
minimad.c, the MAD usage sample it was taken from, suggests).
There are free C implementations of good random number generators (e.g.
Mersenne twisters) around, so the only difficulty here is to understand
what dither is, and what amplitude of noise should be added.
(I can provide a simple dither.c with a Mersenne twister and a ready-to-use
dither function if needed.)
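
As a rough idea of what such a dither step could look like, here is a
sketch and not the actual gstmad.c code. The names scale_dithered,
lcg_next and FRACBITS are made up; FRACBITS = 28 is assumed to mirror
libmad's MAD_F_FRACBITS. A tiny linear congruential generator stands in
for a better generator such as a Mersenne twister, and the noise is
triangular (TPDF) with a peak-to-peak amplitude of 2 output LSBs, which
is one common choice rather than the only one:

```c
#include <assert.h>
#include <stdint.h>

#define FRACBITS  28                          /* assumption: same as MAD_F_FRACBITS */
#define FIXED_ONE ((int64_t)1 << FRACBITS)    /* fixed-point representation of 1.0 */

/* Small LCG, used here only as a stand-in for a good random generator. */
static uint32_t lcg_state = 1u;

static uint32_t lcg_next(void)
{
    lcg_state = lcg_state * 1664525u + 1013904223u;
    return lcg_state;
}

/* Hypothetical replacement for a minimad-style scale(): convert a
 * fixed-point sample to a signed `bits`-bit integer, adding TPDF
 * dither before the low bits are thrown away. */
static int32_t scale_dithered(int32_t sample, int bits)
{
    assert(bits >= 2 && bits <= 24);

    int shift = FRACBITS + 1 - bits;          /* number of bits discarded */

    /* Two uniform values, each spanning one output LSB; taken from the
     * high bits of the LCG, which are the better ones. Their
     * difference has a triangular distribution spanning +/- 1 LSB. */
    int32_t r1 = (int32_t)(lcg_next() >> (32 - shift));
    int32_t r2 = (int32_t)(lcg_next() >> (32 - shift));

    /* dither + round, in 64 bits so the additions cannot overflow */
    int64_t v = (int64_t)sample + r1 - r2 + ((int64_t)1 << (shift - 1));

    /* clip to [-1.0, 1.0) */
    if (v >= FIXED_ONE)
        v = FIXED_ONE - 1;
    else if (v < -FIXED_ONE)
        v = -FIXED_ONE;

    /* quantize; like minimad.c, this relies on >> being an arithmetic
     * shift for negative values, which is implementation-defined in C */
    return (int32_t)(v >> shift);
}
```

Calling scale_dithered(sample, 16) for every decoded sample gives 16-bit
output. A constant input sitting between two 16-bit levels then comes
out as a mix of the neighbouring levels whose average matches the
input, instead of always snapping to the same level.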

In fact, ideally the mad plugin shouldn't have to requantize at all: it
should output the full 32 bits of the decoded samples, which gstreamer
would carry along down to the final destination sink. There, the
signal would be dithered and requantized according to the soundcard's
desired bit depth.

But then, are the other gstreamer audio components adding dither when
they should?
Dither is required:
- any time the bit depth is changed, up or down (e.g. if you make a 16-bit
stream out of an 8-bit one, you should add triangular noise with
peak-to-peak amplitude = 10 bits)
- after resampling (the right way to resample is to compute the FFTs
with more precision than the signal, then add dither and requantize)

Dither is an important part of digital audio, that's a fact.
This is my first contact with gstreamer, so I might be speaking
nonsense, but at first glance, gstreamer's handling of audio seems a bit
gauche.
-- 
Samuel