[gst-devel] floating point audio/raw

Thu Apr 19 22:05:28 CEST 2001

I was looking around for information on floating-point
audio data and found this little tidbit:
http://developer.apple.com/techpubs/quicktime/qtdevdocs/REF/refSlopeInt.htm
and I thought this would be a better way to represent
float audio/raw data capabilities than defining a max
and min.

Basically the boundary of the float data is defined by
an intercept value and a slope value.  The "intercept"
is analogous to a DC offset in analogue audio signals
- it is the value that the signal "centres" on
(usually 0.0).

The "slope" is how far the signal can deviate from the
intercept in either direction.  So a slope of 1.0 and
an intercept of 0.0 would mean an audio signal with
values that range from -1.0 to 1.0.

Other examples might be:
0.0 - 1.0 (intercept=0.5, slope=0.5)
Spinal Tap scale, 0.0 - 11.0 (intercept=5.5, slope=5.5 :)
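
To make the relationship with max/min concrete, here is a
rough sketch of the conversion in both directions.  The
FloatRange type and both function names are made up for
illustration; they are not part of any existing GStreamer
API.

```c
#include <assert.h>

/* Hypothetical helper type: the value range implied by the caps. */
typedef struct {
  float min;
  float max;
} FloatRange;

/* Derive min/max bounds from an intercept and a slope. */
static FloatRange
float_range_from_caps (float intercept, float slope)
{
  FloatRange r;

  assert (slope >= 0.0f);   /* slope is a distance, so never negative */
  r.min = intercept - slope;
  r.max = intercept + slope;
  return r;
}

/* Recover intercept (midpoint) and slope (half-width) from min/max. */
static void
float_range_to_caps (FloatRange r, float *intercept, float *slope)
{
  *intercept = (r.max + r.min) / 2.0f;
  *slope     = (r.max - r.min) / 2.0f;
}
```

For example, float_range_from_caps (0.5f, 0.5f) gives the
0.0 - 1.0 range from the list above.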

It may seem more natural to represent this as a max
and min value, but I think it is more useful to have
the slope and intercept values in hand when actually
programming elements.

Someone on this list suggested that we could simply
specify that all audio data should be bounded
between -1.0 and 1.0.  After all, that is what the
LADSPA spec says.  I would like to suggest that we put
no fixed limit on the intercept or slope of
floating-point data.  This can lead to a more
efficient pipeline by avoiding unnecessary scaling.

Consider a pipeline consisting of:
osssrc->int2float->foo->float2int->osssink

During caps negotiation, foo accepts whatever slope and
intercept it is given and does its processing
accordingly.  When int2float converts from int to
float it just does a cast - no scaling.  When
float2int converts from float to int, it knows from
caps negotiation that no scaling is required, so it
just does a cast and that is the end of it.

Now if foo were a LADSPA plugin that needs an intercept
of 0.0 and slope of 1.0, it would say so during
caps negotiation and int2float/float2int would scale
as required.
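
As a minimal sketch of what the scaling in
int2float/float2int could look like, assuming signed 16-bit
integer samples (the post does not fix a width) and
truncating rather than rounding on the way back.  The
function names and the INT16_SCALE macro are hypothetical,
not taken from the actual plugins.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: integer samples are signed 16-bit. */
#define INT16_SCALE 32768.0f

/* Map an int16 sample into the negotiated float range. */
static float
sample_int16_to_float (int16_t sample, float intercept, float slope)
{
  return intercept + slope * ((float) sample / INT16_SCALE);
}

/* Map a float sample back to int16, clamping out-of-range values. */
static int16_t
sample_float_to_int16 (float value, float intercept, float slope)
{
  float scaled;

  assert (slope > 0.0f);    /* a zero slope would divide by zero */
  scaled = (value - intercept) / slope * INT16_SCALE;
  if (scaled > 32767.0f)
    scaled = 32767.0f;
  if (scaled < -32768.0f)
    scaled = -32768.0f;
  return (int16_t) scaled;  /* truncates towards zero */
}
```

With intercept=0.0 and slope=32768.0 the arithmetic cancels
out and both functions reduce to a plain cast, which is the
no-scaling case described for the first pipeline.  With
intercept=0.0 and slope=1.0 they produce the -1.0 to 1.0
range that a LADSPA plugin would ask for.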

Another minor source of optimisation occurs when an
effect changes the amplitude of the signal as an
unwanted side-effect.  Normally the element would have
to scale the signal back to the desired amplitude
before passing the data on.  In this instance the
element could just adjust the caps to reflect the new
amplitude.  This would mean that actually scaling the
signal would only happen once in a long chain at the
float2int element.
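
To illustrate, suppose the side-effect is a known constant
gain.  Instead of rescaling every sample, the element could
compute new caps values like this.  This is only a sketch:
the FloatCaps struct and the function name are invented
here, and how the new values would actually be renegotiated
downstream is left open.

```c
#include <assert.h>

/* Hypothetical stand-in for the float-related caps properties. */
typedef struct {
  float intercept;
  float slope;
} FloatCaps;

/* Describe the output of an effect that multiplied every sample by
 * `gain`, without touching the sample data itself. */
static FloatCaps
float_caps_after_gain (FloatCaps in, float gain)
{
  FloatCaps out;

  assert (in.slope >= 0.0f);
  out.intercept = in.intercept * gain;
  /* The slope is a distance, so a negative (inverting) gain must
   * not make it negative. */
  out.slope = in.slope * (gain < 0.0f ? -gain : gain);
  return out;
}
```

So an effect that halves the amplitude of a signal with
intercept=0.0 and slope=1.0 would simply advertise
intercept=0.0 and slope=0.5 on its source pad.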

My current caps factory for the int2float plugin looks
like this:
static GstPadTemplate*
int2float_src_factory (void)
{
  /* Source pad template: mono float audio centred on 0.0,
   * ranging from -1.0 to 1.0 */
  return
    gst_padtemplate_new (
      "src",
      GST_PAD_SRC,
      GST_PAD_ALWAYS,
      gst_caps_new (
        "int2float_src",
        "audio/raw",
        gst_props_new (
          "format",     GST_PROPS_STRING ("float"),
          "intercept",  GST_PROPS_FLOAT (0.0),
          "slope",      GST_PROPS_FLOAT (1.0),
          "channels",   GST_PROPS_INT (1),
          NULL)),
      NULL);
}

So having said all of the above I am currently
hardcoding to an intercept of 0.0 and a slope of 1.0 -
but it won't stay like this for long!

One final issue is the "layout" property.  The vast
majority of float data will just be gfloats.  It would
be nice if we could just do:
"layout", GST_PROPS_STRING ("gfloat")
instead of
"layout", GST_PROPS_STRING ("IEEE-754 32-bit")
or some such.

This would leave us with the following layouts:
"gfloat"
"gdouble"
"whatever obscure layout that hardware/file format
uses"

So, any thoughts, people?

cheers
