A question about the decodebin logic for downstream caps

Thu Nov 1 08:05:39 PDT 2012

Hello,

I have noticed a problem with the way decodebin sets up the 
caps of its src pads. Their template caps are set as "ANY".

Now, the mpg123audiodec element can decode to a multitude of formats. The 
idea was to let mpg123audiodec see what downstream prefers, and use that 
for its srcpad. So, if for example a downstream element can only accept 
F32LE data, then that's what the decoder delivers. Since mpg123 has 
multiple code paths for various formats, it seemed illogical to limit 
the src template to one output format. mpg123 itself does not convert 
anything, since the MPEG standard does not mandate any specific format. 
Upstream knows about sample rate and number of channels, but not about 
any specific format.

This works well with simple pipelines, like this:

gst-launch-1.0 filesrc location=/path/to/song.mp3 ! mpegaudioparse ! 
mpg123audiodec ! "audio/x-raw, format=(string)F32LE" ! fakesink

This simulates a downstream element that can only accept F32LE. As 
expected, mpg123audiodec then fixates its srcpad caps to F32LE, and all is fine.

This, on the other hand, doesn't work:

gst-launch-1.0 filesrc location=/path/to/song.mp3 ! decodebin ! 
"audio/x-raw, format=(string)F32LE" ! fakesink

Looking at the gst-launch -v output, it turns out that the decodebin 
srcpad uses S16LE as the format. The reason is this: mpg123audiodec 
looks at its peer pad, which in the decodebin case is a ghost pad with 
caps set to "ANY". It then checks whether S16LE is allowed by downstream. 
Since the ghost pad essentially says that any format is OK, S16LE is chosen.
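
To make the failure mode concrete, here is a toy model in Python of the 
choice described above. It is only a sketch: it is not the actual 
mpg123audiodec code and uses no GStreamer API. The names choose_format and 
DECODER_FORMATS, and the order of formats after S16LE, are made up for 
illustration.

```python
# Toy model of the format choice; NOT the GStreamer API.
ANY = None  # stands in for "ANY" caps: no restriction at all

# Hypothetical preference order. Only "S16LE is checked first" comes from
# the observed behaviour; the rest of the list is an assumption.
DECODER_FORMATS = ["S16LE", "F32LE", "S32LE"]


def choose_format(peer_allowed, decoder_formats=DECODER_FORMATS):
    """Return the first decoder format the peer pad accepts, or None."""
    for fmt in decoder_formats:
        if peer_allowed is ANY or fmt in peer_allowed:
            return fmt
    return None


# Direct link to the capsfilter: the peer only allows F32LE.
print(choose_format({"F32LE"}))  # F32LE
# Inside decodebin: the peer is a ghost pad reporting ANY.
print(choose_format(ANY))        # S16LE
```

In the second call the restriction that sits further downstream never 
reaches the decoder, so the first candidate wins.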

To get F32LE out of this pipeline, an audioconvert element is necessary:

gst-launch-1.0 filesrc location=/path/to/song.mp3 ! decodebin ! 
audioconvert ! "audio/x-raw, format=(string)F32LE" ! fakesink

But that defeats the whole purpose of multiple formats in the 
mpg123audiodec srccaps!

Any ideas on how to solve this?