Concept for a universal auto converter bin element

Fri Jul 20 12:53:22 UTC 2018

Hi Folks,

I'd like some feedback on a concept I'm developing, and I want to know if 
anything similar has been discussed previously.

I want to make a flexible conversion element that is able to take an 
input frame and convert it into one or more output formats in the most 
efficient manner possible.

Here is a series of examples:

1. An input stream of video/x-raw,width=1920,height=1080,format=YUY2 and 
two request pads, one with video/x-raw,width=854,height=480,format=GRAY8 
and another with video/x-raw,width=427,height=240,format=NV12.

Perhaps the most efficient conversion would be to use a videoscale 
element to make a 480p YUY2 intermediate, then use a videoconvert to 
convert it to GRAY8. Then, for the 240p NV12 image, add a second 
videoscale and videoconvert fed from the 480p YUY2 intermediate.
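
To make the idea concrete, here is a rough sketch of the graph the bin 
might build internally for example 1, written as a gst-launch-1.0 style 
description. It is only an illustration and has not been tested against 
any particular GStreamer version: videotestsrc and fakesink are 
stand-ins for the real source and the request pads, the tee name "t" and 
the helper build_description() are made up for this sketch, and the caps 
use the "format" field of video/x-raw.

```python
# Sketch only: builds a gst-launch-1.0 style description for example 1.
# videotestsrc/fakesink are placeholders; build_description() is hypothetical.

SRC_CAPS = "video/x-raw,width=1920,height=1080,format=YUY2"


def build_description():
    # Shared stage: scale 1080p YUY2 down to a 480p intermediate once,
    # then fan it out to both outputs with a tee.
    shared = [
        "videotestsrc",
        SRC_CAPS,
        "videoscale",
        "video/x-raw,width=854,height=480",
        "tee name=t",
    ]
    # Branch 1: colour-convert the 480p intermediate to GRAY8.
    gray_branch = [
        "t.",
        "queue",
        "videoconvert",
        "video/x-raw,format=GRAY8",
        "fakesink",
    ]
    # Branch 2: scale the 480p intermediate down to 240p and convert to NV12.
    nv12_branch = [
        "t.",
        "queue",
        "videoscale",
        "videoconvert",
        "video/x-raw,width=427,height=240,format=NV12",
        "fakesink",
    ]
    chains = [shared, gray_branch, nv12_branch]
    return " ".join(" ! ".join(chain) for chain in chains)


if __name__ == "__main__":
    print(build_description())
```

The printed string could in principle be passed to gst-launch-1.0 or 
gst_parse_launch(), but the point here is only the shape of the graph: 
one shared scaler, then one cheap branch per output.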

2. A similar principle would apply to framerate caps.

3. A similar principle would apply to encoded video. If the caps 
required an H.264 output, the element would automatically instantiate an 
encoder chain. Likewise, it would instantiate a decoder chain for H.264 
input, and a pass-through where the input and output are both coded in 
H.264.
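
The decision in point 3 could be sketched roughly as follows. This is 
not an existing GStreamer API: plan_codec_steps() is a hypothetical 
helper that looks only at the media type, and it assumes that a 
pass-through is acceptable whenever the input and output media types 
match (in practice the remaining caps, such as resolution or profile, 
would have to match as well).

```python
# Sketch only: decide which codec stages are needed between two media types.
# plan_codec_steps() is a hypothetical helper, not part of GStreamer.

RAW = "video/x-raw"


def plan_codec_steps(input_type, output_type):
    # Same media type on both sides: no codec work at all.
    if input_type == output_type:
        return ["passthrough"]
    steps = []
    # Encoded input must be decoded to raw video first.
    if input_type != RAW:
        steps.append("decode " + input_type)
    # Encoded output needs an encoder after any raw processing.
    if output_type != RAW:
        steps.append("encode " + output_type)
    return steps


if __name__ == "__main__":
    print(plan_codec_steps("video/x-h264", "video/x-h264"))
    print(plan_codec_steps("video/x-h264", RAW))
    print(plan_codec_steps(RAW, "video/x-h264"))
```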

4. This bin element would support multiple inputs - which are assumed to 
contain different representations of the same frame. For example, if 
1080p and 720p versions of an image are provided at the input, and 480p 
images are required at the output, the bin will automatically use the 
720p image as the source for video scaling. An algorithm would be used 
to estimate the minimum effort path for the required conversion.
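
As a first approximation of that algorithm, here is a minimal sketch of 
how the source could be chosen for a scaling step. The cost model is an 
assumption for illustration only: scaling effort is taken to grow with 
the number of input pixels, and downscaling is preferred over upscaling. 
pick_source() is a hypothetical helper, and a real implementation would 
also have to weigh format conversion, decoding and hardware paths.

```python
# Sketch only: choose which input representation to scale from.
# Assumed cost model: effort grows with the number of input pixels,
# and downscaling is preferred over upscaling.


def pick_source(inputs, target):
    """inputs: list of (width, height) tuples; target: (width, height)."""
    target_w, target_h = target
    # Candidates that can be downscaled (or passed through) to the target.
    big_enough = [(w, h) for (w, h) in inputs if w >= target_w and h >= target_h]
    if big_enough:
        # Smallest adequate input means the fewest pixels to read.
        return min(big_enough, key=lambda size: size[0] * size[1])
    # Nothing is large enough: upscale from the largest input available.
    return max(inputs, key=lambda size: size[0] * size[1])


if __name__ == "__main__":
    # 1080p and 720p inputs, 480p output: the 720p input is selected.
    print(pick_source([(1920, 1080), (1280, 720)], (854, 480)))
```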

5. If hardware acceleration is available for any of the conversion 
steps (decode, encode, color convert, or resize), the bin should 
auto-select it.

In other words, I'd like to have a universal converter element that will 
automatically instantiate the elements necessary to convert the input to 
the output caps. The concept would be similar to decodebin, but more 
extensive.

The intended application for this work is video analytics.

So here are my questions:

1. Has anything like this been discussed or written previously?

2. Does this concept sound reasonable? Or is this covered by other 
GStreamer functionality? Is there an element that fully supports this 
concept, or one I could extend?

3. If I did develop this concept, would it be of interest to the upstream 
GStreamer project? If so, should this work go into gst-plugins-bad, or 
somewhere else?

Thanks
Joel Holdsworth