[gst-devel] Video 3D support

Martin Bisson martin.bisson at gmail.com
Wed May 26 03:34:58 CEST 2010


David Schleef wrote:
> On Tue, May 25, 2010 at 04:29:39PM +0300, Stefan Kost wrote:
>   
>>> I think that overall, using new fourccs would involve writing
>>> less code and be less prone to bugs.  It is my preference.
>>>   
>>>       
>> I honestly don't like it so much, as it scales badly :/ To be sure, you
>> mean that instead of
>> video/x-raw-yuv, format="I420" we do video/x-raw-yuv,
>> format="S420",layout="over/under" (yeah, crap). Or
>> video/x-raw-yuv-stereo, format="I420",layout="over/under" ?
>>     
>
> I meant "video/x-raw-yuv,format=S420,width=1280,height=720" for the
> native way that GStreamer would handle stereo video.  (Funny that you
> used S420, as that was exactly the method of mangling fourccs that
> I had in mind: I420 -> S420, UYVY -> SYVY, etc.)
>   
What about RGB stereo formats?
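To illustrate what I mean: in the 0.10 caps, video/x-raw-yuv carries a 
fourcc that can be mangled (I420 -> S420), but video/x-raw-rgb is described 
by bpp/depth/masks and has no fourcc at all, so the trick doesn't carry 
over.  A rough sketch (the extra "layout" field on the RGB side is purely 
hypothetical):

  #include <gst/gst.h>

  /* The fourcc mangling works for YUV... */
  static GstCaps *
  make_stereo_yuv_caps (void)
  {
    return gst_caps_new_simple ("video/x-raw-yuv",
        "format", GST_TYPE_FOURCC, GST_MAKE_FOURCC ('S', '4', '2', '0'),
        "width", G_TYPE_INT, 1280,
        "height", G_TYPE_INT, 720,
        NULL);
  }

  /* ...but video/x-raw-rgb has no fourcc to mangle, so stereo RGB would
   * need some other marker, e.g. a hypothetical extra field. */
  static GstCaps *
  make_stereo_rgb_caps (void)
  {
    return gst_caps_new_simple ("video/x-raw-rgb",
        "bpp", G_TYPE_INT, 24,
        "depth", G_TYPE_INT, 24,
        "endianness", G_TYPE_INT, G_BIG_ENDIAN,
        "red_mask", G_TYPE_INT, 0xff0000,
        "green_mask", G_TYPE_INT, 0x00ff00,
        "blue_mask", G_TYPE_INT, 0x0000ff,
        "layout", G_TYPE_STRING, "memory-consecutive", /* made up */
        NULL);
  }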
> And no "layout" property.  We already have a fourcc to indicate layout
> (as well as other stuff, under which rug we now also sweep stereo/mono).
>
> Side-by-side and top-bottom is a misunderstanding of what "native"
> means.  These are methods of packing two pictures into a single
> picture for the purposes of shoving it through software that only
> understands one picture.  We want a system that understands *two*
> pictures.
>   
I wouldn't say that these "are methods of packing two pictures into a 
single picture for the purposes of shoving it through software that 
only understands one picture."  I think it's more about how to organise 
the memory used to represent that combined picture, like planar RGB vs 
packed RGB, or YV16 vs YVYU and YUY2.  I agree that side-by-side and 
row-interleaved layouts end up being the same memory layout, so there 
should not be a distinction between the two.  There would only be a 
distinction, as you said, when you shove this combined image through 
software that only understands one picture.  But top-bottom (or 
memory-consecutive, or whatever name we choose) and side-by-side (or 
left-right, or row-interleaved, or ...) are, in my opinion, as 
different as packed vs planar layouts.  Does that make sense?
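To make the distinction concrete, here is a minimal sketch (the function 
names and the packed-format assumption are mine) of where the right-eye 
pixels end up in the two layouts:

  #include <stddef.h>

  /* Packed format, `stride` bytes per mono row, `height` rows per picture,
   * no padding assumed anywhere. */

  /* Memory-consecutive ("top-bottom"): the right picture starts one whole
   * picture after the left one. */
  static unsigned char *
  right_pixel_consecutive (unsigned char *buf, size_t stride, size_t height,
      size_t x_bytes, size_t y)
  {
    return buf + height * stride + y * stride + x_bytes;
  }

  /* Side-by-side (same bytes as row-interleaved): every combined row is a
   * left row followed by a right row, so the combined stride doubles. */
  static unsigned char *
  right_pixel_side_by_side (unsigned char *buf, size_t stride,
      size_t x_bytes, size_t y)
  {
    return buf + y * (2 * stride) + stride + x_bytes;
  }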
> As a data point, H.264 handles stereo by doubling the number of
> pictures in the stream and ordering them left/right/left/right.  The
> closest match in a GStreamer API would be to use buffer flags, but
> that's gross because a) we don't have any buffer flags available
> unless we steal them from miniobject, b) we still would need a
> field (stereo=true) in the caps, which would cause compatibility
> issues, c) some existing elements would work fine (videoscale),
> others would fail horribly (videorate).
>
> The second closest match is what I recommended: format=(fourcc)S420
> (as above), indicating two I420 pictures consecutive in memory.  A
> stereo H.264 decoder can be modified to decode to these buffers
> easily.
>   
Again, this would be viable, but it would involve choosing the layout, 
wouldn't it?
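For example, assuming S420 really is just two I420 pictures back to back 
(and ignoring any per-plane alignment or padding), splitting a buffer 
would look something like this, with 0.10-style API:

  #include <gst/gst.h>

  /* Hypothetical S420: left I420 picture immediately followed by the
   * right one.  One I420 picture is a full-size Y plane plus quarter-size
   * U and V planes (even width/height assumed). */
  static void
  split_s420 (GstBuffer * buf, gint width, gint height,
      guint8 ** left, guint8 ** right)
  {
    gsize picture_size = width * height + 2 * ((width / 2) * (height / 2));

    *left = GST_BUFFER_DATA (buf);
    *right = GST_BUFFER_DATA (buf) + picture_size;
  }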
> On the display side, I only really have experience with X output
> to shutter stereo goggles using OpenGL: you upload separate pictures
> for right and left, and the driver flips between them.  In this
> case, the packing in memory is only slightly important -- memory
> consecutive order would be easier to code, but the graphics engine
> could easily be programmed to handle top/bottom or side-by-side.
>
> HDMI 1.4, curiously, has support for half a dozen stereo layouts.
> This is a design we should strive to avoid.
>   
Wouldn't we want to support what is used elsewhere to improve 
compatibility (probably at the cost of greater complexity, I agree)?
> On the question of "How do I handle side-by-side video":  Use an
> element to convert to the native format.  Let's call it
> 'videotostereo'.  Thus, if you have video like on this page:
> http://www.stereomaker.net/sample/index.html, you would use
> something like:
>
>   filesrc ! decodebin ! videotostereo method=side-by-side !
>     stereoglimagesink
>   
Ok, from what I understand, this means that there is a file containing a 
normal video that is actually composed of two images placed side by side, 
and the videotostereo plugin is then informed, through the "method" 
property, that the incoming video has this layout.  Since we chose to put 
the images of the stereo stream in a memory-consecutive layout, 
videotostereo would "reorganise" the incoming buffer into a top-bottom 
outgoing buffer.  Then this buffer, having stereo caps (S420 for 
instance), would be sent to stereoglimagesink, which would do whatever it 
does.  Is that right?
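Something like the following is what I picture videotostereo doing for 
method=side-by-side (only one packed plane is shown, and the helper itself 
is of course hypothetical; a real I420 version would repeat this for the 
Y, U and V planes with their own widths):

  #include <string.h>
  #include <glib.h>

  /* Copy the left half of every input row into the first output picture
   * and the right half into the second, giving the memory-consecutive
   * layout. */
  static void
  side_by_side_to_consecutive (const guint8 * in, guint8 * out,
      gsize mono_row_bytes, gsize height)
  {
    gsize in_stride = 2 * mono_row_bytes;  /* left + right per input row */
    gsize picture_size = mono_row_bytes * height;
    gsize y;

    for (y = 0; y < height; y++) {
      /* left eye -> first picture */
      memcpy (out + y * mono_row_bytes, in + y * in_stride, mono_row_bytes);
      /* right eye -> second picture, one whole picture later */
      memcpy (out + picture_size + y * mono_row_bytes,
          in + y * in_stride + mono_row_bytes, mono_row_bytes);
    }
  }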
> Assuming you don't have stereo output, but want to watch one of
> the mono channels:
>
>   filesrc ! decodebin ! videotostereo method=side-by-side !
>     ffmpegcolorspace ! xvimagesink
>   
The stream coming out of videotostereo would have stereo caps, so this 
means that ffmpegcolorspace would have to be aware of the stereo caps.  
Is that what you meant?  The approach I was going to take was instead to 
have another plugin, something like stereotovideo, that would take a 
stereo stream and output it as a normal video, with whatever output 
layout (left-right, right-left, top-bottom, bottom-top, etc.).
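To make the two alternatives concrete, they might be built with 
gst_parse_launch() like this -- keeping in mind that videotostereo and 
stereotovideo are the hypothetical elements we are discussing, not 
existing plugins, and sample.avi is just a placeholder:

  #include <gst/gst.h>

  static GstElement *
  build_mono_pipeline (gboolean use_stereotovideo, GError ** error)
  {
    if (use_stereotovideo)
      /* An explicit element picks one eye (or a packed layout) out of the
       * stereo stream. */
      return gst_parse_launch ("filesrc location=sample.avi ! decodebin ! "
          "videotostereo method=side-by-side ! "
          "stereotovideo method=left ! xvimagesink", error);
    else
      /* ffmpegcolorspace itself would have to understand the stereo caps
       * and drop one view. */
      return gst_parse_launch ("filesrc location=sample.avi ! decodebin ! "
          "videotostereo method=side-by-side ! "
          "ffmpegcolorspace ! xvimagesink", error);
  }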
> Converting the above video to red/cyan anaglyph:
>
>   filesrc ! decodebin ! videotostereo method=side-by-side !
>     videofromstereo method=red-cyan-anaglyph !  xvimagesink
>
>   
Yeah, that's exactly what I meant (stereotovideo = videofromstereo), 
except that the method for videofromstereo could also be left, right, 
left-right, right-left, top-bottom or bottom-top (in addition to all 
kinds of anaglyph), instead of leaving that task to ffmpegcolorspace.
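For the red-cyan case, the simplest per-pixel mix I have in mind (assuming 
packed 24-bit RGB views of equal size; real anaglyph methods usually also 
apply a colour matrix) would be:

  #include <glib.h>

  /* Red channel from the left eye, green and blue from the right eye. */
  static void
  red_cyan_anaglyph (const guint8 * left, const guint8 * right, guint8 * out,
      gsize n_pixels)
  {
    gsize i;

    for (i = 0; i < n_pixels; i++) {
      out[3 * i + 0] = left[3 * i + 0];   /* R from left  */
      out[3 * i + 1] = right[3 * i + 1];  /* G from right */
      out[3 * i + 2] = right[3 * i + 2];  /* B from right */
    }
  }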

Another question I had is about more than two buffers.  Is it too soon 
to talk about this?  Should we focus on stereo for now?  What about n 
images in a stream?  Because, if I'm right, the two suggestions I've 
received for new caps are:

1) "video/x-raw-yuv-stereo , layout = { memory-consecutive , interleaved 
}, ..."

which is the one I'm most comfortable with, or

2) "video/x-raw-yuv , format = (fourcc) S420 , ..."

But what about something like :

3) "video/x-raw-yuv-multiple , channels = (int) [ 1 , MAX ] , ..."

or something like that...  Maybe with the "-multiple" suffix to avoid 
problems with existing plugins...  I haven't thought a lot about this 
one, but I just wanted to ask you what you think about the possibility 
of packing more than one (or two) images together.
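Just to show what I mean, option 3 might be built like this (the media 
type and the "channels" field are exactly as hypothetical as in the text 
above):

  #include <gst/gst.h>

  static GstCaps *
  make_multiview_caps (gint n_views)
  {
    return gst_caps_new_simple ("video/x-raw-yuv-multiple",
        "format", GST_TYPE_FOURCC, GST_MAKE_FOURCC ('I', '4', '2', '0'),
        "channels", G_TYPE_INT, n_views,
        "width", G_TYPE_INT, 1280,
        "height", G_TYPE_INT, 720,
        NULL);
  }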

Thanks,

Martin