<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type"> </head> <body bgcolor="#ffffff" text="#000000"> David Schleef wrote: <blockquote cite="mid:20100525222358.GA22372@cooker.entropywave.com" type="cite"> <pre wrap="">On Tue, May 25, 2010 at 04:29:39PM +0300, Stefan Kost wrote: </pre> <blockquote type="cite"> <blockquote type="cite"> <pre wrap="">I think that overall, using new fourccs would involve writing less code and be less prone to bugs. It is my preference. </pre> </blockquote> <pre wrap="">I Honestly don't like it so much as its badly scales :/ To be sure you mean instead of video/x-raw-yuv, format="I420" we do video/x-raw-yuv, format="S420",layout="over/under" (yeah crap). Or video/x-raw-yuv-stereo, format="I420",layout="over/under" ? </pre> </blockquote> <pre wrap=""> I meant "video/x-raw-yuv,format=S420,width=1280,height=720" for the native way that GStreamer handled stereo video. (Funny that you used S420, as that was exactly the method for mangling fourcc's that I was thinking: I420 -> S420, UYVY -> SYVY, etc.) </pre> </blockquote> What about RGB stereo formats? <blockquote cite="mid:20100525222358.GA22372@cooker.entropywave.com" type="cite"> <pre wrap="">And no "layout" property. We already have a fourcc to indicate layout (as well as other stuff, under which rug we now also sweep stereo/mono). Side-by-side and top-bottom is a misuderstanding of what "native" means. These are methods of packing two pictures into a single picture for the purposes of shoving it through software that only understands one picture. We want a system that understands *two* pictures. </pre> </blockquote> I wouldn't say that "there are methods of packing two pictures into a single picture for the purposes of shoving it through software that understands one picture."  
I think it's more about how to organise the memory used to represent those combined pictures, like planar RGB vs packed RGB, or YV16 vs YVYU and YUY2.  I agree that side-by-side and row-interleaved layouts end up being the same memory layout, so there should not be a distinction between the two.  There would only be a distinction, as you said, when you shove this combined image through software that only understands one picture.  But top-bottom (or memory consecutive, or whatever name we choose) and side-by-side (or left-right, or row-interleaved, or ...) are, in my opinion, as different as packed vs planar layouts.  Does that make sense? <blockquote cite="mid:20100525222358.GA22372@cooker.entropywave.com" type="cite"> <pre wrap="">As a data point, H.264 handles stereo by doubling the number of pictures in the stream and ordering them left/right/left/right. The closest match in a GStreamer API would be to use buffer flags, but that's gross because a) we don't have any buffer flags available unless we steal them from miniobject, b) we still would need a field (stereo=true) in the caps, which would cause compatibility issues, c) some existing elements would work fine (videoscale), others would fail horribly (videorate). The second closest match is what I recommended: format=(fourcc)S420 (as above), indicating two I420 pictures consecutive in memory. A stereo H.264 decoder can be modified to decode to these buffers easily. </pre> </blockquote> Again, this would be viable, but it would involve choosing the layout, wouldn't it? <blockquote cite="mid:20100525222358.GA22372@cooker.entropywave.com" type="cite"> <pre wrap="">On the display side, I only really have experience with X output to shutter stereo goggles using OpenGL: you upload separate pictures for right and left, and the driver flips between them. 
In this case, the packing in memory is only slightly important -- memory consective order would be easier to code, but the graphics engine could easily be programmed to handle top/bottom or side-by-side. HDMI 1.4, curiously, has support for half a dozen stereo layouts. This is a design we should strive to avoid. </pre> </blockquote> Wouldn't we want to support what is used elsewhere to improve compatibility (probably at the cost of greater complexity, I agree)? <blockquote cite="mid:20100525222358.GA22372@cooker.entropywave.com" type="cite"> <pre wrap="">On the question of "How do I handle side-by-side video": Use an element to convert to the native format. Let's call it 'videotostereo'. Thus, if you have video like on this page: <a class="moz-txt-link-freetext" href="http://www.stereomaker.net/sample/index.html">http://www.stereomaker.net/sample/index.html</a>, you would use something like: filesrc ! decodebin ! videotostereo method=side-by-side ! stereoglimagesink </pre> </blockquote> OK, from what I understand, this means that there is a file containing a normal video whose frames are actually composed of 2 images placed side-by-side, and the videotostereo plugin is informed, through the "method" property, that the incoming video has this layout.  Since we chose to put the images in the stereo stream in a memory-consecutive layout, videotostereo would "reorganise" the incoming buffer into a top-bottom outgoing buffer.  Then this buffer, having stereo caps (S420 for instance), would be sent to stereoglimagesink, which would do whatever it needs to with it.  Is that right? <blockquote cite="mid:20100525222358.GA22372@cooker.entropywave.com" type="cite"> <pre wrap="">Assuming you don't have stereo output, but want to watch one of the mono channels: filesrc ! decodebin ! videotostereo method=side-by-side ! ffmpegcolorspace ! 
xvimagesink </pre> </blockquote> The stream coming out of videotostereo would have stereo caps, so this means that ffmpegcolorspace would have to be aware of the stereo caps.  Is that what you meant?  The approach I was going to take was rather to have another plugin, something like stereotovideo, that would take a stereo stream and output it as a normal video, with whatever output layout is requested (left-right, right-left, top-bottom, bottom-top, etc.) <blockquote cite="mid:20100525222358.GA22372@cooker.entropywave.com" type="cite"> <pre wrap="">Converting the above video to red/cyan anaglyph: filesrc ! decodebin ! videotostereo method=side-by-side ! videofromstereo method=red-cyan-anaglyph ! xvimagesink </pre> </blockquote> Yeah, that's exactly what I meant (stereotovideo = videofromstereo), except that the method for videofromstereo could also be left, right, left-right, right-left, top-bottom, bottom-top (in addition to all kinds of anaglyph), instead of leaving that task to ffmpegcolorspace. Another question I had is about more than 2 buffers.  Is it too soon to talk about this?  Should we focus on stereo for now?  What about n images in a stream?  If I'm right, the two suggestions I've received for new caps are: 1) "video/x-raw-yuv-stereo , layout = { memory-consecutive , interleaved }, ..." which is the one I'm most comfortable with, or 2) "video/x-raw-yuv , format = (fourcc) S420 , ..." But what about something like: 3) "video/x-raw-yuv-multiple , channels = (int) [ 1 , MAX ] , ..." or similar?  The "-multiple" suffix might avoid problems with existing plugins.  I haven't thought a lot about this one, but I wanted to ask what you think about the possibility of packing more than one (or two) images together. Thanks, Martin </body> </html>