[gst-devel] proposal: support for row-stride in gstreamer

Wed Jul 22 03:44:26 CEST 2009

btw, I noticed the proposal about GstBuffer metadata:

  <http://cgit.freedesktop.org/gstreamer/gstreamer/tree/docs/design/draft-buffer2.txt>

this provides a nice way to handle x,y vstab coordinates mentioned  
below.

In the cases that I can think of, rowstride would not change from
buffer to buffer, so I would propose to handle this in caps
negotiation, as proposed originally.  But I would like to use
per-buffer metadata to handle the per-buffer cropping/panning.

Comments?

BR,
-R

On Jul 2, 2009, at 6:43 PM, Clark, Rob wrote:

> Hi gstreamer folks,
>
> The following is a proposal for how to add row-stride (and possibly
> some related changes) to gstreamer.  I gave a couple of possible
> examples of where this would be useful, but it is probably not
> exhaustive.  Please let me know if you see any cases that I missed, or
> details that I overlooked, etc.
>
>
>
> Use-cases:
> ----------
>  + display hardware with special constraints on image dimensions, for
>    example if the output buffer must have dimensions that are a power
>    of two
>  + zero-copy cropping of image / video frame (at least for interleaved
>    color formats.. more on this later)
>
> One example to think about is rendering onto a 3d surface.  In some
> cases, graphics hardware could require that the surface dimensions are
> a power of 2.  In this case, you would want the vsink to allocate a
> buffer with a rowstride that is the next larger power of 2 from the
> image width.
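
As a rough illustration of the power-of-two case, here is a minimal sketch of how a sink might derive its rowstride from the image width.  The helper names are hypothetical (they are not GStreamer API), and the sketch assumes rowstride is counted in bytes and that a width which is already a power of two is kept as-is:

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the nearest power of two (v itself if it already is one).
 * Valid for 1 <= v <= 2^31. */
static uint32_t round_up_pow2(uint32_t v)
{
    assert(v >= 1 && v <= 0x80000000u);
    uint32_t p = 1;
    while (p < v)
        p <<= 1;
    return p;
}

/* Hypothetical helper: rowstride in bytes for a sink whose rows must span
 * a power-of-two number of pixels.  The limits below are arbitrary and
 * only keep the multiplication from overflowing 32 bits. */
static uint32_t pow2_rowstride_bytes(uint32_t width_px, uint32_t bytes_per_pixel)
{
    assert(width_px >= 1 && width_px <= (1u << 24));
    assert(bytes_per_pixel >= 1 && bytes_per_pixel <= 8);
    return round_up_pow2(width_px) * bytes_per_pixel;
}
```

For a 640 pixel wide UYVY image (2 bytes per pixel) this gives rows of 1024 pixels, i.e. a rowstride of 2048 bytes.
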
>
>
> Another example to think about is video stabilization.  In this use
> case, you would ask the camera to capture an oversized frame.  Your
> vstab algorithm would calculate an x,y offset of the stabilized
> image.  But if the encoder understands rowstride, you do not need to
> actually copy the image buffer.  Say, just to pick some numbers, you
> want your final output to be 640x480, and you want your oversized
> frame to be +20% in each dimension (768x576):
>
>    +--------+           +-------+           +------+
>    | camera |---------->| vstab |---------->| venc |
>    +--------+ width=768 +-------+ width=640 +------+
>              height=576          height=480
>           rowstride=768       rowstride=768
>
> In the case of an interleaved color format (RGB, UYVY, etc), you could
> simply increment the 'data' pointer in the buffer by (y*rowstride)+x
> (with x scaled by the bytes per pixel if rowstride is counted in
> bytes).  No memcpy() required.  As long as the video encoder respects
> the rowstride, it will see the stabilized frame correctly.
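
The zero-copy crop for an interleaved format could be sketched roughly as follows.  PackedFrame and crop_packed() are made-up names for illustration, not GStreamer types, and the sketch assumes rowstride is in bytes and that every pixel occupies a whole number of bytes (for UYVY, x should additionally be even so that the crop starts on a macropixel boundary):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a packed (interleaved) frame. */
typedef struct {
    uint8_t *data;            /* first byte of the top-left visible pixel */
    size_t   width;           /* visible width in pixels */
    size_t   height;          /* visible height in rows */
    size_t   rowstride;       /* bytes from the start of one row to the next */
    size_t   bytes_per_pixel; /* e.g. 2 for UYVY, 3 for packed RGB */
} PackedFrame;

/* Return a view of the w x h window at (x, y) inside src.  No pixel data
 * is copied: only the start pointer and visible size change, and the
 * rowstride stays that of the oversized frame. */
static PackedFrame crop_packed(PackedFrame src, size_t x, size_t y,
                               size_t w, size_t h)
{
    assert(x + w <= src.width && y + h <= src.height);
    PackedFrame out = src;
    out.data   = src.data + y * src.rowstride + x * src.bytes_per_pixel;
    out.width  = w;
    out.height = h;
    return out;
}
```

With the numbers above, cropping a 640x480 window out of the 768x576 capture leaves the rowstride untouched, which is why the downstream element has to honour it.
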
>
>
>
> Proposal:
> ---------
>
> In all cases that I can think of, the row-stride will not be changing
> dynamically.  So this parameter can be negotiated thru caps
> negotiation in the same way as image width/height, colorformat, etc.
> However, we need to know conclusively that there is no element in the
> pipeline that cares about the image format, but does not understand
> "rowstride", so we cannot use existing type strings (ex.
> "video/x-raw-yuv").  And, at least in the cases that I can think of,
> the video sink will dictate the row-stride.  So upstream
> caps-renegotiation will be used to arrive at the final "rowstride"
> value.
>
> For media types, I propose to continue using existing strings for
> non-stride-aware element caps, ex. "video/x-raw-yuv".  Stride-aware
> elements can support a second format, ex. "video/x-raw-yuv-strided",
> "video/x-raw-rgb-strided", etc (ie. append "-strided" to whatever the
> existing string is).  When a strided format is negotiated, the final
> negotiated caps must also contain a "rowstride" entry.
>
> question: in general, most elements supporting rowstride will have no
> constraint on what particular rowstride values are supported.  Do they
> just list "rowstride=[0-4294967295]" in their caps template?  The
> video sink allocating the buffer will likely have some constraints on
> rowstride, although this will be a function of the width (for example,
> round the width up to next power of two).
>
> We will implement some sort of GstRowStrideTransform element to
> interface between stride-aware and non-stride-aware elements.
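
The core of such a transform element would presumably be a row-by-row copy between two buffers with different strides.  A minimal sketch, assuming a single packed plane and strides in bytes (copy_rows() is a hypothetical name, and none of the element boilerplate is shown):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `rows` rows of `row_bytes` visible bytes each from a buffer with
 * stride src_stride into a buffer with stride dst_stride.  Padding bytes
 * at the end of each row are neither read nor written. */
static void copy_rows(uint8_t *dst, size_t dst_stride,
                      const uint8_t *src, size_t src_stride,
                      size_t row_bytes, size_t rows)
{
    assert(row_bytes <= dst_stride && row_bytes <= src_stride);
    for (size_t r = 0; r < rows; r++)
        memcpy(dst + r * dst_stride, src + r * src_stride, row_bytes);
}
```

Setting dst_stride equal to row_bytes converts strided to tightly packed; setting src_stride equal to row_bytes goes the other way.
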
>
>
>
>
> Non-Interleaved Color Formats:
> ------------------------------
>
> So, everything I've said so far about zero-copy cropping works until
> you start considering planar/semi-planar color formats which do not
> have equal size planes.  For example, consider NV12:  the offset to
> add to the Y plane is (y*rowstride)+x, but the offset to add to UV
> plane is ((y/2)*rowstride)+x.  There are only three ways that I can
> think of to deal with this (listed in order of my preference):
>
>  1) add fields to GstBuffer to pass additional pointers to the other
>     color planes within the same GstBuffer
>  2) add a field(s) to GstBuffer to pass an offset..  either an x,y
>     offset, or a single value that is (y*rowstride)+x.  Either way, the
>     various elements in the pipeline can use this to calculate the
>     start of the individual planes of data.
>  3) pass individual planes of a single image as separate GstBuffers..
>     but I'm not a huge fan of this because now every element needs to
>     have some sort of "I've got Y, but I'm waiting for UV" state.
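
To make the NV12 problem concrete, here is a minimal sketch of how an element could derive both plane offsets from an x,y crop origin, along the lines of option 2.  It assumes a layout the proposal does not spell out: the interleaved UV plane directly follows the Y plane, both planes share one rowstride counted in bytes, and x and y are even.  The names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Byte offsets, from the start of the buffer, of the first visible
 * sample in each NV12 plane. */
typedef struct {
    size_t y_offset;
    size_t uv_offset;
} Nv12CropOffsets;

/* full_height is the height of the uncropped frame; it locates the start
 * of the UV plane under the layout assumed here. */
static Nv12CropOffsets nv12_crop_offsets(size_t rowstride, size_t full_height,
                                         size_t x, size_t y)
{
    assert(x % 2 == 0 && y % 2 == 0);
    Nv12CropOffsets o;
    /* Y plane: one byte per pixel. */
    o.y_offset = y * rowstride + x;
    /* UV plane: half as many rows, and one U,V byte pair per two pixels,
     * so the horizontal byte offset is again x. */
    o.uv_offset = rowstride * full_height + (y / 2) * rowstride + x;
    return o;
}
```

Because the two offsets differ by more than a constant, a single adjusted data pointer cannot describe the cropped frame, which is what motivates the options above.
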
>
> I'm not sure if anyone has any thoughts about which of these three
> approaches is preferred.  Or any alternative ideas?
>
>
>
> BR,
> -Rob
>