[gst-devel] proposal: support for row-stride in gstreamer
Rob Clark
rob at ti.com
Fri Jul 3 01:43:17 CEST 2009
Hi gstreamer folks,
The following is a proposal for how to add row-stride (and possibly
some related changes) to gstreamer. I gave a couple of possible
examples of where this would be useful, but it is probably not
exhaustive. Please let me know if you see any cases that I missed, or
details that I overlooked, etc.
Use-cases:
----------
+ display hardware with special constraints on image dimensions, for
  example if the output buffer must have dimensions that are a power
  of two
+ zero-copy cropping of an image / video frame (at least for
  interleaved color formats.. more on this later)
One example to think about is rendering onto a 3d surface. In some
cases, graphics hardware could require that the surface dimensions are
a power of 2. In this case, you would want the vsink to allocate a
buffer with a rowstride that is the next larger power of 2 from the
image width.
Another example to think about is video stabilization. In this use
case, you would ask the camera to capture an oversized frame. Your
vstab algorithm would calculate an x,y offset of the stabilized
image. But if the downstream encoder understands rowstride, you do
not need to actually copy the image buffer. Say, just to pick some
numbers, you want your final output to be 640x480, and you want your
oversized frame to be +20% in each dimension (768x576):
+--------+ +-------+ +------+
| camera |---------->| vstab |---------->| venc |
+--------+ width=768 +-------+ width=640 +------+
height=576 height=480
rowstride=768 rowstride=768
In the case of an interleaved color format (RGB, UYVY, etc), you could
simply advance the 'data' pointer in the buffer by
(y*rowstride) + (x*bytes_per_pixel). No memcpy() required. As long as
the video encoder respects the rowstride, it will see the stabilized
frame correctly.
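The zero-copy crop described above can be sketched in C as follows.
This is illustrative only (a plain struct standing in for GstBuffer;
the names are hypothetical), and it assumes rowstride is counted in
bytes, so the x term is scaled by bytes-per-pixel:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch (not GStreamer API): zero-copy crop of an
 * interleaved-format frame by moving the data pointer.  The buffer
 * keeps its original rowstride; only the start pointer and the
 * visible width/height change. */
typedef struct {
    uint8_t *data;      /* points at the first visible pixel */
    int      width;     /* visible width, in pixels */
    int      height;    /* visible height, in pixels */
    int      rowstride; /* bytes from one row to the next */
    int      bpp;       /* bytes per pixel (2 for UYVY) */
} Frame;

/* Crop to a w x h window at (x, y): pure pointer arithmetic, no memcpy. */
static void frame_crop(Frame *f, int x, int y, int w, int h)
{
    f->data += (size_t)y * f->rowstride + (size_t)x * f->bpp;
    f->width  = w;
    f->height = h;
    /* rowstride is deliberately left unchanged: downstream elements
     * must step by rowstride, not width * bpp, to reach the next row. */
}
```

With the numbers from the diagram (UYVY at 2 bytes/pixel, so a
1536-byte stride), the vstab offset becomes a single pointer
adjustment and the encoder sees a 640x480 frame inside the 768x576
allocation.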
Proposal:
---------
In all cases that I can think of, the row-stride will not change
dynamically, so this parameter can be negotiated through caps
negotiation in the same way as image width/height, colorformat, etc.
However, we need to know conclusively that there is no element in the
pipeline that cares about the image format but does not understand
"rowstride", so we cannot use existing type strings (ex. "video/x-raw-
yuv"). And, at least in the cases that I can think of, the video sink
will dictate the row-stride, so upstream caps-renegotiation will be
used to arrive at the final "rowstride" value.
For media types, I propose to continue using existing strings for non-
stride-aware element caps, ex. "video/x-raw-yuv". For stride-aware
elements, they can support a second format, ex. "video/x-raw-yuv-
strided", "image/x-raw-rgb-strided", etc (ie. append "-strided" to
whatever the existing string is). In the case that a strided format
is negotiated, the final negotiated caps must also contain a
"rowstride" entry.
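For concreteness, the final caps negotiated in the vstab example above
might look something like this (hypothetical; the exact field name,
and whether rowstride is counted in bytes or pixels, is part of what
needs to be agreed):

```
video/x-raw-yuv-strided, format=(fourcc)UYVY, width=(int)640,
    height=(int)480, rowstride=(int)1536
```

(1536 = 768 * 2 bytes per UYVY pixel, assuming a byte rowstride.)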
Question: in general, most elements supporting rowstride will have no
constraint on what particular rowstride values are supported. Do they
just list "rowstride=[0-4294967295]" in their caps template? The
video sink allocating the buffer will likely have some constraints on
rowstride, although this will be a function of the width (for example,
round the width up to next power of two).
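For the power-of-two example, the sink's rowstride-from-width function
could be as simple as the following sketch (illustrative only, not an
actual GStreamer API):

```c
#include <assert.h>

/* Illustrative only: a sink requiring power-of-two surfaces could
 * derive its rowstride constraint from the negotiated width by
 * rounding up to the next power of two. */
static unsigned next_pow2(unsigned v)
{
    unsigned p = 1;
    while (p < v)
        p <<= 1;
    return p;
}
```

So a negotiated width of 640 (or the oversized 768) would both round
up to a rowstride of 1024 pixels on such hardware.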
We will implement some sort of GstRowStrideTransform element to
interface between stride-aware and non-stride-aware elements.
Non-Interleaved Color Formats:
------------------------------
So, everything I've said so far about zero-copy cropping works until
you start considering planar/semi-planar color formats which do not
have equal size planes. For example, consider NV12: the offset to
add to the Y plane is (y*rowstride)+x, but the offset to add to the
UV plane is ((y/2)*rowstride)+x. There are only three ways that I can
think of to deal with this (listed in order of my preference):
1) add fields to GstBuffer to pass additional pointers to the other
   color planes within the same GstBuffer
2) add a field(s) to GstBuffer to pass an offset.. either an x,y
   offset, or a single value that is (y*rowstride)+x. Either way,
   the various elements in the pipeline can use this to calculate
   the start of the individual planes of data.
3) pass individual planes of a single image as separate GstBuffer's..
   but I'm not a huge fan of this because now every element needs to
   have some sort of "I've got Y, but I'm waiting for UV" state.
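To make the arithmetic in option 2 concrete, here is a minimal sketch
(names are hypothetical, not GStreamer API) of how an element could
derive both NV12 plane pointers from one buffer plus an x,y crop
offset, given that NV12 uses the same rowstride for both planes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch for option 2: given one contiguous NV12
 * allocation (Y plane first, then interleaved UV) plus an (x, y)
 * crop offset, each element derives the per-plane start pointers. */
typedef struct {
    uint8_t *y;   /* start of the cropped Y plane */
    uint8_t *uv;  /* start of the cropped interleaved UV plane */
} Nv12Planes;

/* data: start of the full NV12 allocation.
 * rowstride: bytes per row (the same for both planes in NV12).
 * full_height: allocated frame height in rows.
 * x, y: crop offset; x must be even so we land on a U byte. */
static Nv12Planes nv12_plane_pointers(uint8_t *data, int rowstride,
                                      int full_height, int x, int y)
{
    Nv12Planes p;
    uint8_t *uv_base = data + (size_t)rowstride * full_height;

    p.y  = data + (size_t)y * rowstride + x;
    /* The UV plane is half height; one UV byte pair covers two pixels
     * horizontally, so the byte offset of pixel column x is still x. */
    p.uv = uv_base + (size_t)(y / 2) * rowstride + x;
    return p;
}
```

The point of option 1 vs option 2 is just where this math lives: with
option 1 the producer runs it once and stores the pointers in the
GstBuffer; with option 2 every consumer runs it from the offset.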
I'm not sure if anyone has any thoughts about which of these three
approaches is preferred. Or any alternative ideas?
BR,
-Rob