v4l2 mem2mem compose support?

Nicolas Dufresne nicolas at ndufresne.ca
Sat Feb 16 21:13:50 UTC 2019


On Sat., Feb. 16, 2019 at 13:40, Hans Verkuil <hverkuil at xs4all.nl> wrote:
>
> On 2/16/19 4:42 PM, Nicolas Dufresne wrote:
> > On Sat., Feb. 16, 2019 at 04:48, Hans Verkuil <hverkuil at xs4all.nl> wrote:
> >>
> >> On 2/16/19 10:42 AM, Hans Verkuil wrote:
> >>> On 2/16/19 1:16 AM, Tim Harvey wrote:
> >>>> Greetings,
> >>>>
> >>>> What is needed to be able to take advantage of hardware video
> >>>> composing capabilities and make them available in something like
> >>>> GStreamer?
> >>>
> >>> Are you talking about what is needed in a driver or what is needed in
> >>> gstreamer? Or both?
> >>>
> >>> In any case, the driver needs to support the V4L2 selection API, specifically
> >>> the compose target rectangle for the video capture.
> >>
> >> I forgot to mention that the driver should allow the compose rectangle to
> >> be anywhere within the bounding rectangle as set by S_FMT(CAPTURE).
> >>
> >> In addition, this also means that the DMA has to be able to do scatter-gather,
> >> which I believe is not the case for the imx m2m hardware.
> >
> > I believe the 2D blitter can take an arbitrary source rectangle and
> > compose it into an arbitrary destination rectangle (a lot of these
> > actually use Q16 coordinates, allowing for subpixel rectangles,
> > something that V4L2 does not support).
>
> Not entirely true. I think this can be done through the selection API,
> although it would require some updating of the spec and perhaps the
> introduction of a field or flag. The original VIDIOC_CROPCAP and VIDIOC_CROP
> ioctls actually could do this since with analog video (e.g. S-Video) you
> did not really have the concept of a 'pixel'. It's an analog waveform after
> all. In principle the selection API works in the same way, even though the
> documentation always assumes that the selection rectangles map directly onto
> the digitized pixels. I'm not sure if there are still drivers that report
> different crop bounds in CROPCAP compared to the actual number of digitized pixels.
> The bttv driver is most likely to do that, but I haven't checked.
>
> Doing so made it very hard to understand, though.
>
> > I don't think this driver exists in any
> > form upstream on the i.MX side. The Rockchip devs tried to get one in
> > recently, but the discussion didn't go so well; the rejection of
> > the proposed Porter-Duff controls was probably demotivating, as picking
> > the right blending algorithm is the basis of such a driver.
>
> I tried to find the reason why the Porter Duff control was dropped in v8
> of the rockchip RGA patch series back in 2017.
>
> I can't find any discussion about it on the mailing list, so perhaps it
> was discussed on irc.
>
> Do you remember why it was removed?

I'll try to retrace what happened. It was not a NACK, and I realize
that "rejection" wasn't the right word. If I remember correctly, the
review focused entirely on this control and on the fact that the
driver was doing blending with such an API, while the original
intention of the driver was to do CSC, so removing it was basically a
way forward.

>
> >
> > I believe a better approach for upstreaming such a driver would be to
> > write an M2M spec specific to that type of m2m driver. That spec
> > would cover scalers and rotators, since unlike the IPUv3 (which I
> > believe you are referring to) a lot of the CSC and scaler engines
> > are blitters.
>
> No, I was referring to the imx m2m driver that Phillip is working on.

I'll need to check which driver Veo Labs was using, but if it's the
same one, then maybe it only does source-over operations using the
SELECTION API as you described. If I remember their use case
correctly, they were doing simple source-over blending of two video
feeds.

Could it be this?
https://gitlab.com/veo-labs/linux/tree/veobox/drivers/staging/media/imx6/m2m
Is it an ancestor of Philipp's driver?

>
> >
> > The reason we need a spec is that, unlike most of our current
> > drivers, the buffers passed to CAPTURE aren't always empty buffers.
> > This may lead to implementations that are ambiguous under the
> > current spec. The second reason is to avoid having to deal with
> > legacy implementations, like we have with decoders.
>
> Why would this be ambiguous? A driver that can do this can set the
> COMPOSE rectangle for the CAPTURE queue basically anywhere within the
> buffer and V4L2_SEL_TGT_COMPOSE_PADDED either does not exist or is
> equal to the COMPOSE rectangle.
>
> A driver that isn't able to do scatter-gather DMA will overwrite pixels,
> and so COMPOSE_PADDED will be larger than the COMPOSE rectangle and
> thus such a driver cannot be used for composing into a buffer that already
> contains video data.
>
> I might misunderstand you, though.

Slightly. I said that there MAY be ambiguity; I haven't gone through
everything to figure it out yet. But we can also consider that it
needs a spec simply because of the lack of classification.

I'm not commenting on the technical aspects of the data transfer, as
I don't know these things too well. What I'm guessing, though, is
that you aren't sure whether this can be done "in-place", meaning
that the final image and the image we compose onto can be the same
memory buffer. If not, I suppose we might not be able to do a blend
operation with the current API, as it would require 3 buffers (hence
3 queues), with a full write into the final buffer.

I've seen attempts at passing two OUTPUT buffers and getting one
CAPTURE buffer that would be the composition, but that approach is
too inflexible: the source and destination buffers would need to have
the same size. This is a better fit for deinterlacing, I suppose.

>
> Regards,
>
>         Hans
>
> >
> >>
> >> Regards,
> >>
> >>         Hans
> >>
> >>>
> >>> Regards,
> >>>
> >>>       Hans
> >>>
> >>>>
> >>>> Philipp's mem2mem driver [1] exposes the IMX IC and GStreamer's
> >>>> v4l2convert element uses this nicely for hardware accelerated
> >>>> scaling/csc/flip/rotate but what I'm looking for is something that
> >>>> extends that concept and allows for composing frames from multiple
> >>>> video capture devices into a single memory buffer which could then be
> >>>> encoded as a single stream.
> >>>>
> >>>> This was made possible by Carlo's gstreamer-imx [2] GStreamer plugins
> >>>> paired with the Freescale kernel that had some non-mainlined API's to
> >>>> the IMX IPU and GPU. We have used this to take for example 8x analog
> >>>> capture inputs, compose them into a single frame then H264 encode and
> >>>> stream it. The gstreamer-imx elements used fairly compatible
> >>>> properties as the GstCompositorPad element to provide a destination
> >>>> rect within the compose output buffer as well as rotation/flip, alpha
> >>>> blending and the ability to specify background fill.
> >>>>
> >>>> Is it possible that some of this capability might be available today
> >>>> with the opengl GStreamer elements?
> >>>>
> >>>> Best Regards,
> >>>>
> >>>> Tim
> >>>>
> >>>> [1] https://patchwork.kernel.org/patch/10768463/
> >>>> [2] https://github.com/Freescale/gstreamer-imx
> >>>>
> >>>
> >>
>

