[Mesa-dev] [PATCH 00/14] Patches for VA-API State Tracker Postproc

Leo Liu leo.liu at amd.com
Thu Sep 21 11:20:19 UTC 2017



On 09/21/2017 03:03 AM, Mark Thompson wrote:
> On 21/09/17 03:17, Leo Liu wrote:
>> On 09/20/2017 06:11 PM, Mark Thompson wrote:
>>> On 19/09/17 20:04, Leo Liu wrote:
>>>> This series is for VA-API State Tracker Postproc, including:
>>>>
>>>> Deinterlacing I video for transcode;
>>>> Scaling support in postproc for transcode;
>>>> Frame grabber in postproc
>>>>
>>>> Thanks Andy Furniss <adf.lists at gmail.com> for lots of testing on these.
>>>>
>>>> Leo Liu (14):
>>>>     st/va/postproc: use video original size for postprocessing
>>>>     vl/compositor: separate YUV part from shader video buffer function
>>>>     vl/compositor: extend YUV deint function to do field deint
>>>>     vl/compositor: add a new function for YUV deint
>>>>     st/omx: use new vl_compositor_yuv_deint_full() to deint
>>>>     st/va: use new vl_compositor_yuv_deint_full() to deint
>>>>     vl/compositor: remove vl_compositor_yuv_deint() function
>>>>     vl/compositor: add Bob top and bottom to YUV deint function
>>>>     st/va/postproc: add a full NV12 deint support from buffer I to P
>>>>     st/va: make internal func vlVaHandleSurfaceAllocate() call simpler
>>>>     st/va/postproc: use progressive target buffer for scaling
>>>>     vl/compositor: create RGB to YUV fragment shader
>>>>     vl/compositor: convert RGB buffer to YUV with color conversion
>>>>     st/va/postproc: implement the DRM prime grabber
>>>>
>>>>    src/gallium/auxiliary/vl/vl_compositor.c          | 263 +++++++++++++++++-----
>>>>    src/gallium/auxiliary/vl/vl_compositor.h          |  50 +++-
>>>>    src/gallium/state_trackers/omx_bellagio/vid_dec.c |  11 +-
>>>>    src/gallium/state_trackers/va/picture.c           |  16 +-
>>>>    src/gallium/state_trackers/va/postproc.c          |  69 +++++-
>>>>    src/gallium/state_trackers/va/surface.c           |   7 +-
>>>>    src/gallium/state_trackers/va/va_private.h        |   2 +-
>>>>    7 files changed, 331 insertions(+), 87 deletions(-)
>>>>
>>> Looks good for import from a bit of testing so far (with the update today).
>>>
>>>
>>> Something funny going on with RGB upload cases?  With ffmpeg:
>>>
>>> ./ffmpeg_g -y -i in.mp4 -an -vaapi_device /dev/dri/renderD129 -vf format=bgr0,hwupload,scale_vaapi=w=1920:h=1080:format=nv12 -c:v h264_vaapi -profile:v 578 -bf 0 out.mp4
>>>
>>> it crashes a few lines into copying to the image.
>>>
>>> The mapping in vlVaMapBuffer() looks like:
>>>
>>> (gdb) p *buf->derived_surface.resource
>>> $9 = {reference = {count = 5}, screen = 0x555557829010, width0 = 1920, height0 = 1088, depth0 = 1, array_size = 1, format = PIPE_FORMAT_B8G8R8X8_UNORM, target = PIPE_TEXTURE_2D, last_level = 0, nr_samples = 0, usage = 0, bind = 2097152, flags = 0, next = 0x0}
>>> (gdb) p *buf->derived_surface.transfer
>>> $8 = {resource = 0x555557d8e2c0, level = 0, usage = PIPE_TRANSFER_WRITE, box = {x = 0, y = 0, z = 0, width = 1920, height = 1, depth = 1}, stride = 7680, layer_stride = 7680}
>>>
>>> height = 1 looks suspicious, like it's only mapping the first line?
>> Looks like the command line crashed at some point before it would get to the RGB->YUV conversion in postproc.
> I'm not quite understanding what you mean.  Do you crash at a different point rather than in the copy after mapping the image to upload to?  Backtrace?
I haven't tried your command yet, but I know it won't work. If we want 
to scale raw RGB in postproc, the raw data has to be in an RGB surface.

Currently the only RGB case we support in vaCreateSurface is passing a 
dma-buf handle (see the sketch below); allocating an RGB surface is not 
supported.

Even if we were lucky enough that the command line went through, it 
would put RGB data into an NV12 surface (the driver should explicitly 
return an invalid surface for this case).
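
For reference, here is roughly what that dma-buf path looks like from 
the application side. This is only a sketch built on the libva 
external-buffer attributes; dpy, drm_fd, width, height and pitch are 
placeholders for whatever the exporting application already has:

#include <va/va.h>

/* Sketch only: import an RGB dma-buf as a VA surface.  The fd, size and
 * pitch are assumed to come from whoever exported the buffer. */
static VASurfaceID import_rgb_dmabuf(VADisplay dpy, int drm_fd,
                                     unsigned width, unsigned height,
                                     unsigned pitch)
{
    uintptr_t handle = (uintptr_t)drm_fd;

    VASurfaceAttribExternalBuffers ext = {0};
    ext.pixel_format = VA_FOURCC_BGRX;
    ext.width = width;
    ext.height = height;
    ext.data_size = pitch * height;
    ext.num_planes = 1;
    ext.pitches[0] = pitch;
    ext.buffers = &handle;
    ext.num_buffers = 1;

    VASurfaceAttrib attribs[2];
    attribs[0].type = VASurfaceAttribMemoryType;
    attribs[0].flags = VA_SURFACE_ATTRIB_SETTABLE;
    attribs[0].value.type = VAGenericValueTypeInteger;
    attribs[0].value.value.i = VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME;

    attribs[1].type = VASurfaceAttribExternalBufferDescriptor;
    attribs[1].flags = VA_SURFACE_ATTRIB_SETTABLE;
    attribs[1].value.type = VAGenericValueTypePointer;
    attribs[1].value.value.p = &ext;

    VASurfaceID surface = VA_INVALID_SURFACE;
    if (vaCreateSurfaces(dpy, VA_RT_FORMAT_RGB32, width, height,
                         &surface, 1, attribs, 2) != VA_STATUS_SUCCESS)
        return VA_INVALID_SURFACE;
    return surface;
}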


>>> A general question for the whole driver: why are surfaces interlaced by default?
>> I think it's what the firmware prefers, and interlaced surfaces are also good for deinterlacing.
> Can you be more specific?

Take a look at rvid_get_video_param() in radeon_video.c; it will tell 
you which interlaced formats the HW supports and prefers.
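
Concretely, the question the state tracker asks the driver looks roughly 
like this (only a sketch; the profile and entrypoint arguments are 
placeholders for whatever the state tracker has at hand):

#include <stdbool.h>
#include "pipe/p_screen.h"
#include "pipe/p_defines.h"
#include "pipe/p_video_enums.h"

/* Sketch: ask the driver whether it supports and/or prefers interlaced
 * video buffers; for radeon these queries end up in rvid_get_video_param(). */
static bool
driver_wants_interlaced(struct pipe_screen *pscreen,
                        enum pipe_video_profile profile,
                        enum pipe_video_entrypoint entrypoint)
{
   bool supports =
      pscreen->get_video_param(pscreen, profile, entrypoint,
                               PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
   bool prefers =
      pscreen->get_video_param(pscreen, profile, entrypoint,
                               PIPE_VIDEO_CAP_PREFERS_INTERLACED);

   return supports && prefers;
}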

> I agree that it is required for deinterlacing, but that isn't a particularly common case and will only become less so with time.  E.g. is it somehow better to decode even progressive video to interlaced frames?  That seems like it would have significantly worse locality of reference to me, but maybe the hardware does something special.
>
>>> I may be getting some things wrong here, but the relevant components which deal with surfaces that I see are:
>>> * Decoder: can write either format, the stream type doesn't seem to matter (?).
>> Normally, the HW decoder writes to NV12 or P016, and for MJPEG it can do YUYV as well. The stream types depend on which codecs the HW supports.
> All in interlaced and progressive forms?  I didn't consider it earlier, but the H.265 decoder seems to always produce progressive for me.

Again it depends on the HW: the state tracker queries the driver for 
what it supports and prefers, and then decides how to create the surface.
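
Roughly, that decision then ends up looking like this (again only a 
sketch with placeholder names, not the exact st/va code; other template 
fields are omitted):

#include "pipe/p_context.h"
#include "pipe/p_screen.h"
#include "pipe/p_defines.h"
#include "pipe/p_format.h"
#include "pipe/p_video_enums.h"
#include "pipe/p_video_codec.h"

/* Sketch: create a video buffer, choosing interlaced vs. progressive
 * layout based on what the driver reports it prefers. */
static struct pipe_video_buffer *
create_output_surface(struct pipe_context *pipe, struct pipe_screen *pscreen,
                      unsigned width, unsigned height)
{
   struct pipe_video_buffer templat = {0};

   templat.buffer_format = PIPE_FORMAT_NV12;
   templat.width = width;
   templat.height = height;
   templat.interlaced =
      pscreen->get_video_param(pscreen, PIPE_VIDEO_PROFILE_UNKNOWN,
                               PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
                               PIPE_VIDEO_CAP_PREFERS_INTERLACED);

   return pipe->create_video_buffer(pipe, &templat);
}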


Regards,
Leo


>
>>> * Encoder: can only accept progressive surfaces.
>>> * Deinterlacer: only works on interlaced surfaces (?).
>> Yes, if you would like to get a good picture with 'deinterlace_vaapi=mode=3'
>>> * Scaler: can work on either.
>>> * Import: will pretty much always be progressive unless forced not to be (no one is going to make the interlaced format externally unless they get forced into it).
>> If the import usage is for the encoder, it has to be progressive,
> Typically it isn't directly for the encoder because few things directly produce the necessary formats - a postproc step for some colour-conversion is very likely to happen first.
>>> * Export: works for either, but interlaced is likely much harder for others to use.
> For some use for playback, see <https://github.com/01org/libva/pull/85>, <https://lists.freedesktop.org/archives/mesa-dev/2017-September/169953.html>.  The mpv use-case really wants just the native format without any copying (to not pointlessly waste resources); if that has to be the interlaced form then that should be ok (though I don't know if other projects might disagree on that one), but it would need changes to the libva API to be able to express it.
>
>>> The current copy cases (progressive -> interlaced) with the interlaced default, then, are after most imports and on encode, and it was suggested that we want to do it before export as well.  If surfaces were instead progressive by default then I think the only copy necessary would be before the deinterlacer.  Since most streams nowadays are progressive (including all >1080-line high-resolution streams, where performance matters most), and that proportion is only going to increase, improving other parts for a decrease in performance of the deinterlacer seems like a pretty good tradeoff to me.
>>>
>>> Thoughts?
> Thanks,
>
> - Mark


