[Mesa-dev] [PATCH] vl/dri3: handle the case of different GPU

Christian König deathsimple at vodafone.de
Thu Sep 8 08:59:52 UTC 2016


Am 08.09.2016 um 10:42 schrieb Michel Dänzer:
> On 08/09/16 05:05 PM, Christian König wrote:
>> Am 08.09.2016 um 08:23 schrieb Michel Dänzer:
>>> On 08/09/16 01:13 PM, Nayan Deshmukh wrote:
>>>> On Thu, Sep 8, 2016 at 9:03 AM, Michel Dänzer <michel at daenzer.net> wrote:
>>>>       On 08/09/16 02:48 AM, Nayan Deshmukh wrote:
>>>>       > use a linear buffer in case of back buffer
>>>>       >
>>>>       > Signed-off-by: Nayan Deshmukh <nayan26deshmukh at gmail.com>
>>>>
>>>>       [...]
>>>>
>>>>       > @@ -226,8 +227,13 @@ dri3_alloc_back_buffer(struct vl_dri3_screen *scrn)
>>>>       >        goto close_fd;
>>>>       >
>>>>       >     memset(&templ, 0, sizeof(templ));
>>>>       > +   if (scrn->is_different_gpu)
>>>>       > +   templ.bind = PIPE_BIND_RENDER_TARGET | PIPE_BIND_SAMPLER_VIEW |
>>>>       > +                PIPE_BIND_SCANOUT | PIPE_BIND_SHARED | PIPE_BIND_LINEAR;
>>>>       > +   else
>>>>       >     templ.bind = PIPE_BIND_RENDER_TARGET | PIPE_BIND_SAMPLER_VIEW |
>>>>       >                  PIPE_BIND_SCANOUT | PIPE_BIND_SHARED;
>>>>
>>>>       The indentation is wrong. Also, it would be better to make it
>>>>       something like this:
>>>>
>>>>          templ.bind = PIPE_BIND_RENDER_TARGET | PIPE_BIND_SAMPLER_VIEW |
>>>>                       PIPE_BIND_SCANOUT | PIPE_BIND_SHARED;
>>>>          if (scrn->is_different_gpu)
>>>>             templ.bind |= PIPE_BIND_LINEAR;
>>>>
>>>>
>>>>       However, as we discussed before, for various reasons it would
>>>>       probably be better to create separate linear buffers instead of
>>>>       making all buffers linear.
>>>>
>>>> So should I maintain a single linear buffer and copy the back buffer to
>>>> it before sending it via the present extension?
>>> It's better to create one linear buffer corresponding to each non-linear
>>> buffer with contents to be presented. Otherwise the rendering GPU may
>>> overwrite the linear buffer contents while the presentation GPU is still
>>> reading from it, resulting in tearing-like artifacts.
>> That approach isn't necessary. VDPAU has functions to query whether an
>> output surface is still being displayed.
>>
>> If the application starts to render into a buffer while it is still
>> being displayed tearing-like artifacts are the expected result.
> You're talking about the buffers exposed to applications via VDPAU. I
> was talking about using a single separate linear buffer which would be
> used for presentation of all VDPAU buffers. There's no way for the
> application to know when that's idle.

Ok, yes that makes more sense.

>
>> In addition to that, I made the VDPAU output surfaces linear a while ago
>> anyway, because it turned out that tiling actually wasn't beneficial in
>> this use case (a single quad rendered over the whole texture).
> That's fine as long as the buffers are in VRAM, but when they're pinned
> to GTT for sharing between GPUs, rendering to them with the 3D engine
> results in bad PCIe bandwidth utilization, as Marek explained recently.
> So even if the original buffers are already linear, it's better to keep
> those in VRAM and use separate buffers for sharing between GPUs.
>
Mhm, at least for VDPAU most composition should happen on temporary 
buffers anyway when any filters are enabled.

The problem only occurs when the compositor renders directly into the 
output buffer. In that case you can have up to 16 layers of rendering 
into that buffer.

Anyway, I would clearly suggest handling that in the VDPAU state tracker 
and not in the DRI3 code, because the handling needed seems to be 
different for VA-API, and I would really like to avoid any additional 
copy for 4K playback.

Regards,
Christian.

