[Mesa-dev] [PATCH 09/10] st/vdpau: implement the new DMA-buf based interop

Wed Sep 7 16:59:54 UTC 2016

On Wed, Sep 7, 2016 at 6:23 PM, Christian König <deathsimple at vodafone.de> wrote:
> Am 07.09.2016 um 18:06 schrieb Marek Olšák:
>>
>> On Wed, Sep 7, 2016 at 5:36 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>>>
>>> On Wed, Sep 7, 2016 at 4:08 AM, Michel Dänzer <michel at daenzer.net> wrote:
>>>>
>>>> On 07/09/16 04:19 AM, Christian König wrote:
>>>>>
>>>>> Am 06.09.2016 um 21:05 schrieb Ilia Mirkin:
>>>>>>
>>>>>> On Tue, Sep 6, 2016 at 2:22 PM, Christian König
>>>>>> <deathsimple at vodafone.de> wrote:
>>>>>>>
>>>>>>> Am 06.09.2016 um 16:23 schrieb Ilia Mirkin:
>>>>>>>>
>>>>>>>> On Mon, Sep 5, 2016 at 2:48 AM, Michel Dänzer <michel at daenzer.net>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> On 05/09/16 04:37 AM, Ilia Mirkin wrote:
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 8, 2016 at 7:21 AM, Christian König
>>>>>>>>>> <deathsimple at vodafone.de> wrote:
>>>>>>>>>>>
>>>>>>>>>>> @@ -80,7 +82,7 @@ vlVdpOutputSurfaceCreate(VdpDevice device,
>>>>>>>>>>>        res_tmpl.depth0 = 1;
>>>>>>>>>>>        res_tmpl.array_size = 1;
>>>>>>>>>>>        res_tmpl.bind = PIPE_BIND_SAMPLER_VIEW | PIPE_BIND_RENDER_TARGET |
>>>>>>>>>>> -                   PIPE_BIND_LINEAR;
>>>>>>>>>>> +                   PIPE_BIND_LINEAR | PIPE_BIND_SHARED;
>>>>>>>>>>
>>>>>>>>>> Hi Christian,
>>>>>>>>>>
>>>>>>>>>> This change appears to have semi-broken vdpau on nouveau. Whenever I
>>>>>>>>>> flip on the OSD in mplayer, the rendering becomes *extremely* slow.
>>>>>>>>>> However regular up-scaling without the OSD is plenty fast. This
>>>>>>>>>> effectively is forcing the output surfaces to live in GART instead of
>>>>>>>>>> VRAM.
>>>>>>>>>
>>>>>>>>> Strictly speaking, they'd only need to be forced to GART while they're
>>>>>>>>> actually being shared between different GPUs. That's how it works with
>>>>>>>>> the amdgpu and radeon kernel drivers.
>>>>>>>>
>>>>>>>> Any suggestions on how to handle this? Perhaps reallocate + copy the
>>>>>>>> surface in st/vdpau when actual dmabuf sharing is requested?
>>>>>>>>
>>>>>>>> To be clear - with this change, vdpau with nouveau is unusable in the
>>>>>>>> presence of an OSD in mplayer. The OSD comes up whenever you seek
>>>>>>>> around in the video, so in effect, it's unusable. Used to work great.
>>>>>>>
>>>>>>> Well, I think you should clearly figure out why adding PIPE_BIND_SHARED
>>>>>>> has such a dramatic effect.
>>>>>>
>>>>>> Because the buffer goes into GART. And then you try to blend on it,
>>>>>> which involves readback from GART (that's how the functions OSD is
>>>>>> based on work, I believe). We normally don't allocate renderable
>>>>>> surfaces or textures in GART.
>>>>>>
>>>>>>> We not only need this for DMA-buf based interop, but also for the DRI3
>>>>>>> based sharing of buffers with X.
>>>>>>>
>>>>>>> So that clearly sounds like a bug in nouveau to me.
>>>>>>
>>>>>> OK, so SHARED != GART? With nouveau, buffers are placed statically in
>>>>>> either VRAM or GART, so I think that if it's shared it has to end up
>>>>>> in GART, no?
>>>>>
>>>>> As far as I understand it, no. Shared just means that we can share it
>>>>> between applications, doesn't it? Or does it mean the buffer should be
>>>>> shareable between GPUs?
>>>>>
>>>>> Could be that my understanding was wrong, so if it's the latter, feel
>>>>> free to provide a patch to just remove the flag.
>>>>>
>>>>>> I'm pretty weak on all these concepts, as well as how the DRI3 stuff
>>>>>> works, unfortunately.
>>>>>
>>>>> I have to confess I'm not so deeply into this stuff either. Marek,
>>>>> Michel, what exactly is the meaning of the flag?
>>>>
>>>> According to src/gallium/docs/source/screen.rst:
>>>>
>>>> * ``PIPE_BIND_SHARED``: A sharable buffer that can be given to another
>>>>    process.
>>>>
>>>> It's also used e.g. for buffers shared via DRI3. So I'm afraid this is
>>>> something nouveau has to deal with better.
>>>
>>> Any suggestions that don't involve rewriting nouveau bo handling at
>>> every level (kernel, ddx, mesa)?
>>>
>>> Otherwise I'll send a revert for this change.
>>
>> PIPE_BIND_SHARED means texture_get_handle is expected to be used on
>> the resource, meaning that inter-API, inter-process, or inter-device
>> sharing is possible. All window back buffers should have the flag. If
>> they don't, it's a bug. If the flag causes nouveau to put the buffer
>> in GART, it's a bug too. There is no reason to use GART for inter-API
>> and inter-process sharing like VDPAU and DRI3 are.
>>
>> To be honest, the flag is practically useless with respect to EGL and
>> VDPAU, which allow sharing almost any texture.
>
>
> Actually I can think of a very good use case for the kernel where I could
> reduce the per command submission / per BO handle overhead significantly if
> we could mark BOs which are never shared between applications as such.

Sadly, most OpenGL allocations are sharable. Only internal driver
allocations aren't.

Marek
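
For illustration only, here is a minimal sketch of the placement policy
argued for in this thread: a shareable buffer stays in VRAM and only needs
GART while it is actually shared with another GPU. Every name below
(the SKETCH_* flags, sketch_domain, sketch_pick_domain) is hypothetical and
made up for this example. None of it is real Gallium or nouveau code, and
the flag values are not the real PIPE_BIND_* values.

```c
#include <stdbool.h>

/* Hypothetical bind flags, standing in for PIPE_BIND_*; values are made up. */
enum {
   SKETCH_BIND_RENDER_TARGET = 1u << 0,
   SKETCH_BIND_LINEAR        = 1u << 1,
   SKETCH_BIND_SHARED        = 1u << 2,
};

/* Hypothetical memory domains a driver might place a buffer in. */
enum sketch_domain {
   SKETCH_DOMAIN_VRAM,
   SKETCH_DOMAIN_GART,
};

/*
 * Assumed policy: the SHARED flag alone does not force GART. A buffer is
 * placed in GART only if it is shareable AND currently exported to a
 * different GPU; sharing between APIs or processes on the same GPU keeps
 * the buffer in VRAM.
 */
enum sketch_domain
sketch_pick_domain(unsigned bind, bool exported_to_other_gpu)
{
   if ((bind & SKETCH_BIND_SHARED) && exported_to_other_gpu)
      return SKETCH_DOMAIN_GART;

   return SKETCH_DOMAIN_VRAM;
}
```

With this policy, a VDPAU output surface created with the SHARED flag but
used only on one GPU would stay in VRAM, so blending for the OSD would not
read back from GART. A driver that places buffers statically, as described
for nouveau above, would still need some way to move or copy the buffer
once inter-device sharing is actually requested.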