[Mesa-dev] [PATCH] winsys/amdgpu: add back multithreaded command submission

Marek Olšák maraeo at gmail.com
Wed May 18 18:23:47 UTC 2016


,On Wed, May 18, 2016 at 7:48 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
> On 18.05.2016 11:58, Marek Olšák wrote:
>>
>> On Sat, May 7, 2016 at 5:12 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>>>
>>> Looks good to me, just two remarks below...
>>>
>>>
>>> On 06.05.2016 13:31, Marek Olšák wrote:
>>>>
>>>>
>>>> From: Marek Olšák <marek.olsak at amd.com>
>>>>
>>>> Ported from the initial amdgpu winsys from the private AMD branch.
>>>>
>>>> The thread creates the buffer list, submits IBs, and cleans up
>>>> the submission context, which can also destroy buffers.
>>>>
>>>> 3-5% reduction in CPU overhead is expected for apps submitting a lot
>>>> of IBs per frame. This is most visible with DMA IBs.
>>>> ---
>>>>    src/gallium/winsys/amdgpu/drm/amdgpu_bo.c     |  26 ++-
>>>>    src/gallium/winsys/amdgpu/drm/amdgpu_bo.h     |   4 +
>>>>    src/gallium/winsys/amdgpu/drm/amdgpu_cs.c     | 311
>>>> +++++++++++++++++---------
>>>>    src/gallium/winsys/amdgpu/drm/amdgpu_cs.h     |  52 +++--
>>>>    src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c |  61 +++++
>>>>    src/gallium/winsys/amdgpu/drm/amdgpu_winsys.h |   9 +
>>>>    6 files changed, 333 insertions(+), 130 deletions(-)
>>>>
>>>> diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
>>>> b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
>>>> index 37a41c0..ec5fa6a 100644
>>>> --- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
>>>> +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
>>>> @@ -43,8 +43,21 @@ static bool amdgpu_bo_wait(struct pb_buffer *_buf,
>>>> uint64_t timeout,
>>>>    {
>>>>       struct amdgpu_winsys_bo *bo = amdgpu_winsys_bo(_buf);
>>>>       struct amdgpu_winsys *ws = bo->ws;
>>>> +   int64_t abs_timeout;
>>>>       int i;
>>>>
>>>> +   if (timeout == 0) {
>>>> +      if (p_atomic_read(&bo->num_active_ioctls))
>>>> +         return false;
>>>> +
>>>> +   } else {
>>>> +      abs_timeout = os_time_get_absolute_timeout(timeout);
>>>> +
>>>> +      /* Wait if any ioctl is being submitted with this buffer. */
>>>> +      if (!os_wait_until_zero_abs_timeout(&bo->num_active_ioctls,
>>>> abs_timeout))
>>>> +         return false;
>>>> +   }
>>>
>>>
>>>
>>> I'd suggest to do the cs_sync_flush here instead of below - there is less
>>> action at a distance, and some additional code paths end up covered by a
>>> flush as well.
>>
>>
>> Unfortunately, amdgpu_bo_wait is exposed via the winsys interface and
>> doesn't accept a CS. We could extend it to accept a CS or two, but
>> that would mean adding most of what r600_buffer_map_sync_with_rings is
>> doing. It would be a bigger cleanup and I think it should be done as a
>> separate patch if we wanted to go down that road.
>
>
> Okay, fair enough. Let's keep an eye out for whether some use cases get into
> busy waits, just in case.
>
> The amdgpu_cs_sync_flush should be added to the other branch of the
> PIPE_TRANSFER_WRITE check in amdgpu_bo_map though, right?

Yes, good catch.

Did you mean this with the semaphore?
https://cgit.freedesktop.org/~mareko/mesa/commit/?h=tmp&id=401abd01d5f44430df71de9999b74b6e78a0eac2

Marek


More information about the mesa-dev mailing list