[Mesa-dev] [PATCH] winsys/amdgpu: add back multithreaded command submission
Nicolai Hähnle
nhaehnle at gmail.com
Thu May 19 14:40:55 UTC 2016
On 18.05.2016 13:23, Marek Olšák wrote:
> ,On Wed, May 18, 2016 at 7:48 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>> On 18.05.2016 11:58, Marek Olšák wrote:
>>>
>>> On Sat, May 7, 2016 at 5:12 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>>>>
>>>> Looks good to me, just two remarks below...
>>>>
>>>>
>>>> On 06.05.2016 13:31, Marek Olšák wrote:
>>>>>
>>>>>
>>>>> From: Marek Olšák <marek.olsak at amd.com>
>>>>>
>>>>> Ported from the initial amdgpu winsys from the private AMD branch.
>>>>>
>>>>> The thread creates the buffer list, submits IBs, and cleans up
>>>>> the submission context, which can also destroy buffers.
>>>>>
>>>>> 3-5% reduction in CPU overhead is expected for apps submitting a lot
>>>>> of IBs per frame. This is most visible with DMA IBs.
>>>>> ---
>>>>> src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 26 ++-
>>>>> src/gallium/winsys/amdgpu/drm/amdgpu_bo.h | 4 +
>>>>> src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 311
>>>>> +++++++++++++++++---------
>>>>> src/gallium/winsys/amdgpu/drm/amdgpu_cs.h | 52 +++--
>>>>> src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 61 +++++
>>>>> src/gallium/winsys/amdgpu/drm/amdgpu_winsys.h | 9 +
>>>>> 6 files changed, 333 insertions(+), 130 deletions(-)
>>>>>
>>>>> diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
>>>>> b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
>>>>> index 37a41c0..ec5fa6a 100644
>>>>> --- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
>>>>> +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
>>>>> @@ -43,8 +43,21 @@ static bool amdgpu_bo_wait(struct pb_buffer *_buf,
>>>>> uint64_t timeout,
>>>>> {
>>>>> struct amdgpu_winsys_bo *bo = amdgpu_winsys_bo(_buf);
>>>>> struct amdgpu_winsys *ws = bo->ws;
>>>>> + int64_t abs_timeout;
>>>>> int i;
>>>>>
>>>>> + if (timeout == 0) {
>>>>> + if (p_atomic_read(&bo->num_active_ioctls))
>>>>> + return false;
>>>>> +
>>>>> + } else {
>>>>> + abs_timeout = os_time_get_absolute_timeout(timeout);
>>>>> +
>>>>> + /* Wait if any ioctl is being submitted with this buffer. */
>>>>> + if (!os_wait_until_zero_abs_timeout(&bo->num_active_ioctls,
>>>>> abs_timeout))
>>>>> + return false;
>>>>> + }
>>>>
>>>>
>>>>
>>>> I'd suggest to do the cs_sync_flush here instead of below - there is less
>>>> action at a distance, and some additional code paths end up covered by a
>>>> flush as well.
>>>
>>>
>>> Unfortunately, amdgpu_bo_wait is exposed via the winsys interface and
>>> doesn't accept a CS. We could extend it to accept a CS or two, but
>>> that would mean adding most of what r600_buffer_map_sync_with_rings is
>>> doing. It would be a bigger cleanup and I think it should be done as a
>>> separate patch if we wanted to go down that road.
>>
>>
>> Okay, fair enough. Let's keep an eye out for whether some use cases get into
>> busy waits, just in case.
>>
>> The amdgpu_cs_sync_flush should be added to the other branch of the
>> PIPE_TRANSFER_WRITE check in amdgpu_bo_map though, right?
>
> Yes, good catch.
>
> Did you mean this with the semaphore?
> https://cgit.freedesktop.org/~mareko/mesa/commit/?h=tmp&id=401abd01d5f44430df71de9999b74b6e78a0eac2
Yes, that looks good to me. With that change, the patch is
Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
> Marek
>
More information about the mesa-dev
mailing list