[Mesa-dev] [PATCH] winsys/amdgpu: add back multithreaded command submission
Marek Olšák
maraeo at gmail.com
Wed May 18 18:23:47 UTC 2016
,On Wed, May 18, 2016 at 7:48 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
> On 18.05.2016 11:58, Marek Olšák wrote:
>>
>> On Sat, May 7, 2016 at 5:12 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>>>
>>> Looks good to me, just two remarks below...
>>>
>>>
>>> On 06.05.2016 13:31, Marek Olšák wrote:
>>>>
>>>>
>>>> From: Marek Olšák <marek.olsak at amd.com>
>>>>
>>>> Ported from the initial amdgpu winsys from the private AMD branch.
>>>>
>>>> The thread creates the buffer list, submits IBs, and cleans up
>>>> the submission context, which can also destroy buffers.
>>>>
>>>> 3-5% reduction in CPU overhead is expected for apps submitting a lot
>>>> of IBs per frame. This is most visible with DMA IBs.
>>>> ---
>>>> src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 26 ++-
>>>> src/gallium/winsys/amdgpu/drm/amdgpu_bo.h | 4 +
>>>> src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 311
>>>> +++++++++++++++++---------
>>>> src/gallium/winsys/amdgpu/drm/amdgpu_cs.h | 52 +++--
>>>> src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 61 +++++
>>>> src/gallium/winsys/amdgpu/drm/amdgpu_winsys.h | 9 +
>>>> 6 files changed, 333 insertions(+), 130 deletions(-)
>>>>
>>>> diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
>>>> b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
>>>> index 37a41c0..ec5fa6a 100644
>>>> --- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
>>>> +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
>>>> @@ -43,8 +43,21 @@ static bool amdgpu_bo_wait(struct pb_buffer *_buf,
>>>> uint64_t timeout,
>>>> {
>>>> struct amdgpu_winsys_bo *bo = amdgpu_winsys_bo(_buf);
>>>> struct amdgpu_winsys *ws = bo->ws;
>>>> + int64_t abs_timeout;
>>>> int i;
>>>>
>>>> + if (timeout == 0) {
>>>> + if (p_atomic_read(&bo->num_active_ioctls))
>>>> + return false;
>>>> +
>>>> + } else {
>>>> + abs_timeout = os_time_get_absolute_timeout(timeout);
>>>> +
>>>> + /* Wait if any ioctl is being submitted with this buffer. */
>>>> + if (!os_wait_until_zero_abs_timeout(&bo->num_active_ioctls,
>>>> abs_timeout))
>>>> + return false;
>>>> + }
>>>
>>>
>>>
>>> I'd suggest to do the cs_sync_flush here instead of below - there is less
>>> action at a distance, and some additional code paths end up covered by a
>>> flush as well.
>>
>>
>> Unfortunately, amdgpu_bo_wait is exposed via the winsys interface and
>> doesn't accept a CS. We could extend it to accept a CS or two, but
>> that would mean adding most of what r600_buffer_map_sync_with_rings is
>> doing. It would be a bigger cleanup and I think it should be done as a
>> separate patch if we wanted to go down that road.
>
>
> Okay, fair enough. Let's keep an eye out for whether some use cases get into
> busy waits, just in case.
>
> The amdgpu_cs_sync_flush should be added to the other branch of the
> PIPE_TRANSFER_WRITE check in amdgpu_bo_map though, right?
Yes, good catch.
Did you mean this with the semaphore?
https://cgit.freedesktop.org/~mareko/mesa/commit/?h=tmp&id=401abd01d5f44430df71de9999b74b6e78a0eac2
Marek
More information about the mesa-dev
mailing list