[PATCH] drm/amdgpu: restore wait-free fastpath for GEM_WAIT_IDLE

Christian König christian.koenig at amd.com
Mon Jan 27 16:14:27 UTC 2025


Am 27.01.25 um 17:02 schrieb Lucas Stach:
> This effectively reverts 0fea2ed61e7f ("drm/amdgpu: Remove call to
> reservation_object_test_signaled_rcu before wait"), as the premise of
> that commit is wrong. dma_resv_wait_timeout() without a timeout will
> not turn into a wait-free dma_resv_test_signaled, but will wait a
> jiffy for unsignaled fences, which is not at all what userspace expects
> when it calls GEM_WAIT_IDLE with a timeout of 0.

Marek pinged me about that strange behavior as well. That sounds like a 
separate bug in dma_resv_wait_timeout() to me.

I don't see why the function should be waiting with a timeout of 0 and 
we have multiple cases where that is assumed.

What should happen is that dma_resv_wait_timeout() should return 1 when 
all fences are signaled even when the timeout is 0.

My educated guess is that this behavior is somehow broken and instead we 
wait for at least 1 jiffies.

This here is probably just the tip of the iceberg.

Regards,
Christian.

>
> Signed-off-by: Lucas Stach <l.stach at pengutronix.de>
> ---
> This is most likely the correct kernel-side solution for the unexpected
> slowdown worked around with in userspace with this Mesa series:
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32877
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 9 +++++++--
>   1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index 1a5df8b94661..75b5d5149911 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -567,8 +567,13 @@ int amdgpu_gem_wait_idle_ioctl(struct drm_device *dev, void *data,
>   		return -ENOENT;
>   
>   	robj = gem_to_amdgpu_bo(gobj);
> -	ret = dma_resv_wait_timeout(robj->tbo.base.resv, DMA_RESV_USAGE_READ,
> -				    true, timeout);
> +	if (timeout == 0)
> +		ret = dma_resv_test_signaled(robj->tbo.base.resv,
> +					     DMA_RESV_USAGE_READ);
> +	else
> +		ret = dma_resv_wait_timeout(robj->tbo.base.resv,
> +					    DMA_RESV_USAGE_READ,
> +					    true, timeout);
>   
>   	/* ret == 0 means not signaled,
>   	 * ret > 0 means signaled



More information about the amd-gfx mailing list