[PATCH] drm/amdgpu: fix ring timeout issue in gfx10 sr-iov environment

Christian König christian.koenig at amd.com
Mon Jan 20 08:59:07 UTC 2025


On 17.01.25 at 07:05, cao, lin wrote:
> -----Original Message-----
> From: Lin.Cao <lincao12 at amd.com>
> Sent: Tuesday, January 14, 2025 6:06 PM
> To: amd-gfx at lists.freedesktop.org
> Cc: Koenig, Christian <Christian.Koenig at amd.com>; Deucher, Alexander <Alexander.Deucher at amd.com>; cao, lin <lin.cao at amd.com>
> Subject: [PATCH] drm/amdgpu: fix ring timeout issue in gfx10 sr-iov environment
>
> Commit 6e66dc05b54f ("drm/amdgpu: set the VM pointer to NULL in amdgpu_job_prepare") sets job->vm to NULL if there is no fence. As a result, emitting the switch buffer is skipped whenever job->vm is NULL.
>
> Checking job rather than vm solves this problem.

Good catch.

>
> Signed-off-by: Lin.Cao <lincao12 at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index e0bc37557d2c..2ea98ec60220 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -297,7 +297,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned int num_ibs,
>          amdgpu_ring_patch_cond_exec(ring, cond_exec);
>
>          ring->current_ctx = fence_ctx;
> -       if (vm && ring->funcs->emit_switch_buffer)
> +       if (job && ring->funcs->emit_switch_buffer)

It may be better to use "job && job->vmid && ..." here.

You should also remove the vm variable and see if there is anything else 
using it.
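
To make the difference between the three conditions concrete, here is a minimal standalone sketch. It is not driver code: the mock_* types and check_* helpers are hypothetical stand-ins that model only the fields involved, and it assumes that a vmid of 0 means no VMID was assigned to the job.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; only the fields
 * relevant to the check are modeled. */
struct mock_vm { int unused; };

struct mock_job {
	struct mock_vm *vm;  /* may already be NULL by the time the IB is scheduled */
	unsigned int vmid;   /* assumption: 0 means "no VMID assigned" */
};

struct mock_ring {
	bool has_emit_switch_buffer; /* models ring->funcs->emit_switch_buffer != NULL */
};

/* Original condition: keyed on the vm pointer taken from the job. */
static bool check_vm(const struct mock_ring *ring, const struct mock_job *job)
{
	const struct mock_vm *vm = job ? job->vm : NULL;

	return vm != NULL && ring->has_emit_switch_buffer;
}

/* Condition from the patch: keyed on the job itself. */
static bool check_job(const struct mock_ring *ring, const struct mock_job *job)
{
	return job != NULL && ring->has_emit_switch_buffer;
}

/* Condition suggested in the review: job present and a VMID assigned. */
static bool check_job_vmid(const struct mock_ring *ring, const struct mock_job *job)
{
	return job != NULL && job->vmid != 0 && ring->has_emit_switch_buffer;
}

/* Walks through the case described in the commit message. */
static void self_check(void)
{
	struct mock_ring ring = { .has_emit_switch_buffer = true };
	struct mock_job cleared = { .vm = NULL, .vmid = 3 };

	assert(!check_vm(&ring, &cleared));      /* switch buffer would be skipped */
	assert(check_job(&ring, &cleared));      /* patch: emitted */
	assert(check_job_vmid(&ring, &cleared)); /* suggestion: emitted */
}
```

With the vm pointer cleared but a VMID assigned, only the vm-based check skips the switch buffer. The two job-based variants differ only for a job whose vmid is 0, where the suggested check skips it as well.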

Regards,
Christian.

>                  amdgpu_ring_emit_switch_buffer(ring);
>
>          if (ring->funcs->emit_wave_limit &&
> --
> 2.46.1
>
