[PATCH] amdgpu: fix multi-process hang issue

Christian König ckoenig.leichtzumerken at gmail.com
Wed Aug 22 12:16:08 UTC 2018


On 22.08.2018 at 14:07, Emily Deng wrote:
> SWDEV-146499: hang during multi-process Vulkan testing
>
> cause:
> The second frame's PREAMBLE_IB contains clear-state
> and LOAD actions. Those actions corrupt the pipeline
> while it is still processing the previous frame's
> workload IB.
>
> fix:
> Insert a pipeline sync on a context switch under
> SRIOV (because only SRIOV reports the PREEMPTION flag
> to the UMD).
>
> Signed-off-by: Monk Liu <Monk.Liu at amd.com>
> Signed-off-by: Emily Deng <Emily.Deng at amd.com>

Much better, patch is Reviewed-by: Christian König 
<christian.koenig at amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index 5c22cfd..47817e0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -165,8 +165,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   		return r;
>   	}
>   
> +	need_ctx_switch = ring->current_ctx != fence_ctx;
>   	if (ring->funcs->emit_pipeline_sync && job &&
>   	    ((tmp = amdgpu_sync_get_fence(&job->sched_sync, NULL)) ||
> +	     (amdgpu_sriov_vf(adev) && need_ctx_switch) ||
>   	     amdgpu_vm_need_pipeline_sync(ring, job))) {
>   		need_pipe_sync = true;
>   
> @@ -201,7 +203,6 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
>   	}
>   
>   	skip_preamble = ring->current_ctx == fence_ctx;
> -	need_ctx_switch = ring->current_ctx != fence_ctx;
>   	if (job && ring->funcs->emit_cntxcntl) {
>   		if (need_ctx_switch)
>   			status |= AMDGPU_HAVE_CTX_SWITCH;
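
For readers following the thread, the condition after this patch can be sketched as a standalone predicate. This is only an illustrative model, not driver code: struct sync_inputs, its boolean fields, and need_pipeline_sync() are hypothetical stand-ins for the ring, job, and adev state that amdgpu_ib_schedule() actually inspects. It also shows why need_ctx_switch has to be computed before the sync check rather than further down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified inputs; the real code reads these from
 * struct amdgpu_ring, struct amdgpu_job and struct amdgpu_device. */
struct sync_inputs {
	bool has_emit_pipeline_sync; /* ring->funcs->emit_pipeline_sync is set */
	bool has_job;                /* job != NULL */
	bool has_sched_fence;        /* amdgpu_sync_get_fence() returned a fence */
	bool is_sriov_vf;            /* amdgpu_sriov_vf(adev) */
	bool vm_needs_sync;          /* amdgpu_vm_need_pipeline_sync(ring, job) */
	uint64_t current_ctx;        /* ring->current_ctx */
	uint64_t fence_ctx;          /* context of the job being scheduled */
};

/* Models the patched condition: under SRIOV a context switch alone
 * is enough to request a pipeline sync. */
static bool need_pipeline_sync(const struct sync_inputs *in)
{
	/* Must be known before the check below, hence the moved assignment. */
	bool need_ctx_switch = in->current_ctx != in->fence_ctx;

	return in->has_emit_pipeline_sync && in->has_job &&
	       (in->has_sched_fence ||
		(in->is_sriov_vf && need_ctx_switch) ||
		in->vm_needs_sync);
}

/* Small self-check covering the case the patch changes. */
static void need_pipeline_sync_demo(void)
{
	struct sync_inputs in = {
		.has_emit_pipeline_sync = true,
		.has_job = true,
		.is_sriov_vf = true,
		.current_ctx = 1,
		.fence_ctx = 2,
	};

	assert(need_pipeline_sync(&in));  /* SRIOV + context switch */

	in.is_sriov_vf = false;
	assert(!need_pipeline_sync(&in)); /* bare metal: unchanged behaviour */
}
```

In this model, a bare-metal ring with no scheduler fence and no VM-related sync behaves exactly as before the patch; only the SRIOV context-switch case gains the extra sync.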


