[PATCH] drm/xe: skip error capture when exec queue is killed

Matthew Brost matthew.brost at intel.com
Tue Apr 30 15:46:27 UTC 2024


On Thu, Apr 25, 2024 at 05:59:31PM +0530, Tejas Upadhyay wrote:
> When the user closes an exec queue soon after job submission,
> we generate an error coredump. Instead, check whether the
> exec queue was killed when the job times out; if so, skip
> the error coredump capture, free the job, and return the
> proper scheduler state.
> 
> Signed-off-by: Tejas Upadhyay <tejas.upadhyay at intel.com>

Reviewed-by: Matthew Brost <matthew.brost at intel.com>

> ---
>  drivers/gpu/drm/xe/xe_guc_submit.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> index 93e1ee183e4a..376a2c04e899 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> @@ -971,7 +971,8 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
>  	 * TDR has fired before free job worker. Common if exec queue
>  	 * immediately closed after last fence signaled.
>  	 */
> -	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &job->fence->flags)) {
> +	if (exec_queue_killed(q) ||
> +	    test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &job->fence->flags)) {
>  		guc_exec_queue_free_job(drm_job);
>  
>  		return DRM_GPU_SCHED_STAT_NOMINAL;
> -- 
> 2.25.1
> 
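
For anyone following the control flow, the early-exit condition in the hunk
above can be sketched as a small standalone model. This is only an
illustration under stated assumptions: `model_queue`, `model_job`,
`model_timedout_job()` and the `MODEL_STAT_*` values are hypothetical
stand-ins, not the xe driver's types or the DRM scheduler's return codes,
and the real free and capture paths are reduced to flags.

```c
#include <stdbool.h>

/* Hypothetical markers for which path the model took; not DRM scheduler values. */
enum model_stat {
	MODEL_STAT_NOMINAL,      /* early exit: job freed, no capture */
	MODEL_STAT_CAPTURE_PATH, /* fell through to the error-capture path */
};

/* Hypothetical stand-in for the exec queue; only the killed state is modeled. */
struct model_queue {
	bool killed;
};

/* Hypothetical stand-in for a scheduler job and its fence state. */
struct model_job {
	struct model_queue *q;
	bool fence_signaled;  /* models DMA_FENCE_FLAG_SIGNALED_BIT being set */
	bool freed;           /* set where the driver would free the job */
	bool coredump_taken;  /* set where the driver would capture an error dump */
};

/*
 * Model of the timeout decision: if the queue was killed, or the job's
 * fence already signaled, free the job and skip error capture.
 */
static enum model_stat model_timedout_job(struct model_job *job)
{
	if (job->q->killed || job->fence_signaled) {
		job->freed = true;
		return MODEL_STAT_NOMINAL;
	}

	/* Only a live queue with an unsignaled fence reaches the capture path. */
	job->coredump_taken = true;
	return MODEL_STAT_CAPTURE_PATH;
}
```

In this model, a killed queue takes the same early exit as an
already-signaled fence, so closing the queue right after submission no
longer reaches the capture path.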