[PATCH] drm/xe: Fix xe_assert usage when jobs timeout

Michal Wajdeczko michal.wajdeczko at intel.com
Wed Jan 10 20:22:21 UTC 2024



On 10.01.2024 02:43, Matthew Brost wrote:
> Kernel and VM jobs should not time out, but it is possible if the
> hardware encounters an error. Do not use asserts in this case, but rather a
> warn, as hardware issues should not result in an assert crashing the
> kernel.

What kind of crash was it?

xe_assert() uses drm_WARN(), which in turn uses WARN(),
and XE_WARN_ON() in the end also translates to WARN().
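
To make the point concrete, here is a minimal userspace sketch of the two
macro chains. It is only a model under stated assumptions, not the kernel
definitions: every MODEL_* macro, every model_* function, and the flag bit
values are hypothetical stand-ins. The one property it keeps is that both
chains end in the same warn-and-continue primitive.

```c
/* Userspace model only: these are NOT the kernel or xe definitions. */
#include <stdbool.h>
#include <stdio.h>

static int model_warn_count; /* number of warnings emitted so far */

/* Stand-in for WARN(): report and return the condition, never abort. */
static bool model_warn(bool cond, const char *origin, const char *expr)
{
	if (cond) {
		model_warn_count++;
		fprintf(stderr, "WARNING (%s): %s\n", origin, expr);
	}
	return cond;
}

/* Stand-in for drm_WARN(): simply forwards to the WARN() stand-in. */
#define MODEL_DRM_WARN(cond, expr) model_warn((cond), "drm_WARN", (expr))

/* Stand-in for xe_assert(): warns when the asserted condition is false. */
#define MODEL_XE_ASSERT(cond) ((void)MODEL_DRM_WARN(!(cond), #cond))

/* Stand-in for XE_WARN_ON(): warns when the condition is true. */
#define MODEL_XE_WARN_ON(cond) ((void)model_warn(!!(cond), "XE_WARN_ON", #cond))

/* Hypothetical bit values, chosen only for this sketch. */
#define MODEL_FLAG_KERNEL (1UL << 0)
#define MODEL_FLAG_VM     (1UL << 1)

/* Checks in the pre-patch form; returns how many warnings fired. */
static int model_checks_assert(unsigned long flags, bool killed)
{
	int before = model_warn_count;

	MODEL_XE_ASSERT(!(flags & MODEL_FLAG_KERNEL));
	MODEL_XE_ASSERT(!(flags & MODEL_FLAG_VM && !killed));
	return model_warn_count - before;
}

/* Checks in the post-patch form; returns how many warnings fired. */
static int model_checks_warn_on(unsigned long flags, bool killed)
{
	int before = model_warn_count;

	MODEL_XE_WARN_ON(flags & MODEL_FLAG_KERNEL);
	MODEL_XE_WARN_ON(flags & MODEL_FLAG_VM && !killed);
	return model_warn_count - before;
}
```

In this model both forms fire on exactly the same inputs and neither one
stops execution, so whatever crashed would have to come from something
other than the choice between the two macros.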

Is it due to the use of xe or xe->drm in xe_assert()?
But then the drm_notice() below would also crash.

> 
> Fixes: c73acc1eeba5 ("drm/xe: Use Xe assert macros instead of XE_WARN_ON macro")
> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> ---
>  drivers/gpu/drm/xe/xe_guc_submit.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> index 54ffcfcdd41f..751b822c23da 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> @@ -928,8 +928,8 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
>  	int i = 0;
>  
>  	if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &job->fence->flags)) {
> -		xe_assert(xe, !(q->flags & EXEC_QUEUE_FLAG_KERNEL));
> -		xe_assert(xe, !(q->flags & EXEC_QUEUE_FLAG_VM && !exec_queue_killed(q)));
> +		XE_WARN_ON(q->flags & EXEC_QUEUE_FLAG_KERNEL);
> +		XE_WARN_ON(q->flags & EXEC_QUEUE_FLAG_VM && !exec_queue_killed(q));
>  
>  		drm_notice(&xe->drm, "Timedout job: seqno=%u, guc_id=%d, flags=0x%lx",
>  			   xe_sched_job_seqno(job), q->guc->id, q->flags);

