[PATCH] drm/xe: Fix xe_assert usage when jobs timeout
Michal Wajdeczko
michal.wajdeczko at intel.com
Wed Jan 10 20:22:21 UTC 2024
On 10.01.2024 02:43, Matthew Brost wrote:
> Neither kernel nor VM jobs should time out, but a timeout is possible if
> the hardware encounters an error. Do not use asserts in this case; use a
> warning instead, as hardware issues should not result in an assert
> crashing the kernel.

What kind of crash was it?

xe_assert() uses drm_WARN(), which in turn uses WARN(), and XE_WARN_ON()
ultimately translates to WARN() as well.

Is it due to the use of xe or xe->drm in xe_assert()? If so, the
drm_notice() below will crash too.

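To make the point about the macro chain concrete, here is a minimal userspace sketch, not the kernel definitions. Every model_* and MODEL_* name, and both flag values, are hypothetical stand-ins invented for illustration. The real macros live in the kernel headers and their exact expansion may depend on the kernel configuration. The sketch only assumes what is stated above: both paths end in a WARN()-style report that does not terminate execution, and the only visible difference is the polarity of the condition (xe_assert() reports when its condition is false, XE_WARN_ON() reports when it is true).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Number of warnings emitted, so the behaviour can be checked. */
static int warn_count;

/*
 * Hypothetical model of WARN(): report when cond is true, return cond,
 * and never terminate the program.
 */
static int model_warn(int cond, const char *fmt, ...)
{
	if (cond) {
		va_list ap;

		warn_count++;
		fputs("WARNING: ", stderr);
		va_start(ap, fmt);
		vfprintf(stderr, fmt, ap);
		va_end(ap);
	}
	return cond;
}

/* Hypothetical model of XE_WARN_ON(): warns when cond is TRUE. */
#define MODEL_XE_WARN_ON(cond) \
	model_warn(!!(cond), "%s\n", #cond)

/* Hypothetical model of xe_assert(): warns when cond is FALSE. */
#define MODEL_XE_ASSERT(cond) \
	((void)model_warn(!(cond), "Assertion `%s` failed!\n", #cond))

/* Flag values are made up for this sketch, not the driver's. */
#define MODEL_FLAG_KERNEL (1UL << 0)
#define MODEL_FLAG_VM     (1UL << 1)

/*
 * Mirror the two checks from the quoted hunk in either style and return
 * how many warnings were emitted for the given queue state.
 */
static int count_timeout_warnings(unsigned long flags, int killed,
				  int use_assert)
{
	warn_count = 0;
	if (use_assert) {
		MODEL_XE_ASSERT(!(flags & MODEL_FLAG_KERNEL));
		MODEL_XE_ASSERT(!((flags & MODEL_FLAG_VM) && !killed));
	} else {
		MODEL_XE_WARN_ON(flags & MODEL_FLAG_KERNEL);
		MODEL_XE_WARN_ON((flags & MODEL_FLAG_VM) && !killed);
	}
	return warn_count;
}

/* Both styles should report the same number of problems for any state. */
static void model_self_check(void)
{
	unsigned long flags;
	int killed;

	for (flags = 0; flags < 4; flags++)
		for (killed = 0; killed < 2; killed++)
			assert(count_timeout_warnings(flags, killed, 1) ==
			       count_timeout_warnings(flags, killed, 0));
}
```

In this model the two styles are interchangeable: each reports the same conditions and execution continues afterwards, which is why the description of a "crash" needs more detail.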
>
> Fixes: c73acc1eeba5 ("drm/xe: Use Xe assert macros instead of XE_WARN_ON macro")
> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> ---
> drivers/gpu/drm/xe/xe_guc_submit.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> index 54ffcfcdd41f..751b822c23da 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> @@ -928,8 +928,8 @@ guc_exec_queue_timedout_job(struct drm_sched_job *drm_job)
>  	int i = 0;
>
>  	if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &job->fence->flags)) {
> -		xe_assert(xe, !(q->flags & EXEC_QUEUE_FLAG_KERNEL));
> -		xe_assert(xe, !(q->flags & EXEC_QUEUE_FLAG_VM && !exec_queue_killed(q)));
> +		XE_WARN_ON(q->flags & EXEC_QUEUE_FLAG_KERNEL);
> +		XE_WARN_ON(q->flags & EXEC_QUEUE_FLAG_VM && !exec_queue_killed(q));
>
>  		drm_notice(&xe->drm, "Timedout job: seqno=%u, guc_id=%d, flags=0x%lx",
>  			   xe_sched_job_seqno(job), q->guc->id, q->flags);