[PATCH v2] drm/xe: Enlarge the invalidation timeout from 150 to 500
Lucas De Marchi
lucas.demarchi at intel.com
Thu Oct 10 16:17:18 UTC 2024
On Wed, Oct 09, 2024 at 09:30:57PM +0000, Shuicheng Lin wrote:
>There are error messages like below that are occurring during stress testing:
>"[ 31.004009] xe 0000:03:00.0: [drm] ERROR GT0: Global invalidation timeout"
>Changing the timeout value from 150 to 500 could help avoid these error messages
>in the stress tests.

Could help, or does help?

Can you run the same stress test and put a conclusive statement here, like:

	Previously it was hitting this after <X> executions of
	<name-of-stress-tests>. After raising it to 500, <Y> executions passed
	and it didn't fail.

thanks
Lucas De Marchi

>
>Because xe_mmio_wait32() polls with wait increments that start small, it returns
>as soon as the register matches the expected value rather than sitting out the
>full timeout. So, the larger timeout value should have no effect during normal
>use cases.
>
>Signed-off-by: Shuicheng Lin <shuicheng.lin at intel.com>
>Cc: Matthew Auld <matthew.auld at intel.com>
>Cc: Nirmoy Das <nirmoy.das at intel.com>
>Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
>Tested-by: Zongyao Bai <zongyao.bai at intel.com>
>---
> drivers/gpu/drm/xe/xe_device.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>index cd241a8e1838..60aebf7561ec 100644
>--- a/drivers/gpu/drm/xe/xe_device.c
>+++ b/drivers/gpu/drm/xe/xe_device.c
>@@ -925,7 +925,7 @@ void xe_device_l2_flush(struct xe_device *xe)
> 	spin_lock(&gt->global_invl_lock);
> 	xe_mmio_write32(&gt->mmio, XE2_GLOBAL_INVAL, 0x1);
>
>-	if (xe_mmio_wait32(&gt->mmio, XE2_GLOBAL_INVAL, 0x1, 0x0, 150, NULL, true))
>+	if (xe_mmio_wait32(&gt->mmio, XE2_GLOBAL_INVAL, 0x1, 0x0, 500, NULL, true))
> 		xe_gt_err_once(gt, "Global invalidation timeout\n");
> 	spin_unlock(&gt->global_invl_lock);
>
>--
>2.25.1
>
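
To illustrate the early-return argument in the commit message, here is a minimal userspace sketch of a poll loop with growing wait increments. It is not the driver's xe_mmio_wait32(): the fake register, the simulated clock, the 10 us starting step, the doubling, and every name below are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a hardware register: it reads as busy (0x1)
 * until ready_at_us of simulated time has passed, then reads as 0x0. */
struct fake_reg {
	uint32_t ready_at_us;
	uint32_t now_us;	/* simulated clock, advanced by fake_delay() */
};

static uint32_t fake_read(const struct fake_reg *r)
{
	return r->now_us >= r->ready_at_us ? 0x0 : 0x1;
}

static void fake_delay(struct fake_reg *r, uint32_t us)
{
	r->now_us += us;
}

/*
 * Poll until (reg & mask) == want or the time budget is used up.
 * The wait step starts at 10 us (assumed) and doubles after every delay,
 * clamped to the remaining budget.
 * Returns 0 on a match, -1 on timeout; *elapsed_us gets the time spent.
 */
static int poll_backoff(struct fake_reg *r, uint32_t mask, uint32_t want,
			uint32_t timeout_us, uint32_t *elapsed_us)
{
	uint32_t start = r->now_us;
	uint32_t step = 10;
	int ret = -1;

	for (;;) {
		if ((fake_read(r) & mask) == want) {
			ret = 0;	/* matched: return without using the rest of the budget */
			break;
		}

		uint32_t spent = r->now_us - start;
		if (spent >= timeout_us)
			break;		/* budget exhausted */

		uint32_t left = timeout_us - spent;
		fake_delay(r, step < left ? step : left);
		step *= 2;
	}

	*elapsed_us = r->now_us - start;
	return ret;
}
```

In this model a register that clears after 30 us costs 30 us whether the budget is 150 or 500, so the larger value only matters for the slow case: a register that needs 300 us times out with a budget of 150 and succeeds with 500.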