[igt-dev] [PATCH] xe_exec_reset: Fix cm-gt-reset for LR job behavior
Rodrigo Vivi
rodrigo.vivi at intel.com
Wed Aug 23 19:45:07 UTC 2023
On Tue, Aug 08, 2023 at 03:27:10PM -0700, Matthew Brost wrote:
> Long running jobs in Xe are not recoverable even if the job did not
> trigger the GT reset due to DRM scheduler not tracking LR jobs. Update
> cm-gt-reset to understand all LR jobs are lost after a GT reset.
>
> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> ---
> tests/xe/xe_exec_reset.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/tests/xe/xe_exec_reset.c b/tests/xe/xe_exec_reset.c
> index dfbaa6035..e8faf6209 100644
> --- a/tests/xe/xe_exec_reset.c
> +++ b/tests/xe/xe_exec_reset.c
> @@ -622,8 +622,10 @@ test_compute_mode(int fd, struct drm_xe_engine_class_instance *eci,
> xe_exec(fd, &exec);
> }
>
> - if (flags & GT_RESET)
> + if (flags & GT_RESET) {
> xe_force_gt_reset(fd, eci->gt_id);
> + usleep(150000); /* Let GT reset soak */

Do we really need this here? If so, why?

> + }
>
> if (flags & CLOSE_FD) {
> if (flags & CLOSE_ENGINES) {
> @@ -636,7 +638,7 @@ test_compute_mode(int fd, struct drm_xe_engine_class_instance *eci,
> return;
> }
>
> - for (i = 1; i < n_execs; i++)
> + for (i = 1; i < n_execs && !(flags & GT_RESET); i++)
> xe_wait_ufence(fd, &data[i].exec_sync, USER_FENCE_VALUE,
> NULL, THREE_SEC);
>
> @@ -644,7 +646,7 @@ test_compute_mode(int fd, struct drm_xe_engine_class_instance *eci,
> xe_vm_unbind_async(fd, vm, 0, 0, addr, bo_size, sync, 1);
> xe_wait_ufence(fd, &data[0].vm_sync, USER_FENCE_VALUE, NULL, THREE_SEC);
>
> - for (i = 1; i < n_execs; i++)
> + for (i = 1; i < n_execs && !(flags & GT_RESET); i++)
> igt_assert_eq(data[i].data, 0xc0ffee);
>
> for (i = 0; i < n_engines; i++)
> --
> 2.34.1
>
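
For reference, the guard added in the last two hunks can be read in isolation. The following is a minimal standalone sketch, not code from the test: the flag values and the helper name count_checked_execs() are hypothetical and exist only for illustration. It shows that once GT_RESET is set, the loop condition is false on the first evaluation, so none of the per-exec waits or asserts run.

```c
/* Hypothetical flag values for illustration; the real test defines its own. */
#define CLOSE_FD (1u << 0)
#define GT_RESET (1u << 1)

/*
 * Count how many per-exec checks a loop guarded like the patched one
 * would perform. The loop header mirrors the patch:
 *   for (i = 1; i < n_execs && !(flags & GT_RESET); i++)
 */
static int count_checked_execs(int n_execs, unsigned int flags)
{
	int checked = 0;
	int i;

	for (i = 1; i < n_execs && !(flags & GT_RESET); i++)
		checked++;

	return checked;
}
```

Because flags does not change inside the loop, the guard behaves the same as wrapping the whole loop in if (!(flags & GT_RESET)); which of the two forms reads better is a style choice.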