[Intel-xe] [PATCH 3/4] drm/xe/selftests: restart GT after xe_bo_restore_kernel()

Nirmoy Das nirmoy.das at linux.intel.com
Thu Jul 13 15:34:42 UTC 2023


On 7/13/2023 11:41 AM, Matthew Auld wrote:
> The test seems to be failing badly after calling xe_bo_restore_kernel().
> Taking a snapshot of the CTB and copying back a potentially old version
> seems risky, depending on what might have been in flight. Also, it seems
> that snapshotting the ADS object and copying it back results in serious
> breakage. Normally when calling xe_bo_restore_kernel() we always fully
> restart the GT, which re-initializes such things. We could potentially
> skip saving and restoring such objects in xe_bo_evict_all(), however it
> seems quite fragile not to also restart the GT. Try to do that here by
> triggering a GT reset.
>
> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
> Cc: Matthew Brost <matthew.brost at intel.com>
Acked-by: Nirmoy Das <nirmoy.das at intel.com>
> ---
>   drivers/gpu/drm/xe/tests/xe_bo.c | 14 ++++++++++++++
>   1 file changed, 14 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/tests/xe_bo.c b/drivers/gpu/drm/xe/tests/xe_bo.c
> index 6aad1443b00e..21c6dfef8dc7 100644
> --- a/drivers/gpu/drm/xe/tests/xe_bo.c
> +++ b/drivers/gpu/drm/xe/tests/xe_bo.c
> @@ -220,7 +220,21 @@ static int evict_test_run_gt(struct xe_device *xe, struct xe_gt *gt, struct kuni
>   			goto cleanup_all;
>   		}
>   
> +		xe_gt_sanitize(gt);
>   		err = xe_bo_restore_kernel(xe);
> +		/*
> +		 * Snapshotting the CTB and copying back a potentially old
> +		 * version seems risky, depending on what might have been
> +		 * in flight. Also, it seems snapshotting the ADS object and
> +		 * copying it back results in serious breakage. Normally when
> +		 * calling xe_bo_restore_kernel() we always fully restart the
> +		 * GT, which re-initializes such things. We could potentially
> +		 * skip saving and restoring such objects in xe_bo_evict_all(),
> +		 * however it seems quite fragile not to also restart the GT.
> +		 * Try to do that here by triggering a GT reset.
> +		 */
> +		xe_gt_reset_async(gt);
> +		flush_work(&gt->reset.worker);
>   		if (err) {
>   			KUNIT_FAIL(test, "restore kernel err=%pe\n",
>   				   ERR_PTR(err));

