[Intel-xe] [PATCH v2 3/3] drm/xe/pvc: Use fast copy engines as migrate engine on PVC

Fri Aug 18 12:59:02 UTC 2023

On Thu, Aug 17, 2023 at 09:29:06AM -0700, Matt Roper wrote:
>On Thu, Aug 17, 2023 at 04:03:52AM -0700, Niranjana Vishwanathapura wrote:
>> Some copy hardware engine instances are faster than others on PVC.
>> Use a virtual engine of these plus the reserved instance for the migrate
>> engine on PVC. The idea being if a fast instance is available it will be
>> used and the throughput of kernel copies, clears, and pagefault
>> servicing will be higher.
>>
>> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
>> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura at intel.com>
>> ---
>>  drivers/gpu/drm/xe/xe_hw_engine.c | 31 +++++++++++++++++++++++++++++++
>>  drivers/gpu/drm/xe/xe_hw_engine.h |  1 +
>>  drivers/gpu/drm/xe/xe_migrate.c   |  7 ++++---
>>  3 files changed, 36 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
>> index 4c812d04e182..00329efbd968 100644
>> --- a/drivers/gpu/drm/xe/xe_hw_engine.c
>> +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
>> @@ -791,3 +791,34 @@ bool xe_hw_engine_is_reserved(struct xe_hw_engine *hwe)
>>  	return xe->info.supports_usm && hwe->class == XE_ENGINE_CLASS_COPY &&
>>  		hwe->instance == gt->usm.reserved_bcs_instance;
>>  }
>> +
>> +/**
>> + * xe_hw_engine_migrate_logical_mask - Return logical mask for migrate engine.
>> + * @gt: gt instance
>> + *
>> + * On PVC, due to workaround 16017236439, odd instance hardware copy engines
>> + * are faster than even instance ones.
>> + * This function returns the mask involving all faster copy engines and the
>> + * reserved copy engine to be used as logical mask for migrate engine.
>> + * Including the reserved copy engine is required to avoid deadlocks due to
>> + * migrate jobs servicing the faults gets stuck behind the job that faulted.
>> + */
>> +u32 xe_hw_engine_migrate_logical_mask(struct xe_gt *gt)
>> +{
>> +	struct xe_device *xe = gt_to_xe(gt);
>> +	u32 logical_mask = 0;
>> +	struct xe_hw_engine *hwe;
>> +	enum xe_hw_engine_id id;
>> +
>> +	/* Only support this function on PVC for now */
>> +	if (XE_WARN_ON(xe->info.platform != XE_PVC))
>
>Here you warn if the platform isn't PVC...
>
>> +		return 0;
>> +
>> +	for_each_hw_engine(hwe, gt, id) {
>> +		if (hwe->class == XE_ENGINE_CLASS_COPY &&
>> +		    (hwe->instance & 1 || xe_gt_is_usm_hwe(gt, hwe)))
>> +			logical_mask |= BIT(hwe->logical_instance);
>> +	}
>> +
>> +	return logical_mask;
>> +}
>> diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h b/drivers/gpu/drm/xe/xe_hw_engine.h
>> index 71968ee2f600..edee71e26c57 100644
>> --- a/drivers/gpu/drm/xe/xe_hw_engine.h
>> +++ b/drivers/gpu/drm/xe/xe_hw_engine.h
>> @@ -52,6 +52,7 @@ void xe_hw_engine_handle_irq(struct xe_hw_engine *hwe, u16 intr_vec);
>>  void xe_hw_engine_enable_ring(struct xe_hw_engine *hwe);
>>  u32 xe_hw_engine_mask_per_class(struct xe_gt *gt,
>>  				enum xe_engine_class engine_class);
>> +u32 xe_hw_engine_migrate_logical_mask(struct xe_gt *gt);
>>
>>  struct xe_hw_engine_snapshot *
>>  xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe);
>> diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
>> index 956a96b38346..cb9a82d13089 100644
>> --- a/drivers/gpu/drm/xe/xe_migrate.c
>> +++ b/drivers/gpu/drm/xe/xe_migrate.c
>> @@ -338,11 +338,12 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
>>  							   XE_ENGINE_CLASS_COPY,
>>  							   primary_gt->usm.reserved_bcs_instance,
>>  							   false);
>> -		if (!hwe)
>> +		u32 logical_mask = xe_hw_engine_migrate_logical_mask(primary_gt);
>
>...but here you call the function unconditionally on all platforms...
>
>> +
>> +		if (!hwe || !logical_mask)
>>  			return ERR_PTR(-EINVAL);
>>
>
>...and here we fail out if logical mask isn't set (which is will never
>be on non-PVC).
>

This patch was derived from the below earlier work.
https://patchwork.freedesktop.org/patch/528460/?series=115575&rev=2

It was discussed there.

We should be fine here as currently only PVC Platforms call this
xe_hw_engine_migrate_logical_mask() function. The reason for having a
warning there is to detect any issue while supporting any future platforms
so that we fix that immediately.

>I think you probably want an OOB workaround for some of this logic (see
>xe_wa_oob.rules); then you can do something like
>
>        if (XE_WA(gt, 16017236439))
>
>to guard any code that should only be executed on platforms where this
>workaround applies.
>

Hmm...the previous patch in the series adds this WA (16017236439) as a
engine_was (to be moved to lrc_was). I would like to keep that as that
does the register update nicely (instead of adding additional code for
that).

If at all we need, we need to check here whether that engine/lrc WA is
active. But as you mentioned, XE_WA() only works for OOB WAs.
We could add a duplicate OOB WA here (with same number 16017236439).
But not sure if that is a good idea.
Or, we can add a function to check if a engine/lrc WA is active by
doing the name comparision (instead of XE_WA()). Third option is to
support generating WA index numbering for engine/lrc WAs also (similar
to OOB), so that we can use some macro similar to XE_WA().

What do you suggest?
I think we can leave the code as is for now as it gets called only
for PVC.

Niranjana

>
>Matt
>
>> -		m->q = xe_exec_queue_create(xe, vm,
>> -					    BIT(hwe->logical_instance), 1,
>> +		m->q = xe_exec_queue_create(xe, vm, logical_mask, 1,
>>  					    hwe, EXEC_QUEUE_FLAG_KERNEL);
>>  	} else {
>>  		m->q = xe_exec_queue_create_class(xe, primary_gt, vm,
>> --
>> 2.21.0.rc0.32.g243a4c7e27
>>
>
>-- 
>Matt Roper
>Graphics Software Engineer
>Linux GPU Platform Enablement
>Intel Corporation