[RFC v1 4/9] drm/xe/hw_engine_group: Add helper to suspend LR jobs

Matthew Brost matthew.brost at intel.com
Wed Jul 17 19:49:13 UTC 2024


On Wed, Jul 17, 2024 at 03:07:25PM +0200, Francois Dugast wrote:
> This is a required feature for dma fence jobs to preempt long running
> jobs in order to ensure mutual exclusion on a given hw engine group.
> 
> Signed-off-by: Francois Dugast <francois.dugast at intel.com>
> ---
>  drivers/gpu/drm/xe/xe_hw_engine.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
> index 4dcc885a55c8..850f7b15b154 100644
> --- a/drivers/gpu/drm/xe/xe_hw_engine.c
> +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
> @@ -1223,3 +1223,31 @@ int xe_hw_engine_group_del_exec_queue(struct xe_hw_engine_group *group, struct x
>  
>  	return 0;
>  }
> +
> +/**
> + * xe_hw_engine_group_suspend_lr_jobs() - Suspend the long running jobs of this hw engine group
> + * @group: The hw engine group
> + *
> + * Return: 0 on success,
> + *	   -EINVAL if one jobs could not be suspended
> + */
> +static int xe_hw_engine_group_suspend_lr_jobs(struct xe_hw_engine_group *group)
> +{
> +	int err;
> +	struct xe_exec_queue *q;
> +
> +	lockdep_assert_held(&group->mode_sem);
> +
> +	list_for_each_entry(q, &group->exec_queue_list, hw_engine_group_link) {
> +		if (!xe_vm_in_lr_mode(q->vm))
> +			continue;
> +
> +		err = q->ops->suspend(q);
> +		if (err)
> +			return err;

Hmm, this error handling might not be correct. I think it is OK for a
kill / wedge / reset to race here, and we'd still want to continue this
loop. Perhaps just add a drm_warn message for now and continue the loop.

Also same deal if suspend_wait() fails.

> +
> +		q->ops->suspend_wait(q);

Pipeline these, as suspend() takes a non-zero amount of time and the GuC
does the suspend async AFAIK (e.g. it issues a suspend, then moves on to
something else, so multiple suspends can be running in parallel).

So...

list_for_each_entry()
	q->ops->suspend(q);

list_for_each_entry()
	q->ops->suspend_wait(q);
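
To make both suggestions concrete (warn and continue on failure, and issue
every suspend before waiting on any), here is a rough user-space sketch. It
is not driver code: all mock_* types and functions are hypothetical
stand-ins for the xe structures, fprintf() stands in for drm_warn(), the
assert() stands in for lockdep_assert_held(), and having suspend_wait()
return an int is an assumption. Returning the warning count is only there
to make the behaviour observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the xe types; NOT the real driver structures. */
struct mock_queue;

struct mock_queue_ops {
	int (*suspend)(struct mock_queue *q);      /* kicks off an async suspend */
	int (*suspend_wait)(struct mock_queue *q); /* waits for it to complete */
};

struct mock_queue {
	const struct mock_queue_ops *ops;
	int lr_mode;        /* stands in for xe_vm_in_lr_mode(q->vm) */
	int fail_suspend;   /* simulates a kill / wedge / reset racing */
	int suspend_issued;
	int suspended;
};

struct mock_group {
	struct mock_queue *queues;
	size_t count;
	int mode_sem_held;  /* stands in for holding group->mode_sem */
};

static int mock_suspend(struct mock_queue *q)
{
	if (q->fail_suspend)
		return -1;
	q->suspend_issued = 1;
	return 0;
}

static int mock_suspend_wait(struct mock_queue *q)
{
	/* Nothing to wait for if the suspend was never issued. */
	if (!q->suspend_issued)
		return -1;
	q->suspended = 1;
	return 0;
}

const struct mock_queue_ops mock_ops = {
	.suspend = mock_suspend,
	.suspend_wait = mock_suspend_wait,
};

/* Returns the number of warnings emitted (for illustration only). */
int mock_group_suspend_lr_jobs(struct mock_group *group)
{
	int warnings = 0;
	size_t i;

	assert(group->mode_sem_held);

	/* Pass 1: issue all suspends so they can progress in parallel. */
	for (i = 0; i < group->count; i++) {
		struct mock_queue *q = &group->queues[i];

		if (!q->lr_mode)
			continue;

		if (q->ops->suspend(q)) {
			fprintf(stderr, "warn: suspend failed on queue %zu\n", i);
			warnings++; /* warn, but keep going */
		}
	}

	/* Pass 2: wait for each suspend to complete. */
	for (i = 0; i < group->count; i++) {
		struct mock_queue *q = &group->queues[i];

		if (!q->lr_mode)
			continue;

		if (q->ops->suspend_wait(q)) {
			fprintf(stderr, "warn: suspend_wait failed on queue %zu\n", i);
			warnings++;
		}
	}

	return warnings;
}
```

With this shape, one queue failing to suspend no longer prevents the
remaining LR queues in the group from being suspended.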

Matt

> +	}
> +
> +	return 0;
> +}
> -- 
> 2.43.0
> 

