[RFC PATCH 2/4] drm/scheduler: implement a function to modify sched list

Fri Feb 28 05:08:02 UTC 2020

On 2020-02-27 4:40 p.m., Nirmoy Das wrote:
> implement drm_sched_entity_modify_sched() which can modify existing
> sched_list with a different one. This is going to be helpful when
> userspace changes priority of a ctx/entity then driver can switch to
> corresponding hw shced list for that priority
> 
> Signed-off-by: Nirmoy Das <nirmoy.das at amd.com>
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c | 24 ++++++++++++++++++++++++
>  include/drm/gpu_scheduler.h              |  4 ++++
>  2 files changed, 28 insertions(+)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 63bccd201b97..711e9d504bcb 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -83,6 +83,30 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>  }
>  EXPORT_SYMBOL(drm_sched_entity_init);
>  
> +/**
> + * drm_sched_entity_modify_sched - Modify sched of an entity
> + *
> + * @entity: scheduler entity to init
> + * @sched_list: the list of new drm scheds which will replace
> + *		existing entity->sched_list
> + * @num_sched_list: number of drm sched in sched_list
> + *
> + * Returns 0 on success or a negative error code on failure.
> + */
> +int drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> +				  struct drm_gpu_scheduler **sched_list,
> +				  unsigned int num_sched_list)
> +{
> +	if (!(entity && sched_list && (num_sched_list == 0 || sched_list[0])))
> +		return -EINVAL;

This seems unmaintainable. I'd write it in its natural form:

int drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
				  struct drm_gpu_scheduler **sched_list,
				  unsigned int num_sched_list)
{
	if (entity && sched_list && (num_sched_list == 0 || sched_list[0] != NULL)) {
		entity->sched_list = num_sched_list > 1 ? sched_list : NULL;
		entity->num_sched_list = num_sched_list;
		return 0;
	} else {
		return -EINVAL;
	}
}

That's too heavy. Can we improve the architecture
so we don't have to check for this in leaf functions like this one?
We can just return a parameterization.

Why would this be called with entity being NULL?
Or with sched_list being NULL? Or num_sched_list being not zero
yet sched_list[0] being NULL? Why not make sure that sched_list[0] is
never NULL and that num_sched_list is greater than 0 always?

Does this make it to user space?
Why would the state of execution be one such that this is true/false
for the code to return -EINVAL?
>From patch 3/4 it seems that an error is printed inside amdgpu_ctx_priority_override()
and the result is not propagated up the stack.

I think we should improve the code where here this condition above
is never true. Then we can use parameterization for those two
statements below:

> +
> +	entity->sched_list = num_sched_list > 1 ? sched_list : NULL;

So if we're here, we know from the top conditional that
either num_sched_list is 0 or that sched_list[0] not NULL
or both.

So if num_sched_list is 0 or 1 we return NULL?
And if num_sched_list is 2 or greater we return sched_list
of which sched_list[0] could be NULL?

Why not fix the architecture so that this is simply copied? (in which case
we can simply index-parameterize it and simply copy it.
Why are there so many checks everywhere?

> +	entity->num_sched_list = num_sched_list;
> +

I mean, all we're doing in this function is initializing
entity->sched_list and entity->num_sched_list. Why does this
function have to be so complex and do so many checks?
Can we improve/fix the architecture instead?

Regards,
Luben

> +	return 0;
> +}
> +EXPORT_SYMBOL(drm_sched_entity_modify_sched);
> +
>  /**
>   * drm_sched_entity_is_idle - Check if entity is idle
>   *
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 589be851f8a1..0c164a96d51b 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -297,6 +297,10 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
>  int drm_sched_job_init(struct drm_sched_job *job,
>  		       struct drm_sched_entity *entity,
>  		       void *owner);
> +int drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> +				  struct drm_gpu_scheduler **sched_list,
> +                                  unsigned int num_sched_list);
> +
>  void drm_sched_job_cleanup(struct drm_sched_job *job);
>  void drm_sched_wakeup(struct drm_gpu_scheduler *sched);
>  void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad);
>