[PATCH v5 6/8] drm/i915/gvt: add basic function for weight control
Tian, Kevin
kevin.tian at intel.com
Wed Mar 29 03:26:00 UTC 2017
> From: Ping Gao
> Sent: Tuesday, March 28, 2017 2:48 PM
>
> This method tries to guarantee precision in second level, with the adjustment
> conducted in every 100ms. At the end of each vGPU switch calculate the
> sched time and subtract it from the time slice allocated; the dedicate time
What's the meaning of "dedicate time"? Does it just mean "allocated time"?
> slice for every 100ms together with remaining timeslice, will be used to
> decide how much timeslice allocated to this vGPU in the next 100ms slice,
> with the end goal to guarantee weight ratio in second level.
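btw, a small worked example of the intended math as I understand it
(the numbers are mine, just an illustration, not from the patch):

/*
 * e.g. GVT_TS_BALANCE_PERIOD_MS = 100 and two vGPUs with weight 4 and 2:
 * total_weight = 6, so in each 100ms stage
 *     vGPU A: 100ms * 4 / 6 ~= 66.7ms
 *     vGPU B: 100ms * 2 / 6 ~= 33.3ms
 * Unused/overdrawn time stays in left_ts and is carried into the next
 * stage, so the 2:1 ratio is enforced over the full second rather than
 * within a single 100ms slice.
 */

Is that the intent? If so, please state it this clearly in the commit message.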
>
> Signed-off-by: Ping Gao <ping.a.gao at intel.com>
> ---
> drivers/gpu/drm/i915/gvt/sched_policy.c | 75 +++++++++++++++++++++++++++++++++
> 1 file changed, 75 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gvt/sched_policy.c b/drivers/gpu/drm/i915/gvt/sched_policy.c
> index f6723f6..c7fd8b8 100644
> --- a/drivers/gpu/drm/i915/gvt/sched_policy.c
> +++ b/drivers/gpu/drm/i915/gvt/sched_policy.c
> @@ -69,6 +69,75 @@ struct gvt_sched_data {
> struct list_head lru_vgpu_head;
> };
>
> +/* Calc the sched_time during vGPU switch, subtract it
> + * from the time slice allocated correspondingly.
It's not a good idea for the comment to describe when this function is
invoked; that's up to the caller. Also the logic is obvious, so dropping
the comment entirely is fine.
> + */
> +static void stat_timeslice_usage(struct intel_vgpu *pre_vgpu)
> +{
vgpu_update_timeslice would be a better name.
> + ktime_t delta_ts;
> + struct vgpu_sched_data *vgpu_data = pre_vgpu->sched_data;
> +
> + delta_ts = vgpu_data->sched_out_time - vgpu_data->sched_in_time;
> +
> + vgpu_data->sched_time += delta_ts;
> + vgpu_data->left_ts -= delta_ts;
> +}
> +
> +#define GVT_TS_BALANCE_PERIOD_MS 100
> +
> +/* This function executed every 100ms, to alloc time slice
> + * for next 100ms.
> + */
This comment isn't useful. It's the caller that decides how often this
function is invoked. What you should describe is the algorithm implemented
here, e.g. the stage-based approach, etc.
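For example, the comment could sketch the algorithm itself (wording below
is just a suggestion):

/*
 * Stage-based timeslice balancing: each invocation is one stage. In the
 * first stage of a period the time slices are (re)allocated in proportion
 * to the vGPU weights; in the remaining stages the per-stage allocation
 * is added on top of the left-over (or overdrawn) time from earlier
 * stages, so the weight ratio is enforced across the whole period rather
 * than within a single stage.
 */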
> +static void gvt_timeslice_balance(struct gvt_sched_data *sched_data)
> +{
Likewise, gvt_balance_timeslice would be a better name.
> + struct vgpu_sched_data *vgpu_data;
> + struct list_head *pos;
> + static uint64_t stage_check;
> + int stage = stage_check++ % 10;
Make 10 a macro.
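e.g. (macro name below is just a placeholder):

#define GVT_TS_PERIOD_STAGES	10

	int stage = stage_check++ % GVT_TS_PERIOD_STAGES;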
> +
> + if (stage == 0) {
Add a comment to describe what 'stage == 0' means and what is done here.
> + int total_weight = 0;
> + ktime_t fair_timeslice;
> +
> + /* Every vgpu should set valid weight at the same time */
Is 'at the same time' accurate? Or do you really mean "every vGPU should
have its weight set when it is created"? If that is the case, should we
assert here in case the assumption doesn't hold?
> + list_for_each(pos, &sched_data->runq_head) {
> + vgpu_data = container_of(pos, struct vgpu_sched_data, list);
> +
> + if (vgpu_data->sched_ctl.weight == 0) {
> + total_weight = 0;
> + break;
> + }
I don't understand. Why give up with a partial state if a vGPU doesn't
have a weight? Shouldn't we report an error here, as mentioned above?
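e.g. something like this (just a sketch, assuming a zero weight can be
treated as a programming error rather than a normal runtime condition):

	list_for_each(pos, &sched_data->runq_head) {
		vgpu_data = container_of(pos, struct vgpu_sched_data, list);

		/* weight is expected to be set at vGPU creation time */
		if (WARN_ON(vgpu_data->sched_ctl.weight == 0))
			continue;

		total_weight += vgpu_data->sched_ctl.weight;
	}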
> + total_weight += vgpu_data->sched_ctl.weight;
Do we prevent a vGPU from being created during this phase? Otherwise
there might be a race condition.
> + }
> +
> + /* The timeslice accumulation will reset every second */
Again, just explain what is done here. Don't assume the frequency, which
might change.
> + list_for_each(pos, &sched_data->runq_head) {
> + vgpu_data = container_of(pos, struct vgpu_sched_data, list);
> + if (total_weight)
> + fair_timeslice = ms_to_ktime(GVT_TS_BALANCE_PERIOD_MS) *
> + vgpu_data->sched_ctl.weight /
> + total_weight;
> + else
> + fair_timeslice = ms_to_ktime(GVT_TS_BALANCE_PERIOD_MS);
Is total_weight == 0 a valid scenario, or an error we should catch early?
> +
> + vgpu_data->allocated_ts = fair_timeslice;
> + vgpu_data->left_ts = vgpu_data->allocated_ts;
> +
What if this vGPU's previous left_ts is non-zero? Is it OK to simply
overwrite it? If it's not critical, at least add a TODO here explaining
that more refinement is required in the future to handle such a corner
case.
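e.g.:

	/*
	 * TODO: left_ts carried over from the previous period is silently
	 * dropped here; revisit whether it should be folded into the new
	 * allocation instead.
	 */
	vgpu_data->allocated_ts = fair_timeslice;
	vgpu_data->left_ts = vgpu_data->allocated_ts;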
> + /* sched_in_time need reset every second also */
> + vgpu_data->sched_in_time = ktime_get();
This doesn't look right. sched_in_time is updated when the vGPU is
scheduled in. Why should we touch logic that belongs to the
context-switch code?
> + }
> + } else {
> + list_for_each(pos, &sched_data->runq_head) {
> + vgpu_data = container_of(pos, struct vgpu_sched_data, list);
> +
> + /* timeslice for next 100ms should add the left/debt
> + * slice of previous stages.
> + */
> + vgpu_data->left_ts += vgpu_data->allocated_ts;
> + }
> + }
> +}
> +
> static void try_to_schedule_next_vgpu(struct intel_gvt *gvt)
> {
> struct intel_gvt_workload_scheduler *scheduler = &gvt->scheduler;
> @@ -105,6 +174,7 @@ static void try_to_schedule_next_vgpu(struct intel_gvt *gvt)
> if (scheduler->current_vgpu) {
> vgpu_data = scheduler->current_vgpu->sched_data;
> vgpu_data->sched_out_time = cur_time;
> + stat_timeslice_usage(scheduler->current_vgpu);
> }
> vgpu_data = scheduler->next_vgpu->sched_data;
> vgpu_data->sched_in_time = cur_time;
> @@ -242,6 +312,11 @@ static void tbs_sched_func(struct gvt_sched_data *sched_data)
> struct intel_gvt_workload_scheduler *scheduler = &gvt->scheduler;
> struct vgpu_sched_data *vgpu_data;
> struct intel_vgpu *vgpu = NULL;
> + static uint64_t timer_check;
> +
> + if (!(timer_check++ % GVT_TS_BALANCE_PERIOD_MS))
> + gvt_timeslice_balance(sched_data);
> +
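Just to double-check my understanding here (assuming the tbs timer keeps
firing every 1ms):

	/*
	 * timer_check wraps every GVT_TS_BALANCE_PERIOD_MS ticks, so the
	 * balancing runs once per 100ms, and gvt_timeslice_balance() itself
	 * re-allocates the slices every 10th call, i.e. once per second.
	 */

Is that right? If so it deserves a comment, since (as noted above)
changing the timer period would silently change both intervals.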
>
> /* no active vgpu or has already had a target */
> if (list_empty(&sched_data->runq_head) || scheduler->next_vgpu)
> --
> 2.7.4
>