[PATCH 2/2] drm/xe: Record utilization before destroying the exec queue

Lucas De Marchi lucas.demarchi at intel.com
Thu Jun 27 20:58:20 UTC 2024


On Thu, Jun 27, 2024 at 06:01:50AM GMT, Matthew Brost wrote:
>On Tue, Jun 25, 2024 at 08:21:35PM -0700, Umesh Nerlige Ramappa wrote:
>> On Tue, Jun 25, 2024 at 05:41:01PM +0000, Matthew Brost wrote:
>> > On Tue, Jun 25, 2024 at 05:24:02PM +0000, Matthew Brost wrote:
>> > > On Wed, Jun 26, 2024 at 12:58:12AM +0800, Umesh Nerlige Ramappa wrote:
>> > > > Current code captures utilization at the exec queue level whenever a job
>> > > > is completed and then accumulates it into the xe file stats whenever the
>> > > > user queries for per client engine utilization. There is a case where
>> > > > utilization may be lost if the exec queue is destroyed before the user
>> > > > queries the utilization. To overcome that, record the utilization when
>> > > > the exec queue is destroyed.
>> > > >
>> > > > To do so
>> > > >
>> > > > 1) Wait for release of all other references to the exec queue. The wait
>> > > >    uses the same timeout as the job scheduling timeout. On timeout, only
>> > > >    a debug message is printed out since this is just a best effort to
>> > > >    capture the utilization prior to destroying the queue.
>> > > > 2) Before releasing the last reference in xe_exec_queue_destroy_ioctl(),
>> > > >    record the utilization in the xe file stats.
>> > > >
>> > > > Fixes: ce62827bc294 ("drm/xe: Do not access xe file when updating exec queue run_ticks")
>> > > > Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
>> > > > ---
>> > > >  drivers/gpu/drm/xe/xe_exec_queue.c       | 11 +++++++++++
>> > > >  drivers/gpu/drm/xe/xe_exec_queue_types.h |  2 ++
>> > > >  2 files changed, 13 insertions(+)
>> > > >
>> > > > diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
>> > > > index 4d90a16745d2..f1028eaf2d7f 100644
>> > > > --- a/drivers/gpu/drm/xe/xe_exec_queue.c
>> > > > +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
>> > > > @@ -69,6 +69,7 @@ static struct xe_exec_queue *__xe_exec_queue_alloc(struct xe_device *xe,
>> > > >  	q->ops = gt->exec_queue_ops;
>> > > >  	INIT_LIST_HEAD(&q->lr.link);
>> > > >  	INIT_LIST_HEAD(&q->multi_gt_link);
>> > > > +	init_waitqueue_head(&q->wq);
>> > > >
>> > > >  	q->sched_props.timeslice_us = hwe->eclass->sched_props.timeslice_us;
>> > > >  	q->sched_props.preempt_timeout_us =
>> > > > @@ -825,6 +826,7 @@ int xe_exec_queue_destroy_ioctl(struct drm_device *dev, void *data,
>> > > >  	struct xe_file *xef = to_xe_file(file);
>> > > >  	struct drm_xe_exec_queue_destroy *args = data;
>> > > >  	struct xe_exec_queue *q;
>> > > > +	int ret;
>> > > >
>> > > >  	if (XE_IOCTL_DBG(xe, args->pad) ||
>> > > >  	    XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]))
>> > > > @@ -838,6 +840,15 @@ int xe_exec_queue_destroy_ioctl(struct drm_device *dev, void *data,
>> > > >
>> > > >  	xe_exec_queue_kill(q);
>> > > >
>> > > > +	ret = wait_event_timeout(q->wq, kref_read(&q->refcount) == 1,
>> > > > +			(q->sched_props.job_timeout_ms/1000) * HZ);
>> > > > +	if (!ret)
>> > > > +		drm_dbg(&xe->drm, "Timedout waiting for exec queue run ticks update\n");
>> > > > +
>> > >
>> > > I can't say I love adding a wait to the destroy IOCTL just to accurately
>> > > report some stats. If we can't preempt this exec queue, the kill flow
>> > > can take nearly 0.7s, which would be pretty bad to block on in the IOCTL.
>> >
>> > Oops, my timing calculation is wrong here as I forgot that we reduce
>> > the preemption timeout to 1ms upon kill. But my point still stands:
>> > from an arch perspective, blocking in an IOCTL to collect stats doesn't
>> > seem right to me. Maybe others have a different opinion?
>> >
>> > > Even the common case isn't great either...
>> > >
>> > > Beyond that, this code breaks if the assumption of
>> > > kref_read(&q->refcount) == 1 meaning all jobs are done. e.g. The wedging
>> > > code == 2 takes an extra ref to the exec queue, so this IOCTL will hang
>> > > forever even we wedge the device.
>> >
>> > s/even we/if we/
>> >
>> > Matt
>> >
>> > >
>> > > Could this be added to exec queue backend code which adjusts on the
>> > > final destroy? e.g. Add it to __guc_exec_queue_fini_async via
>> > > xe_exec_queue_fini? Or does that not work because we don't have the xef?
>>
>> That's how it was earlier, but the problem we ran into was that
>> xe_file_close would schedule work to kill the queue and in parallel go ahead
>> and close the xef object. By the time the queue backend wanted to update
>> xef, xef was already freed -
>> https://gitlab.freedesktop.org/drm/xe/kernel/issues/1908
>>
>> Maybe if the queue backend somehow knows that the relevant xef is gone, it
>> could handle the above bug, but we do not have a q->xef. It looks like that
>> was intentionally removed a while ago. We were using q->vm->xef in the code
>> that caused the above bug, but that's a hacky way to relate the queue to its
>> xef.
>>
>> To resolve the bug above, I had added code to localize the stats to the
>> queue and transfer those stats to xef only in calls where we knew that the
>> xef was valid - like when the user queries the stats. The only open issue
>> was that when the queue is destroyed, we would lose the stats stored in the
>> queue if we don't transfer them. That's what is being addressed here.
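
For reference, a rough sketch of that queue-local accumulation could look like
the following (the sketch_* types and helpers are illustrative only, not the
actual driver code):

/*
 * Deltas are accumulated on the exec queue itself and only folded into
 * the per-file stats from paths where xef is known to be alive, e.g.
 * the fdinfo query.
 */
#include <linux/mutex.h>
#include <linux/types.h>

#define SKETCH_NUM_CLASSES 5

struct sketch_exec_queue {
        int class;
        u64 old_run_ticks;      /* portion already folded into the file stats */
        u64 run_ticks;          /* updated by the backend as jobs complete */
};

struct sketch_file {
        struct mutex lock;
        u64 run_ticks[SKETCH_NUM_CLASSES];
};

/* Return the ticks accumulated since the last transfer and mark them consumed. */
static u64 sketch_queue_delta_run_ticks(struct sketch_exec_queue *q)
{
        u64 delta = q->run_ticks - q->old_run_ticks;

        q->old_run_ticks = q->run_ticks;
        return delta;
}

/* Only call this from paths where xef is guaranteed valid. */
static void sketch_accumulate_run_ticks(struct sketch_file *xef,
                                        struct sketch_exec_queue *q)
{
        mutex_lock(&xef->lock);
        xef->run_ticks[q->class] += sketch_queue_delta_run_ticks(q);
        mutex_unlock(&xef->lock);
}

The question this patch wrestles with is who performs that transfer one last
time before the queue goes away.
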
>>
>> > >
>> > > Regardless of whether my suggestion works or not, surely we can do
>> > > something to avoid waiting in this IOCTL. I suggest exploring another
>> > > solution.
>>
>> Lucas had suggested using a ref to the xef object, but I think that requires
>> bringing back the q->xef link. If you are okay with that, then I can add
>> some code to make xef ref counted and solve the original issue -
>> https://gitlab.freedesktop.org/drm/xe/kernel/issues/1908. I can then drop
>> this series.
>>
>
>That seems to be a sane choice; in general, I think we get into trouble
>when we don't ref count objects.
>
>So the ref counting scheme would roughly be:
>
>- init ref count to xef in xe_file_open
>- drop ref in xe_file_close
>- exec queue takes ref to xef upon creation (user IOCTL created exec queues only)
>- exec queue drops ref to xef upon destroy (again user IOCTL created exec queues only)
>
>Seems like a good design to me.

We will also need to take a ref on objects where we store the xef (e.g.
xe_vm_create_ioctl()) and put it when destroying the vm.

Ack, but the check for "user IOCTL created exec queues" is implicit, as
otherwise q->xef would be NULL... because such a queue is not coming from
any xef.
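
A minimal sketch of how that ref counting could be wired up, assuming a kref
embedded in the file object and using hypothetical sketch_* names (the real
patch will of course differ):

#include <linux/container_of.h>
#include <linux/kref.h>
#include <linux/slab.h>

struct sketch_xe_file {
        struct kref refcount;
        /* per-class run_ticks accumulators, locks, etc. */
};

static void sketch_xe_file_destroy(struct kref *ref)
{
        struct sketch_xe_file *xef =
                container_of(ref, struct sketch_xe_file, refcount);

        kfree(xef);
}

/* xe_file_open(): the fd owns the initial reference. */
static struct sketch_xe_file *sketch_xe_file_open(void)
{
        struct sketch_xe_file *xef = kzalloc(sizeof(*xef), GFP_KERNEL);

        if (xef)
                kref_init(&xef->refcount);
        return xef;
}

/* Exec queue / VM creation ioctls: take a reference and store the pointer. */
static struct sketch_xe_file *sketch_xe_file_get(struct sketch_xe_file *xef)
{
        kref_get(&xef->refcount);
        return xef;
}

/* Exec queue / VM teardown and xe_file_close(): drop the reference. */
static void sketch_xe_file_put(struct sketch_xe_file *xef)
{
        kref_put(&xef->refcount, sketch_xe_file_destroy);
}

With a reference held by every exec queue and VM, the backend can keep
dereferencing q->xef until it drops its own reference, so the run_ticks
transfer can move back to the final queue teardown without racing
xe_file_close().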

Lucas De Marchi

>
>Matt
>
>> Thanks,
>> Umesh
>>
>> > >
>> > > Matt
>> > >
>> > > > +	mutex_lock(&xef->exec_queue.lock);
>> > > > +	xef->run_ticks[q->class] += xe_exec_queue_delta_run_ticks(q);
>> > > > +	mutex_unlock(&xef->exec_queue.lock);
>> > > > +
>> > > >  	trace_xe_exec_queue_close(q);
>> > > >  	xe_exec_queue_put(q);
>> > > >
>> > > > diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h b/drivers/gpu/drm/xe/xe_exec_queue_types.h
>> > > > index 201588ec33c3..2ae4221d2f61 100644
>> > > > --- a/drivers/gpu/drm/xe/xe_exec_queue_types.h
>> > > > +++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h
>> > > > @@ -143,6 +143,8 @@ struct xe_exec_queue {
>> > > >  	u64 old_run_ticks;
>> > > >  	/** @run_ticks: hw engine class run time in ticks for this exec queue */
>> > > >  	u64 run_ticks;
>> > > > +	/** @wq: wait queue to wait for cleanup */
>> > > > +	wait_queue_head_t wq;
>> > > >  	/** @lrc: logical ring context for this exec queue */
>> > > >  	struct xe_lrc *lrc[];
>> > > >  };
>> > > > --
>> > > > 2.34.1
>> > > >

