[RFC 08/20] drm/xe: Runtime PM wake on every exec

Rodrigo Vivi rodrigo.vivi at intel.com
Tue Jan 9 17:41:53 UTC 2024


On Tue, Jan 09, 2024 at 11:24:34AM +0000, Matthew Auld wrote:
> On 28/12/2023 02:12, Rodrigo Vivi wrote:
> > Let's ensure our PCI device is awake for the whole duration of
> > every GT execution.
> > Let's increase the runtime_pm protection and start moving
> > it to the outer bounds.
> > 
> > Let's also remove the unnecessary mem_access get/put.
> > 
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > ---
> >   drivers/gpu/drm/xe/xe_sched_job.c | 10 +++++-----
> >   1 file changed, 5 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_sched_job.c b/drivers/gpu/drm/xe/xe_sched_job.c
> > index 01106a1156ad8..0b30ec77fc5ad 100644
> > --- a/drivers/gpu/drm/xe/xe_sched_job.c
> > +++ b/drivers/gpu/drm/xe/xe_sched_job.c
> > @@ -15,6 +15,7 @@
> >   #include "xe_hw_fence.h"
> >   #include "xe_lrc.h"
> >   #include "xe_macros.h"
> > +#include "xe_pm.h"
> >   #include "xe_trace.h"
> >   #include "xe_vm.h"
> > @@ -67,6 +68,8 @@ static void job_free(struct xe_sched_job *job)
> >   	struct xe_exec_queue *q = job->q;
> >   	bool is_migration = xe_sched_job_is_migration(q);
> > +	xe_pm_runtime_put(gt_to_xe(q->gt));
> > +
> >   	kmem_cache_free(xe_exec_queue_is_parallel(job->q) || is_migration ?
> >   			xe_sched_job_parallel_slab : xe_sched_job_slab, job);
> >   }
> > @@ -86,6 +89,8 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
> >   	int i, j;
> >   	u32 width;
> > +	xe_pm_runtime_get(gt_to_xe(q->gt));
> > +
> 
> This seems way too deep in the call chain. If this actually wakes up the
> device we will end up with all of the same d3cold deadlock issues. Like here
> we are for sure holding stuff like dma-resv, but the rpm callbacks also want
> to grab it. IMO this needs to be something like runtime_get_if_active(),
> with the upper layers already ensuring device is awake (like ioctl), so here
> we are just keeping it awake until the job is done. Or maybe this is how it
> is by the end of the series?

we have 2 cases here: one where the device is already awake because of the
ioctl, and the other on the eviction preparation path, which exits early
because of the 'current' task. So we should be good either way, but you are
right, using get_if_active is probably better here for clarity.
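To make the distinction concrete, here is a minimal userspace mock of the
runtime-PM refcounting involved. This is NOT the real xe_pm_*/pm_runtime_*
API; the struct, function names, and fields are hypothetical stand-ins. It
only illustrates why a conditional get avoids entering the resume path (and
thus avoids the resume-path locking that causes the d3cold deadlock concern)
when the upper layers have already woken the device:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mock of runtime-PM refcounting, not the real driver API. */
struct mock_dev {
	bool awake;        /* device power state */
	int  usage_count;  /* runtime-PM usage refcount */
};

/* Unconditional get: resumes a suspended device. In the real driver the
 * resume callback may want locks (e.g. dma-resv) that the caller already
 * holds this deep in the call chain, hence the deadlock concern. */
static void mock_pm_get(struct mock_dev *d)
{
	if (!d->awake)
		d->awake = true; /* resume path runs here */
	d->usage_count++;
}

/* Conditional get: only takes a reference if the device is already awake,
 * so the resume path (and its locking) can never be entered from here. */
static bool mock_pm_get_if_active(struct mock_dev *d)
{
	if (!d->awake)
		return false;
	d->usage_count++;
	return true;
}

static void mock_pm_put(struct mock_dev *d)
{
	if (--d->usage_count == 0)
		d->awake = false; /* idle: device may suspend */
}
```

In the job-create/job-free pairing discussed above, the ioctl layer would
hold the reference that keeps the device awake, and the conditional get in
job creation would merely extend that lifetime until job_free.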

> 
> >   	/* only a kernel context can submit a vm-less job */
> >   	XE_WARN_ON(!q->vm && !(q->flags & EXEC_QUEUE_FLAG_KERNEL));
> > @@ -155,9 +160,6 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
> >   	for (i = 0; i < width; ++i)
> >   		job->batch_addr[i] = batch_addr[i];
> > -	/* All other jobs require a VM to be open which has a ref */
> > -	if (unlikely(q->flags & EXEC_QUEUE_FLAG_KERNEL))
> > -		xe_device_mem_access_get(job_to_xe(job));
> >   	xe_device_assert_mem_access(job_to_xe(job));
> >   	trace_xe_sched_job_create(job);
> > @@ -189,8 +191,6 @@ void xe_sched_job_destroy(struct kref *ref)
> >   	struct xe_sched_job *job =
> >   		container_of(ref, struct xe_sched_job, refcount);
> > -	if (unlikely(job->q->flags & EXEC_QUEUE_FLAG_KERNEL))
> > -		xe_device_mem_access_put(job_to_xe(job));
> >   	xe_exec_queue_put(job->q);
> >   	dma_fence_put(job->fence);
> >   	drm_sched_job_cleanup(&job->drm);
