[RFC 08/20] drm/xe: Runtime PM wake on every exec

Matthew Auld matthew.auld at intel.com
Tue Jan 9 11:24:34 UTC 2024


On 28/12/2023 02:12, Rodrigo Vivi wrote:
> Let's ensure our PCI device is awake for every GT execution, until
> the end of the execution.
> Let's increase the runtime_pm protection and start moving
> that to the outer bounds.
> 
> Let's also remove the unnecessary mem_access get/put.
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> ---
>   drivers/gpu/drm/xe/xe_sched_job.c | 10 +++++-----
>   1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_sched_job.c b/drivers/gpu/drm/xe/xe_sched_job.c
> index 01106a1156ad8..0b30ec77fc5ad 100644
> --- a/drivers/gpu/drm/xe/xe_sched_job.c
> +++ b/drivers/gpu/drm/xe/xe_sched_job.c
> @@ -15,6 +15,7 @@
>   #include "xe_hw_fence.h"
>   #include "xe_lrc.h"
>   #include "xe_macros.h"
> +#include "xe_pm.h"
>   #include "xe_trace.h"
>   #include "xe_vm.h"
>   
> @@ -67,6 +68,8 @@ static void job_free(struct xe_sched_job *job)
>   	struct xe_exec_queue *q = job->q;
>   	bool is_migration = xe_sched_job_is_migration(q);
>   
> +	xe_pm_runtime_put(gt_to_xe(q->gt));
> +
>   	kmem_cache_free(xe_exec_queue_is_parallel(job->q) || is_migration ?
>   			xe_sched_job_parallel_slab : xe_sched_job_slab, job);
>   }
> @@ -86,6 +89,8 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
>   	int i, j;
>   	u32 width;
>   
> +	xe_pm_runtime_get(gt_to_xe(q->gt));
> +

This seems way too deep in the call chain. If this actually wakes up the 
device we will end up with all of the same d3cold deadlock issues. For 
example, here we are almost certainly holding locks like dma-resv, which 
the rpm suspend/resume callbacks also want to grab. IMO this needs to be 
something like runtime_get_if_active(), with the upper layers (like the 
exec ioctl) already ensuring the device is awake, so here we are just 
keeping it awake until the job is done. Or maybe that is how it ends up 
by the end of the series?
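
To sketch what I mean (just an illustration; the xe_pm_runtime_get_if_active()
name and the -EAGAIN return are my assumptions, not something taken from
this series):

	struct xe_device *xe = gt_to_xe(q->gt);

	/*
	 * Sketch only: assumes a helper that takes a runtime PM reference
	 * without ever resuming the device, e.g. a thin wrapper around
	 * pm_runtime_get_if_active(). The outer entry point (the exec
	 * ioctl, say) is expected to have already woken the device and to
	 * hold its own reference for as long as we can be called, so
	 * failing here would point at a caller bug.
	 */
	if (!xe_pm_runtime_get_if_active(xe))
		return ERR_PTR(-EAGAIN);

job_free() would then just drop that reference with xe_pm_runtime_put(xe),
same as in this patch, so the device stays awake until the job is done
without ever being resumed from this depth (where we might be holding
dma-resv that the resume callbacks also need).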

>   	/* only a kernel context can submit a vm-less job */
>   	XE_WARN_ON(!q->vm && !(q->flags & EXEC_QUEUE_FLAG_KERNEL));
>   
> @@ -155,9 +160,6 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
>   	for (i = 0; i < width; ++i)
>   		job->batch_addr[i] = batch_addr[i];
>   
> -	/* All other jobs require a VM to be open which has a ref */
> -	if (unlikely(q->flags & EXEC_QUEUE_FLAG_KERNEL))
> -		xe_device_mem_access_get(job_to_xe(job));
>   	xe_device_assert_mem_access(job_to_xe(job));
>   
>   	trace_xe_sched_job_create(job);
> @@ -189,8 +191,6 @@ void xe_sched_job_destroy(struct kref *ref)
>   	struct xe_sched_job *job =
>   		container_of(ref, struct xe_sched_job, refcount);
>   
> -	if (unlikely(job->q->flags & EXEC_QUEUE_FLAG_KERNEL))
> -		xe_device_mem_access_put(job_to_xe(job));
>   	xe_exec_queue_put(job->q);
>   	dma_fence_put(job->fence);
>   	drm_sched_job_cleanup(&job->drm);

