[RFC 08/20] drm/xe: Runtime PM wake on every exec

Matthew Auld matthew.auld at intel.com
Tue Jan 9 18:40:26 UTC 2024


On 09/01/2024 17:41, Rodrigo Vivi wrote:
> On Tue, Jan 09, 2024 at 11:24:34AM +0000, Matthew Auld wrote:
>> On 28/12/2023 02:12, Rodrigo Vivi wrote:
>>> Let's ensure our PCI device is awake on every GT execution, through
>>> to the end of the execution.
>>> Let's increase the runtime_pm protection and start moving
>>> it to the outer bounds.
>>>
>>> Let's also remove the unnecessary mem_access get/put.
>>>
>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
>>> ---
>>>    drivers/gpu/drm/xe/xe_sched_job.c | 10 +++++-----
>>>    1 file changed, 5 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_sched_job.c b/drivers/gpu/drm/xe/xe_sched_job.c
>>> index 01106a1156ad8..0b30ec77fc5ad 100644
>>> --- a/drivers/gpu/drm/xe/xe_sched_job.c
>>> +++ b/drivers/gpu/drm/xe/xe_sched_job.c
>>> @@ -15,6 +15,7 @@
>>>    #include "xe_hw_fence.h"
>>>    #include "xe_lrc.h"
>>>    #include "xe_macros.h"
>>> +#include "xe_pm.h"
>>>    #include "xe_trace.h"
>>>    #include "xe_vm.h"
>>> @@ -67,6 +68,8 @@ static void job_free(struct xe_sched_job *job)
>>>    	struct xe_exec_queue *q = job->q;
>>>    	bool is_migration = xe_sched_job_is_migration(q);
>>> +	xe_pm_runtime_put(gt_to_xe(q->gt));
>>> +
>>>    	kmem_cache_free(xe_exec_queue_is_parallel(job->q) || is_migration ?
>>>    			xe_sched_job_parallel_slab : xe_sched_job_slab, job);
>>>    }
>>> @@ -86,6 +89,8 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
>>>    	int i, j;
>>>    	u32 width;
>>> +	xe_pm_runtime_get(gt_to_xe(q->gt));
>>> +
>>
>> This seems way too deep in the call chain. If this actually wakes up the
>> device we will end up with all of the same d3cold deadlock issues. Here,
>> for instance, we are certainly holding stuff like dma-resv, but the rpm
>> callbacks also want to grab it. IMO this needs to be something like
>> runtime_get_if_active(), with the upper layers (like the ioctl) already
>> ensuring the device is awake, so here we are just keeping it awake until
>> the job is done. Or maybe this is how it is by the end of the series?
> 
> we have 2 cases here: one where the device is already awake because of
> the ioctl, and the other on the eviction preparation, which exits because
> of the 'current' task. So we should be good either way, but you are
> right, maybe using get_if_active is better here for clarity.

Yeah, looking at the code it is hard to know whether a given rpm get can 
actually wake up the device, or whether waking up only happens in some 
tricky edge case or 1/1000 runs. That is also what is nice about the 
lockdep annotations: we just assume that every rpm get call can 
potentially wake the device up, and lockdep knows exactly what locks are 
being held. Then, so long as we have annotations in the rpm 
resume/suspend callbacks, we can be reasonably sure we have no 
deadlocks, since it becomes a question of code coverage, rather than 
actually needing to hit the 0 -> 1 transition for every caller on a real 
system in CI, which is kind of hard.

> 
>>
>>>    	/* only a kernel context can submit a vm-less job */
>>>    	XE_WARN_ON(!q->vm && !(q->flags & EXEC_QUEUE_FLAG_KERNEL));
>>> @@ -155,9 +160,6 @@ struct xe_sched_job *xe_sched_job_create(struct xe_exec_queue *q,
>>>    	for (i = 0; i < width; ++i)
>>>    		job->batch_addr[i] = batch_addr[i];
>>> -	/* All other jobs require a VM to be open which has a ref */
>>> -	if (unlikely(q->flags & EXEC_QUEUE_FLAG_KERNEL))
>>> -		xe_device_mem_access_get(job_to_xe(job));
>>>    	xe_device_assert_mem_access(job_to_xe(job));
>>>    	trace_xe_sched_job_create(job);
>>> @@ -189,8 +191,6 @@ void xe_sched_job_destroy(struct kref *ref)
>>>    	struct xe_sched_job *job =
>>>    		container_of(ref, struct xe_sched_job, refcount);
>>> -	if (unlikely(job->q->flags & EXEC_QUEUE_FLAG_KERNEL))
>>> -		xe_device_mem_access_put(job_to_xe(job));
>>>    	xe_exec_queue_put(job->q);
>>>    	dma_fence_put(job->fence);
>>>    	drm_sched_job_cleanup(&job->drm);

