[Intel-xe] [PATCH 2/3] drm/xe: fix xe_device_mem_access_get() race

Matthew Auld matthew.auld at intel.com
Fri May 12 15:56:38 UTC 2023


On 12/05/2023 15:32, Rodrigo Vivi wrote:
> On Thu, May 11, 2023 at 12:43:36PM +0100, Matthew Auld wrote:
>> On 05/05/2023 17:44, Rodrigo Vivi wrote:
>>> On Fri, May 05, 2023 at 04:38:53PM +0100, Matthew Auld wrote:
>>>> It looks like there is at least one race here, given that the
>>>> pm_runtime_suspended() check looks to return false if we are in the
>>>> process of suspending the device (RPM_SUSPENDING vs RPM_SUSPENDED). We
>>>> later also do xe_pm_runtime_get_if_active(), but since the device is
>>>> suspending or has now suspended, this doesn't do anything either.
>>>> Following from this we can potentially return from
>>>> xe_device_mem_access_get() with the device suspended or about to be,
>>>> leading to broken behaviour.
>>>>
>>>> Attempt to fix this by always grabbing the runtime ref when our internal
>>>> ref transitions from 0 -> 1, and then wrap the whole thing with a lock
>>>> to ensure callers are serialized.
>>>>
>>>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/258
>>>> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
>>>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>>>> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
>>>> Cc: Matthew Brost <matthew.brost at intel.com>
>>>> ---
>>>>    drivers/gpu/drm/xe/xe_device.c       | 11 +++--------
>>>>    drivers/gpu/drm/xe/xe_device_types.h |  5 ++++-
>>>>    drivers/gpu/drm/xe/xe_pm.c           |  9 ++-------
>>>>    drivers/gpu/drm/xe/xe_pm.h           |  2 +-
>>>>    4 files changed, 10 insertions(+), 17 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>>>> index 01c497bcf9a5..0a18b41a0e1a 100644
>>>> --- a/drivers/gpu/drm/xe/xe_device.c
>>>> +++ b/drivers/gpu/drm/xe/xe_device.c
>>>> @@ -406,17 +406,12 @@ u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
>>>>    void xe_device_mem_access_get(struct xe_device *xe)
>>>>    {
>>>> -	bool resumed = xe_pm_runtime_resume_if_suspended(xe);
>>>> -
>>>>    	mutex_lock(&xe->mem_access.lock);
>>>> -	if (xe->mem_access.ref++ == 0)
>>>> -		xe->mem_access.hold_rpm = xe_pm_runtime_get_if_active(xe);
>>>> +	if (xe->mem_access.ref == 0)
>>>> +		xe->mem_access.hold_rpm = xe_pm_runtime_resume_and_get(xe);
>>>> +	xe->mem_access.ref++;
>>>
>>> my memory is good enough to tell me that I have tried this before...
>>> but not good enough to remember the issues that I ran into with this
>>> approach :(
>>
>> Ok, it seems one big issue is around xe_pm_runtime_resume() et al. The lock
>> fixes all the races, but xe_pm_runtime_resume() seems to call
>> xe_device_mem_access_{get,put}() in loads of places AFAICT, and that is ofc
>> going to deadlock if we introduce a lock, since we are inside the callback.
>> But I think even without that lock it will still deadlock, since the
>> runtime_pm code will see that we are RPM_RESUMING and wait for itself. I'm
>> guessing that explains why we had the conditional pm_runtime_suspended() and
>> if_active(), since that prevents triggering the runtime_pm from our
>> callbacks (we will either be RPM_SUSPENDING or RPM_RESUMING),
> 
> Yes, that was the goal.
> 
>> but then we are
>> ofc left with all the nasty races.
> 
> :(
> 
>> Any ideas? It seems like the
>> resume/suspend callbacks should fundamentally never be calling
>> xe_device_mem_access_{get,put}()?
> 
> We probably need something like the pseudo code at the end of
> Documentation/power/runtime_pm.rst

IIUC that looks like a template for the case where you have lots of 
incoming IO requests and want to process them in an async manner. But to 
process a request you also need to keep the device awake and ensure it 
has resumed, so there it makes sense to use the RPM_ASYNC runtime_get. 
The ->resume() callback then has a natural place to process the 
outstanding IO requests at some later point.
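
For reference, the shape I'm reading out of that doc is roughly the 
following. This is heavily condensed and the foo_* names are 
hypothetical, so apologies if I've mangled it:

#include <linux/list.h>
#include <linux/pm_runtime.h>
#include <linux/spinlock.h>

struct foo_req {
	struct list_head link;
	/* ... */
};

struct foo_priv {
	struct device *dev;
	spinlock_t lock;
	struct list_head queue;		/* requests waiting for the HW */
	bool is_suspended;
};

/*
 * Processes and completes everything on @queue; each completed request
 * drops the rpm ref taken in foo_io_request(). Not shown.
 */
static void foo_process_queue(struct foo_priv *foo);

static void foo_io_request(struct foo_priv *foo, struct foo_req *req)
{
	spin_lock(&foo->lock);
	list_add_tail(&req->link, &foo->queue);
	/*
	 * Async get: bumps the usage count and kicks off a resume, but
	 * doesn't wait for it, so this never blocks on the pm core.
	 */
	pm_runtime_get(foo->dev);
	if (!foo->is_suspended)
		foo_process_queue(foo);	/* HW already awake */
	spin_unlock(&foo->lock);
}

static int foo_runtime_resume(struct device *dev)
{
	struct foo_priv *foo = dev_get_drvdata(dev);

	spin_lock(&foo->lock);
	foo->is_suspended = false;
	/* the natural point to drain whatever queued up while asleep */
	foo_process_queue(foo);
	spin_unlock(&foo->lock);
	return 0;
}

i.e. the get never has to wait for the resume, because the actual 
processing is deferred into the callback.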

But for xe_device_mem_access_get() I don't think it can be async, since 
we need to ensure the ->resume() callback has already completed before 
returning to the caller. From the pov of xe_device_mem_access_get() 
there is no way to tell whether it's being called from ->resume() or 
from some normal path, and yet it seems like it somehow needs to 
differentiate the two. I feel like either ->resume() should never have 
been allowed to call it in the first place, or there needs to be some 
token that always gets passed around.
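
To spell out the recursion with this patch applied (a simplified sketch 
of the xe paths, not the literal code):

void xe_device_mem_access_get(struct xe_device *xe)
{
	mutex_lock(&xe->mem_access.lock);			/* (1) */
	if (xe->mem_access.ref == 0)
		xe->mem_access.hold_rpm =
			xe_pm_runtime_resume_and_get(xe);	/* (2) */
	xe->mem_access.ref++;
	mutex_unlock(&xe->mem_access.lock);
}

int xe_pm_runtime_resume(struct xe_device *xe)
{
	/* ... restoring HW state somewhere ends up calling ... */
	xe_device_mem_access_get(xe);
	/*
	 * (1) deadlocks: the rpm core ran this callback synchronously in
	 * the thread that triggered the resume, and that thread is still
	 * holding mem_access.lock from the outer call. Even without the
	 * lock, (2) deadlocks: ref is still 0 here, so we do a nested
	 * resume_and_get, and the rpm core sees RPM_RESUMING and waits
	 * for the very resume we are currently executing to finish.
	 */
	return 0;
}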

But maybe I'm misunderstanding something here, or at least I'm not 
currently seeing the connection with the pseudo code? Can you share more?
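
And just so the "token" idea above isn't totally hand-wavy, here is one 
purely illustrative shape it could take. Completely untested, and 
pm_callback_task would be a new, made-up field on xe_device:

int xe_pm_runtime_resume(struct xe_device *xe)
{
	WRITE_ONCE(xe->pm_callback_task, current);
	/* ... resume work, which may call xe_device_mem_access_get() ... */
	WRITE_ONCE(xe->pm_callback_task, NULL);
	return 0;
}

void xe_device_mem_access_get(struct xe_device *xe)
{
	/*
	 * Inside our own ->resume()/->suspend() the device is being
	 * forced awake anyway, so skip the rpm dance (and the lock the
	 * outer caller may already hold). The matching _put() would need
	 * the same check.
	 */
	if (READ_ONCE(xe->pm_callback_task) == current)
		return;

	mutex_lock(&xe->mem_access.lock);
	if (xe->mem_access.ref++ == 0)
		xe->mem_access.hold_rpm = xe_pm_runtime_resume_and_get(xe);
	mutex_unlock(&xe->mem_access.lock);
}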

> 
>>
>>>
>>>>    	mutex_unlock(&xe->mem_access.lock);
>>>> -	/* The usage counter increased if device was immediately resumed */
>>>> -	if (resumed)
>>>> -		xe_pm_runtime_put(xe);
>>>> -
>>>>    	XE_WARN_ON(xe->mem_access.ref == S32_MAX);
>>>>    }
>>>> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>>>> index 59462933f67a..9e37189d5745 100644
>>>> --- a/drivers/gpu/drm/xe/xe_device_types.h
>>>> +++ b/drivers/gpu/drm/xe/xe_device_types.h
>>>> @@ -256,7 +256,10 @@ struct xe_device {
>>>>    	 * triggering additional actions when they occur.
>>>>    	 */
>>>>    	struct {
>>>> -		/** @lock: protect the ref count */
>>>> +		/**
>>>> +		 * @lock: Serialize xe_device_mem_access users,
>>>> +		 * and protect the below internal state, like @ref.
>>>> +		 */
>>>>    		struct mutex lock;
>>>>    		/** @ref: ref count of memory accesses */
>>>>    		s32 ref;
>>>> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
>>>> index b7b57f10ba25..b2ffa001e6f7 100644
>>>> --- a/drivers/gpu/drm/xe/xe_pm.c
>>>> +++ b/drivers/gpu/drm/xe/xe_pm.c
>>>> @@ -210,14 +210,9 @@ int xe_pm_runtime_put(struct xe_device *xe)
>>>>    	return pm_runtime_put_autosuspend(xe->drm.dev);
>>>>    }
>>>> -/* Return true if resume operation happened and usage count was increased */
>>>> -bool xe_pm_runtime_resume_if_suspended(struct xe_device *xe)
>>>> +bool xe_pm_runtime_resume_and_get(struct xe_device *xe)
>>>>    {
>>>> -	/* In case we are suspended we need to immediately wake up */
>>>> -	if (pm_runtime_suspended(xe->drm.dev))
>>>> -		return !pm_runtime_resume_and_get(xe->drm.dev);
>>>> -
>>>> -	return false;
>>>> +	return !pm_runtime_resume_and_get(xe->drm.dev);
>>>
>>> now that the name is so similar it feels a bit odd that our return
>>> value isn't aligned with theirs. Although I prefer ours...
>>>
>>> Anyway, the code is right... if you have tested it and it is working
>>> well, let's move ahead with this.
>>>
>>> Reviewed-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
>>> (for the series)
>>>
>>> but let's get an Ack from Maarten since he was kept as author on patch 3
>>> and it is modified from his merged one.
>>>
>>>>    }
>>>>    int xe_pm_runtime_get_if_active(struct xe_device *xe)
>>>> diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
>>>> index 6a885585f653..1b4c15b5e71a 100644
>>>> --- a/drivers/gpu/drm/xe/xe_pm.h
>>>> +++ b/drivers/gpu/drm/xe/xe_pm.h
>>>> @@ -19,7 +19,7 @@ int xe_pm_runtime_suspend(struct xe_device *xe);
>>>>    int xe_pm_runtime_resume(struct xe_device *xe);
>>>>    int xe_pm_runtime_get(struct xe_device *xe);
>>>>    int xe_pm_runtime_put(struct xe_device *xe);
>>>> -bool xe_pm_runtime_resume_if_suspended(struct xe_device *xe);
>>>> +bool xe_pm_runtime_resume_and_get(struct xe_device *xe);
>>>>    int xe_pm_runtime_get_if_active(struct xe_device *xe);
>>>>    #endif
>>>> -- 
>>>> 2.40.0
>>>>

