[Intel-gfx] [PATCH 1/2] drm/i915: Fix NPD in PMU during driver teardown

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Fri Aug 5 09:26:47 UTC 2022


On 04/08/2022 19:56, Summers, Stuart wrote:
> On Thu, 2022-08-04 at 09:42 +0100, Tvrtko Ursulin wrote:
>> On 04/08/2022 00:03, Stuart Summers wrote:
>>> In the driver teardown, we are unregistering the gt prior
>>> to unregistering the PMU. This means there is a small window
>>> of time in which the application can request metrics from the
>>> PMU, some of which are calling into the uapi engines list,
>>> while the engines are not available. In this case we can
>>> see null pointer dereferences.
>>>
>>> Fix this ordering in both the driver load and unload sequences.
>>>
>>> v1: Actually address the driver load/unload ordering issue
>>> v2: Remove the NULL checks since they shouldn't be necessary
>>>       now with the proper ordering
>>>
>>> Fixes: 42014f69bb235 ("drm/i915: Hook up GT power management")
>>
>> What happened in this commit to break it?
> 
> Hm.. well this was the patch that added the abstraction ordering issue,
> i.e. gt_register/unregister. There isn't anything specifically broken
> here since we don't actually access the gt structure underneath. My
> assertion is that this patch should have taken the expansion of the gt
> structure into consideration and added the correct ordering with
> respect to the pmu.
> 
> Are you asking for the specific patch where things stopped working
> functionally?

No, I was looking for the info on what is actually broken with the 
ordering, as is since I couldn't spot it myself yesterday. In other 
words, with patch 2/2 applied - does 1/2 fix anything further for the 
PMU use cases?

Regards,

Tvrtko

>>> Signed-off-by: Stuart Summers <stuart.summers at intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_driver.c | 11 ++++++-----
>>>    1 file changed, 6 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_driver.c
>>> b/drivers/gpu/drm/i915/i915_driver.c
>>> index deb8a8b76965a..ee4dcb85d2060 100644
>>> --- a/drivers/gpu/drm/i915/i915_driver.c
>>> +++ b/drivers/gpu/drm/i915/i915_driver.c
>>> @@ -717,7 +717,6 @@ static void i915_driver_register(struct
>>> drm_i915_private *dev_priv)
>>>    	struct drm_device *dev = &dev_priv->drm;
>>>    
>>>    	i915_gem_driver_register(dev_priv);
>>> -	i915_pmu_register(dev_priv);
>>>    
>>>    	intel_vgpu_register(dev_priv);
>>>    
>>> @@ -731,11 +730,12 @@ static void i915_driver_register(struct
>>> drm_i915_private *dev_priv)
>>>    	i915_debugfs_register(dev_priv);
>>>    	i915_setup_sysfs(dev_priv);
>>>    
>>> +	intel_gt_driver_register(to_gt(dev_priv));
>>> +
>>>    	/* Depends on sysfs having been initialized */
>>> +	i915_pmu_register(dev_priv);
>>>    	i915_perf_register(dev_priv);
>>>    
>>> -	intel_gt_driver_register(to_gt(dev_priv));
>>> -
>>
>> There was a bit of a time distance since we originally discussed this
>> so
>> things kind of evaporated from my head. Or at least it feels like
>> that.
>>    Today when I look at the code I don't understand why is this
>> shuffle
>> relevant.
>>
>> The sysfs comment does not really apply to pmu, but also if I look
>> into
>> intel_gt_driver_(un)register I did not spot what is the relevant
>> part
>> which interacts with the PMU.
>>
>> On register it is engine list first then PMU.
>>
>> On unregister it is PMU first, then engine list:
>>
>>     i915_driver_remove
>>       i915_driver_unregister
>>         i915_pmu_unregister
>>       i915_gem_driver_remove
>>         clears uabi engines list
>>
>> Help please? :)
>>
>> Regards,
>>
>> Tvrtko
>>
>>>    	intel_display_driver_register(dev_priv);
>>>    
>>>    	intel_power_domains_enable(dev_priv);
>>> @@ -762,11 +762,12 @@ static void i915_driver_unregister(struct
>>> drm_i915_private *dev_priv)
>>>    
>>>    	intel_display_driver_unregister(dev_priv);
>>>    
>>> -	intel_gt_driver_unregister(to_gt(dev_priv));
>>> -
>>>    	i915_perf_unregister(dev_priv);
>>> +	/* GT should be available until PMU is gone */
>>>    	i915_pmu_unregister(dev_priv);
>>>    
>>> +	intel_gt_driver_unregister(to_gt(dev_priv));
>>> +
>>>    	i915_teardown_sysfs(dev_priv);
>>>    	drm_dev_unplug(&dev_priv->drm);
>>>    


More information about the Intel-gfx mailing list