[Intel-gfx] [PATCH 6/8] drm/i915/gt: Fix memory leaks in per-gt sysfs
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Tue May 10 13:25:48 UTC 2022
On 10/05/2022 11:41, Andrzej Hajda wrote:
> On 10.05.2022 11:48, Tvrtko Ursulin wrote:
>> On 10/05/2022 10:39, Andrzej Hajda wrote:
>>> On 10.05.2022 10:18, Tvrtko Ursulin wrote:
>>>>
>>>> On 10/05/2022 08:58, Andrzej Hajda wrote:
>>>>> Hi Tvrtko,
>>>>>
>>>>> On 10.05.2022 09:28, Tvrtko Ursulin wrote:
>>>>>>
>>>>>> On 29/04/2022 20:56, Ashutosh Dixit wrote:
>>>>>>> All kmalloc'd kobjects need a kobject_put() to free memory. For
>>>>>>> example in
>>>>>>> previous code, kobj_gt_release() never gets called. The
>>>>>>> requirement of
>>>>>>> kobject_put() now results in a slightly different code organization.
>>>>>>>
>>>>>>> v2: s/gtn/gt/ (Andi)
>>>>>>>
>>>>>>> Cc: Andi Shyti <andi.shyti at intel.com>
>>>>>>> Cc: Andrzej Hajda <andrzej.hajda at intel.com>
>>>>>>> Fixes: b770bcfae9ad ("drm/i915/gt: create per-tile sysfs interface")
>>>>>>> Signed-off-by: Ashutosh Dixit <ashutosh.dixit at intel.com>
>>>>>>> ---
>>>>>>> drivers/gpu/drm/i915/gt/intel_gt.c | 1 +
>>>>>>> drivers/gpu/drm/i915/gt/intel_gt_sysfs.c | 29
>>>>>>> ++++++++++--------------
>>>>>>> drivers/gpu/drm/i915/gt/intel_gt_sysfs.h | 6 +----
>>>>>>> drivers/gpu/drm/i915/gt/intel_gt_types.h | 3 +++
>>>>>>> drivers/gpu/drm/i915/i915_sysfs.c | 2 ++
>>>>>>> 5 files changed, 19 insertions(+), 22 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c
>>>>>>> b/drivers/gpu/drm/i915/gt/intel_gt.c
>>>>>>> index 92394f13b42f..9aede288eb86 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
>>>>>>> @@ -785,6 +785,7 @@ void intel_gt_driver_unregister(struct
>>>>>>> intel_gt *gt)
>>>>>>> {
>>>>>>> intel_wakeref_t wakeref;
>>>>>>> + intel_gt_sysfs_unregister(gt);
>>>>>>> intel_rps_driver_unregister(&gt->rps);
>>>>>>> intel_gsc_fini(&gt->gsc);
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
>>>>>>> b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
>>>>>>> index 8ec8bc660c8c..9e4ebf53379b 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
>>>>>>> @@ -24,7 +24,7 @@ bool is_object_gt(struct kobject *kobj)
>>>>>>> static struct intel_gt *kobj_to_gt(struct kobject *kobj)
>>>>>>> {
>>>>>>> - return container_of(kobj, struct kobj_gt, base)->gt;
>>>>>>> + return container_of(kobj, struct intel_gt, sysfs_gt);
>>>>>>> }
>>>>>>> struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
>>>>>>> @@ -72,9 +72,9 @@ static struct attribute *id_attrs[] = {
>>>>>>> };
>>>>>>> ATTRIBUTE_GROUPS(id);
>>>>>>> +/* A kobject needs a release() method even if it does nothing */
>>>>>>> static void kobj_gt_release(struct kobject *kobj)
>>>>>>> {
>>>>>>> - kfree(kobj);
>>>>>>> }
>>>>>>> static struct kobj_type kobj_gt_type = {
>>>>>>> @@ -85,8 +85,6 @@ static struct kobj_type kobj_gt_type = {
>>>>>>> void intel_gt_sysfs_register(struct intel_gt *gt)
>>>>>>> {
>>>>>>> - struct kobj_gt *kg;
>>>>>>> -
>>>>>>> /*
>>>>>>> * We need to make things right with the
>>>>>>> * ABI compatibility. The files were originally
>>>>>>> @@ -98,25 +96,22 @@ void intel_gt_sysfs_register(struct intel_gt
>>>>>>> *gt)
>>>>>>> if (gt_is_root(gt))
>>>>>>> intel_gt_sysfs_pm_init(gt, gt_get_parent_obj(gt));
>>>>>>> - kg = kzalloc(sizeof(*kg), GFP_KERNEL);
>>>>>>> - if (!kg)
>>>>>>> + /* init and xfer ownership to sysfs tree */
>>>>>>> + if (kobject_init_and_add(&gt->sysfs_gt, &kobj_gt_type,
>>>>>>> + gt->i915->sysfs_gt, "gt%d", gt->info.id))
>>>>>>
>>>>>> Was there closure/agreement on the matter of whether or not there
>>>>>> is a potential race between "kfree(gt)" and sysfs access (last put
>>>>>> from sysfs that is)? I've noticed Andrzej and Ashutosh were
>>>>>> discussing it but did not read all the details.
>>>>>>
>>>>>
>>>>> Not really :)
>>>>> IMO docs are against this practice, Ashutosh shows examples of this
>>>>> practice in code and according to his analysis it is safe.
>>>>> I gave up looking for contradictions :) Either it is OK, the kobject
>>>>> is not a fully shared object, and the docs are obsolete and need an
>>>>> update, or the patch is wrong.
>>>>> Anyway finally I tend to accept this solution, I failed to prove it
>>>>> is wrong :)
>>>>
>>>> Like a question of whether hotunplug can be triggered while
>>>> userspace is sitting in a sysfs hook? Final kfree then has to be
>>>> delayed until userspace exits.
>>>>
>>>> Btw where is the "kfree(gt)" for the tiles on the PCI remove path? I
>>>> can't find it.. Do we have a leak?
>>>
>>> intel_gt_tile_cleanup ?
>>
>> Called from intel_gt_release_all, whose only caller is the failure
>> path of i915_driver_probe. Feels like something is missing?
>
> This is final proof this patch is safe - no kfree, no UAF :)
>
> Apparently it is broken in internal branch as well.
> Should I take care of it?
Don't know - can you see with Andi?
I *think* even though the patch which added this code carries my name,
it is probably quite far from what I originally wrote. (I alluded to
that in a1a70e75-2068-fa69-e307-456d031b25b1 at linux.intel.com, maybe I
should have been more explicit that I don't think it should have
preserved my authorship.) At least I checked my late 2019 version
and it did not seem to have the gt leak issue. If it did I would have
felt responsible to fix it. :) As it stands init/de-init paths are
always tricky and need more time to look into than I have at the moment.
Regards,
Tvrtko
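
For readers following the lifetime question in this thread (whether the final free has to wait for the last sysfs user), here is a minimal userspace sketch of the refcount-plus-release() pattern. It is not i915 code and not what the patch does: the patch keeps an empty release() and embeds the kobject in struct intel_gt. All toy_* names are hypothetical stand-ins, and the model is single-threaded, so it only shows the ordering, not an actual race.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct kobject: refcount + release hook. */
struct toy_kobj {
	int refcount;
	void (*release)(struct toy_kobj *kobj);
};

/* Hypothetical stand-in for struct intel_gt with the kobject embedded. */
struct toy_gt {
	int id;
	int *freed;              /* test hook: set to 1 when the gt is freed */
	struct toy_kobj sysfs_gt;
};

/* Same idea as the kernel's container_of(), written with offsetof(). */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Take an extra reference, e.g. a reader sitting in a sysfs hook. */
void toy_kobj_get(struct toy_kobj *kobj)
{
	kobj->refcount++;
}

/* Drop one reference; returns 1 if this was the last put and release() ran. */
int toy_kobj_put(struct toy_kobj *kobj)
{
	assert(kobj->refcount > 0);
	if (--kobj->refcount > 0)
		return 0;
	kobj->release(kobj);
	return 1;
}

/* release() frees the containing object, so the free waits for the last user. */
void toy_gt_release(struct toy_kobj *kobj)
{
	struct toy_gt *gt = toy_container_of(kobj, struct toy_gt, sysfs_gt);

	*gt->freed = 1;
	free(gt);
}

/* Allocate a gt whose embedded kobject starts with one (driver) reference. */
struct toy_gt *toy_gt_alloc(int id, int *freed)
{
	struct toy_gt *gt = calloc(1, sizeof(*gt));

	if (!gt)
		return NULL;
	gt->id = id;
	gt->freed = freed;
	gt->sysfs_gt.refcount = 1;
	gt->sysfs_gt.release = toy_gt_release;
	*freed = 0;
	return gt;
}
```

In this model, if a reader calls toy_kobj_get() before the driver drops its own reference, the driver's toy_kobj_put() returns 0 and the memory stays valid. Only the reader's later toy_kobj_put() runs release() and frees the containing object. A direct free of the container while the reader still holds a reference is the ordering the thread is worried about.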