[Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

Andrzej Hajda andrzej.hajda at intel.com
Mon May 2 06:22:02 UTC 2022



On 29.04.2022 06:25, Dixit, Ashutosh wrote:
> On Thu, 28 Apr 2022 07:36:14 -0700, Andrzej Hajda wrote:
>> On 27.04.2022 22:46, Dixit, Ashutosh wrote:
>>> On Sun, 24 Apr 2022 15:36:23 -0700, Andi Shyti wrote:
>>>> Hi Andrzej and Ashutosh,
>>>>
>>>>>>>> b/drivers/gpu/drm/i915/gt/intel_gt_types.h
>>>>>>>> index 937b2e1a305e..4c72b4f983a6 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
>>>>>>>> @@ -222,6 +222,9 @@ struct intel_gt {
>>>>>>>> 	} mocs;
>>>>>>>> 		struct intel_pxp pxp;
>>>>>>>> +
>>>>>>>> +	/* gt/gtN sysfs */
>>>>>>>> +	struct kobject sysfs_gtn;
>>>>>>> If you put kobject as a part of intel_gt what assures you that lifetime of
>>>>>>> kobject is shorter than intel_gt? Ie its refcounter is 0 on removal of
>>>>>>> intel_gt?
>>>>>> Because we are explicitly doing a kobject_put() in
>>>>>> intel_gt_sysfs_unregister(). Which is exactly what we are *not* doing in
>>>>>> the previous code.
>>>>>>
>>>>>> Let me explain a bit about the previous code (but feel free to skip since
>>>>>> the patch should speak for itself):
>>>>>> * Previously we kzalloc a 'struct kobj_gt'
>>>>>> * But we don't save a pointer to the 'struct kobj_gt' so we don't have the
>>>>>>      pointer to the kobject to be able to do a kobject_put() on it later
>>>>>> * Therefore we need to store the pointer in 'struct intel_gt'
>>>>>> * But if we have to put the pointer in 'struct intel_gt' we might as well
>>>>>>      put the kobject as part of 'struct intel_gt' and that also removes the
>>>>>>      need to have a 'struct kobj_gt' (kobj_to_gt() can just use container_of()
>>>>>>      to get gt from kobj).
>>>>>> * So I think this patch simpler/cleaner than the original code if you take
>>>>>>      the requirement for kobject_put() into account.
>>>> This is my oversight. This was something I completely forgot to
>>>> fix but it was my intention to do and actually I had some fixes
>>>> ongoing. But because this patch took too long to get in I
>>>> completely forgot about it (Sujaritha was actually the first who
>>>> pointed this out).
>>>>
>>>> Thanks, Ashutosh for taking this.
>>>>
>>>>> I fully agree that previous code is incorrect but I am not convinced current
>>>>> code is correct.
>>>>> If some objects are kref-counted it means usually they can have multiple
>>>>> concurrent users and kobject_put does not work as traditional
>>>>> destructor/cleanup/unregister.
>>>>> So in this particular case after calling kobject_init_and_add sysfs core can
>>>>> get multiple references on the object. Later, during driver unregistration
>>>>> kobject_put is called, but if the object is still in use by sysfs core, the
>>>>> object will not be destroyed/released. If the driver unregistration
>>>>> continues memory will be freed, leaving sysfs-core (or other users) with
>>>>> dangling pointers. Unless there is some additional synchronization mechanism
>>>>> I am not aware of.
>>>> Thanks Andrzej for summarizing this and what you said is actually
>>>> what happens. I had a similar solution developed and I had wrong
>>>> pointer reference happening.
>>> Hi Andrzej/Andi,
>>>
>>> I did do some research into kobject's and such before writing this patch
>>> and based on that I believe the patch is correct. Presenting some evidence
>>> below.
>>>
>>> The patch is verified by:
>>>
>>> a. Putting a printk in the release() method when it exists (it does for
>>>      sysfs_gtn kobject)
>>> b. Enabling dynamic prints for lib/kobject.c
>>>
>>> For example, with the following:
>>>
>>> # echo 'file kobject.c +p' > /sys/kernel/debug/dynamic_debug/control
>>> # echo -n "0000:03:00.0" > /sys/bus/pci/drivers/i915/unbind
>>>
>>> We see this in dmesg (see kobject_cleanup() called from kobject_put()):
>>>
>>> [ 1034.930007] kobject: '.defaults' (ffff88817130a640): kobject_cleanup, parent ffff8882262b5778
>>> [ 1034.930020] kobject: '.defaults' (ffff88817130a640): auto cleanup kobject_del
>>> [ 1034.930336] kobject: '.defaults' (ffff88817130a640): calling ktype release
>>> [ 1034.930340] kobject: (ffff88817130a640): dynamic_kobj_release
>>> [ 1034.930354] kobject: '.defaults': free name
>>> [ 1034.930366] kobject: 'gt0' (ffff8882262b5778): kobject_cleanup, parent ffff88817130a240
>>> [ 1034.930371] kobject: 'gt0' (ffff8882262b5778): auto cleanup kobject_del
>>> [ 1034.931930] kobject: 'gt0' (ffff8882262b5778): calling ktype release
>>> [ 1034.931936] kobject: 'gt0': free name
>>> [ 1034.958004] kobject: 'i915_0000_03_00.0' (ffff88810e1f8800): fill_kobj_path: path = '/devices/i915_0000_03_00.0'
>>> [ 1034.958155] kobject: 'i915_0000_03_00.0' (ffff88810e1f8800): kobject_cleanup, parent 0000000000000000
>>> [ 1034.958162] kobject: 'i915_0000_03_00.0' (ffff88810e1f8800): calling ktype release
>>> [ 1034.958188] kobject: 'i915_0000_03_00.0': free name
>>> [ 1034.958729] kobject: 'gt' (ffff88817130a240): kobject_cleanup, parent ffff8881160c5000
>>> [ 1034.958736] kobject: 'gt' (ffff88817130a240): auto cleanup kobject_del
>>> [ 1034.958762] kobject: 'gt' (ffff88817130a240): calling ktype release
>>> [ 1034.958767] kobject: (ffff88817130a240): dynamic_kobj_release
>>> [ 1034.958778] kobject: 'gt': free name
>>>
>>> We have the following directory structure (one of the patches is creating
>>> /sys/class/drm/card0/gt/gt0/.defaults):
>>>
>>>         /sys/class/drm/card0/gt
>>>                              |-gt0
>>>                                 |-.defaults
>>>
>>> And we see from dmesg .defaults, gt0 and gt kobjects being cleaned up in
>>> that order.
>>>
>>> Looking at lib/kobject.c there are several interesting things:
>>>
>>> * Three subsystems are involved: kobject, sysfs and kernfs.
>>>
>>> * A child kobject takes a reference on the parent, so we must do a
>>>     kobject_put() on the child before doing kobject_put() on the parent
>>>     (creating a child kobject creates a corresponding sub-directory in sysfs).
>>>
>>> * Adding files to a sysfs directory does not take a reference on the
>>>     kobject, only on the parent kernfs_node.
>>>
>>> * Since we do call sysfs_create_group() (for RC6) ordinarily we will need
>>>     to call sysfs_remove_group() but this does not seem to be needed because
>>>     we are not creating a directory for the group (by providing a name for
>>>     the group). So sysfs_create_group() is equivalent to sysfs_create_files().
>>>     So it seems we don't need sysfs_remove_group().
>>>
>>> * Similarly it appears files created by sysfs_create_files() do not need to
>>>     be removed by sysfs_remove_files() because __kobject_del() and
>>>     sysfs_remove_dir() called from kobject_cleanup() do that for us (the
>>>     comment in kobject_cleanup() says "remove from sysfs if the caller did
>>>     not do it").
>>>
>>> Based on the above it is clear that no one except a child kobject takes a
>>> reference on the parent kobject and as long as we kobject_put() them in the
>>> correct order (as we seem to be doing based on dmesg trace above) we should
>>> be ok.
>>>
>>> Also what is followed in this patch is a fairly standard coding
>>> pattern. Further, in case of any errors we generally see failure to unload
>>> the module etc. and none of these things are being observed, module reload
>>> works fine.
>>>
>>> I hope these points are helpful in completing review of the patch.
>> See [1], it is quite old, so maybe it is not valid anymore, but I see no
>> code proving sth has changed.
> Hi Andrzej,
>
> A lot has changed since that article from 2003 (for 2.5 kernel). For
> instance there is kernfs (as I mention above):
>
> 	https://lwn.net/Articles/571590/
>
> A process having a sysfs file open today in my view will result in the
> following:
> * It will take a reference on kernfs_node (not on kobject as was the case
>    in kernel 2.5 in [1])
> * An open file will prevent the module from being unloaded (not the kernel
>    crashing as in 2.5 in [1])

Thats nice, but kernfs_node->priv still points to kobject so their 
lifetimes are bounded.


>
> So this is what I would expect with today's kernel. I am not seeing
> anything we've done here which violates anything in [1] or [2].
>> Also current doc says also [2] similar things, especially:
>> "Once you registered your kobject via kobject_add(), you must never use
>> kfree() to free it directly"
> Correct, we are using kobject_put(), not kfree'ing the kobject.

That I wouldn't agree. kobject_put is called, then the object in which kobject is embedded is kfree'd somewhere later on driver removal, without awareness of this kobject.
According to your analysis it should have 0 refs, but this is analysis of the current code, even if it is true now it could change in the future.
And IMO it is against docs[2]:
- "One important point cannot be overstated: every kobject must have a
release() method, and the kobject must persist (in a consistent state)
until that method is called. If these constraints are not met, the code is
flawed." - empty release method means clearly it is against the docs.
-"The end result is that a structure protected by a kobject cannot be freed
before its reference count goes to zero. The reference count is not under
the direct control of the code which created the kobject.".

So either docs and part of kobject code were not updated to reflect changes you are assuming, either your assumption is incorrect.
Looking at other users of kobject it seems they follow docs, their release method either frees memory directly either kref_put on containing struct, it was just quick scan so I could overlooked sth.


Regards
Andrzej


>
> Thanks.
> --
> Ashutosh
>
>> [1]: https://lwn.net/Articles/36850/
>> [2]: https://elixir.bootlin.com/linux/v5.18-rc4/source/Documentation/core-api/kobject.rst#L246



More information about the Intel-gfx mailing list