[PATCH 00/11] fix memory leak while kset_register() fails

Yang Yingliang yangyingliang at huawei.com
Fri Oct 21 09:56:19 UTC 2022


On 2022/10/21 17:08, Luben Tuikov wrote:
> On 2022-10-21 04:59, Yang Yingliang wrote:
>> On 2022/10/21 16:36, Greg KH wrote:
>>> On Fri, Oct 21, 2022 at 04:24:23PM +0800, Yang Yingliang wrote:
>>>> On 2022/10/21 13:37, Greg KH wrote:
>>>>> On Fri, Oct 21, 2022 at 01:29:31AM -0400, Luben Tuikov wrote:
>>>>>> On 2022-10-20 22:20, Yang Yingliang wrote:
>>>>>>> The previous discussion link:
>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F0db486eb-6927-927e-3629-958f8f211194%40huawei.com%2FT%2F&data=05%7C01%7Cluben.tuikov%40amd.com%7C74aa9b57192b406ef27408dab3429db4%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638019395979868103%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=RcK05cXm1J5%2BtYcLO2SMG7k6sjeymQzdBzMCDJSzfdE%3D&reserved=0
>>>>>> The very first discussion on this was here:
>>>>>>
>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.spinics.net%2Flists%2Fdri-devel%2Fmsg368077.html&data=05%7C01%7Cluben.tuikov%40amd.com%7C74aa9b57192b406ef27408dab3429db4%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638019395979868103%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=sHZ6kfLF8HxrNXV6%2FVjgdH%2BmQM4T3Zv0U%2FAwddT97cE%3D&reserved=0
>>>>>>
>>>>>> Please use this link, and not the that one up there you which quoted above,
>>>>>> and whose commit description is taken verbatim from the this link.
>>>>>>
>>>>>>> kset_register() is currently used in some places without calling
>>>>>>> kset_put() in error path, because the callers think it should be
>>>>>>> kset internal thing to do, but the driver core can not know what
>>>>>>> caller doing with that memory at times. The memory could be freed
>>>>>>> both in kset_put() and error path of caller, if it is called in
>>>>>>> kset_register().
>>>>>> As I explained in the link above, the reason there's
>>>>>> a memory leak is that one cannot call kset_register() without
>>>>>> the kset->kobj.name being set--kobj_add_internal() returns -EINVAL,
>>>>>> in this case, i.e. kset_register() fails with -EINVAL.
>>>>>>
>>>>>> Thus, the most common usage is something like this:
>>>>>>
>>>>>> 	kobj_set_name(&kset->kobj, format, ...);
>>>>>> 	kset->kobj.kset = parent_kset;
>>>>>> 	kset->kobj.ktype = ktype;
>>>>>> 	res = kset_register(kset);
>>>>>>
>>>>>> So, what is being leaked, is the memory allocated in kobj_set_name(),
>>>>>> by the common idiom shown above. This needs to be mentioned in
>>>>>> the documentation, at least, in case, in the future this is absolved
>>>>>> in kset_register() redesign, etc.
>>>>> Based on this, can kset_register() just clean up from itself when an
>>>>> error happens?  Ideally that would be the case, as the odds of a kset
>>>>> being embedded in a larger structure is probably slim, but we would have
>>>>> to search the tree to make sure.
>>>> I have search the whole tree, the kset used in bus_register() - patch #3,
>>>> kset_create_and_add() - patch #4
>>>> __class_register() - patch #5,  fw_cfg_build_symlink() - patch #6 and
>>>> amdgpu_discovery.c - patch #10
>>>> is embedded in a larger structure. In these cases, we can not call
>>>> kset_put() in error path in kset_register()
>>> Yes you can as the kobject in the kset should NOT be controling the
>>> lifespan of those larger objects.
>>>
>>> If it is, please point out the call chain here as I don't think that
>>> should be possible.
>>>
>>> Note all of this is a mess because the kobject name stuff was added much
>>> later, after the driver model had been created and running for a while.
>>> We missed this error path when adding the dynamic kobject name logic,
>>> thank for looking into this.
>>>
>>> If you could test the patch posted with your error injection systems,
>>> that could make this all much simpler to solve.
>> The patch posted by Luben will cause double free in some cases.
> Yes, I figured this out in the other email and posted the scenario Greg
> was asking about.
>
> But I believe the question still stands if we can do kset_put()
> after a *failed* kset_register(), namely if more is being done than
> necessary, which is just to free the memory allocated by
> kobject_set_name().
The name memory is allocated in kobject_set_name() in caller,  and I 
think caller
free the memory that it allocated is reasonable, it's weird that some 
callers allocate
some memory and use function (kset_register) failed, then it free the 
memory allocated
in callers,  I think use kset_put()/kfree_const(name) in caller seems 
more reasonable.

Thanks,
Yang
>
> Regards,
> Luben
> .


More information about the amd-gfx mailing list