[Intel-xe] [v2 1/2] drm/xe: Add a new memory directory under tile
Sundaresan, Sujaritha
sujaritha.sundaresan at intel.com
Thu Dec 7 09:55:32 UTC 2023
On 12/7/2023 2:00 PM, Upadhyay, Tejas wrote:
>
>> -----Original Message-----
>> From: Sundaresan, Sujaritha <sujaritha.sundaresan at intel.com>
>> Sent: Thursday, December 7, 2023 12:58 PM
>> To: Tauro, Riana <riana.tauro at intel.com>; Upadhyay, Tejas
>> <tejas.upadhyay at intel.com>; Gupta, Anshuman
>> <anshuman.gupta at intel.com>; intel-xe at lists.freedesktop.org
>> Cc: Vivi, Rodrigo <rodrigo.vivi at intel.com>
>> Subject: Re: [Intel-xe] [v2 1/2] drm/xe: Add a new memory directory under
>> tile
>>
>>
>> On 12/7/2023 12:08 PM, Sundaresan, Sujaritha wrote:
>>> On 12/7/2023 11:36 AM, Riana Tauro wrote:
>>>>
>>>> On 12/7/2023 10:51 AM, Sundaresan, Sujaritha wrote:
>>>>> On 12/7/2023 10:42 AM, Upadhyay, Tejas wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf
>>>>>>> Of Sundaresan, Sujaritha
>>>>>>> Sent: Wednesday, December 6, 2023 5:44 PM
>>>>>>> To: Gupta, Anshuman <anshuman.gupta at intel.com>; intel-
>>>>>>> xe at lists.freedesktop.org
>>>>>>> Cc: Vivi, Rodrigo <rodrigo.vivi at intel.com>
>>>>>>> Subject: Re: [Intel-xe] [v2 1/2] drm/xe: Add a new memory
>>>>>>> directory under tile
>>>>>>>
>>>>>>>
>>>>>>> On 12/6/2023 5:38 PM, Sundaresan, Sujaritha wrote:
>>>>>>>> On 12/6/2023 5:23 PM, Gupta, Anshuman wrote:
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On
>>>>>>>>>> Behalf Of Sujaritha Sundaresan
>>>>>>>>>> Sent: Wednesday, December 6, 2023 10:18 AM
>>>>>>>>>> To: intel-xe at lists.freedesktop.org
>>>>>>>>>> Cc: Sundaresan, Sujaritha <sujaritha.sundaresan at intel.com>;
>>>>>>>>>> Vivi, Rodrigo <rodrigo.vivi at intel.com>
>>>>>>>>>> Subject: [Intel-xe] [v2 1/2] drm/xe: Add a new memory directory
>>>>>>>>>> under tile
>>>>>>>>>>
>>>>>>>>>> Add a new memory directory under /device/tile<n> and move
>>>>>>>>>> physical_vram_size attribute to the new directory.
>>>>>>>>>>
>>>>>>>>>> New hierarchy:
>>>>>>>>>>
>>>>>>>>>> /device/tile<n>/memory/physical_vram_size_bytes
>>>>>>>>>>
>>>>>>>>>> v2: Fix heading typo (Riana)
>>>>>>>>>> Fix cleanup error on unload/reload cycle
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Sujaritha Sundaresan
>>>>>>>>>> <sujaritha.sundaresan at intel.com>
>>>>>>>>>> ---
>>>>>>>>>> drivers/gpu/drm/xe/xe_tile_sysfs.c | 15 ++++++++++++---
>>>>>>>>>> 1 file changed, 12 insertions(+), 3 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/xe/xe_tile_sysfs.c
>>>>>>>>>> b/drivers/gpu/drm/xe/xe_tile_sysfs.c
>>>>>>>>>> index 16376607c68f..e8ce4d9270e6 100644
>>>>>>>>>> --- a/drivers/gpu/drm/xe/xe_tile_sysfs.c
>>>>>>>>>> +++ b/drivers/gpu/drm/xe/xe_tile_sysfs.c
>>>>>>>>>> @@ -24,7 +24,8 @@ static ssize_t
>>>>>>>>>> physical_vram_size_bytes_show(struct device *kdev, struct
>>>>>>>>>> device_attribute *attr,
>>>>>>>>>> char *buf)
>>>>>>>>>> {
>>>>>>>>>> - struct xe_tile *tile = kobj_to_tile(&kdev->kobj);
>>>>>>>>>> + struct kobject *kobj = &kdev->kobj;
>>>>>>>>>> + struct xe_tile *tile = kobj_to_tile(kobj->parent);
>>>>>>>>>>
>>>>>>>>>> return sysfs_emit(buf, "%llu\n",
>>>>>>>>>> tile->mem.vram.actual_physical_size);
>>>>>>>>>> }
>>>>>>>>>> @@ -38,7 +39,7 @@ static void tile_sysfs_fini(struct drm_device *drm, void *arg)
>>>>>>>>>> {
>>>>>>>>>> struct xe_tile *tile = arg;
>>>>>>>>>>
>>>>>>>>>> - kobject_put(tile->sysfs);
>>>>>>>>>> + kobject_del(tile->sysfs);
>>>>>>>>> Why kobject_del instead of kobject_put?
>>>>>>>>> Thanks,
>>>>>>>>> Anshuman Gupta.
>>>>>>>> Hi Anshuman,
>>>>>>>>
>>>>>>>> Basically when sanity checking, after reload we see that we are
>>>>>>>> not doing a proper cleanup.
>>>>>>>>
>>>>>>>> kobject_put will only decrement the ref count and possibly free
>>>>>>>> the kobject.
>>>>>>>>
>>>>>>>> But that is not happening in this case. There is a duplicate
>>>>>>>> remaining of the tile directory.
>>>>>>>>
>>>>>>>> This required a clean unregister of the parent from sysfs hence
>>>>>>>> the use of kobject_del.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Suja
>>>>>>> As a continuation of the above response:
>>>>>>>
>>>>>>> I can probably add a kobject_put call as well, to ensure that we
>>>>>>> also clean up the memory side of things. Will add.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Suja
>>>>>>>
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> void xe_tile_sysfs_init(struct xe_tile *tile)
>>>>>>>>>> @@ -46,6 +47,7 @@ void xe_tile_sysfs_init(struct xe_tile *tile)
>>>>>>>>>> struct xe_device *xe = tile_to_xe(tile);
>>>>>>>>>> struct device *dev = xe->drm.dev;
>>>>>>>>>> struct kobj_tile *kt;
>>>>>>>>>> + struct kobject *kobj;
>>>>>>>>>> int err;
>>>>>>>>>>
>>>>>>>>>> kt = kzalloc(sizeof(*kt), GFP_KERNEL);
>>>>>>>>>> @@ -64,8 +66,15 @@ void xe_tile_sysfs_init(struct xe_tile *tile)
>>>>>>>>>>
>>>>>>>>>> tile->sysfs = &kt->base;
>>>>>>>>>>
>>>>>>>>>> + kobj = kobject_create_and_add("memory", tile->sysfs);
>>>>>>>>>> + if (!kobj) {
>>>>>>>>>> + kobject_put(kobj);
>>>>>> Did you mean kobject_put(tile->sysfs) instead of
>>>>>> kobject_put(kobj)? No kobj has been created by the time you
>>>>>> reach this point.
>>>>>>
>>>>>> Tejas
>>>>> Yup this should be fixed.
>>>> Hi Suja
>>>>
>>>> Removing the tile won't be right, as there are other directories (gt#)
>>>> dependent on it. A simple return with a warning should be enough?
>>>>
>>>> Thanks
>>>> Riana
>>> Sure. We can probably have the original cleanup in fini.
> If you just warn and return, the tile_fini function is never registered, so
> it will never be called on driver unload/reload. So either remove the tile
> with kobject_put() before returning, or don't return and instead check kobj
> before creating files under it, to avoid a crash.
>
> Tejas
Let me test out everything and see which approach will work the best.
Thanks,
Suja
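
For reference, the init-side options raised above can be compared in a
small userspace model. This is only a sketch: struct model_tile, the
on_memory_fail policies and both functions are hypothetical names made up
for illustration, none of this is driver code, and it assumes (as Tejas
describes) that the fini callback is registered only at the end of init.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the tile sysfs state; not driver code. */
struct model_tile {
	bool tile_dir_present;   /* tile<n> directory exists */
	bool memory_dir_present; /* memory/ subdirectory exists */
	bool attr_present;       /* attribute file was created */
	bool fini_registered;    /* cleanup callback was registered */
};

/* The three ways of handling a failed "memory" directory creation. */
enum on_memory_fail {
	WARN_AND_RETURN,         /* warn, return early */
	PUT_TILE_AND_RETURN,     /* drop the tile directory, then return */
	CONTINUE_WITHOUT_MEMORY, /* keep going, skip files under memory/ */
};

static void model_tile_sysfs_init(struct model_tile *t, bool memory_create_ok,
				  enum on_memory_fail policy)
{
	assert(t != NULL);

	t->tile_dir_present = true;
	t->memory_dir_present = memory_create_ok;
	t->attr_present = false;
	t->fini_registered = false;

	if (!memory_create_ok) {
		if (policy == WARN_AND_RETURN)
			return; /* tile directory stays, nothing will clean it up */
		if (policy == PUT_TILE_AND_RETURN) {
			t->tile_dir_present = false;
			return;
		}
		/* CONTINUE_WITHOUT_MEMORY: fall through to registration */
	}

	/* Only create the attribute when its parent directory exists. */
	if (t->memory_dir_present)
		t->attr_present = true;

	/* Assumption: registration happens last, as in the discussion above. */
	t->fini_registered = true;
}

/* True when a tile directory would survive a driver unload. */
static bool model_leaks_on_unload(const struct model_tile *t)
{
	assert(t != NULL);
	return t->tile_dir_present && !t->fini_registered;
}
```

In this model only the plain warn-and-return path leaves a tile directory
behind with no cleanup registered. The model ignores the gt# directories
that depend on the tile, which is Riana's objection to removing the tile.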
>
>>>>> Thanks.
>>>>>
>>>>> Suja
>>>>>
>>>>>>>>>> + drm_warn(&xe->drm, "%s failed, err: %d\n", __func__, -ENOMEM);
>>>>>>>>>> + return;
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> if (IS_DGFX(xe) && xe->info.platform != XE_DG1 &&
>>>>>>>>>> - sysfs_create_file(tile->sysfs, physical_memsize_attr))
>>>>>>>>>> + sysfs_create_file(kobj, physical_memsize_attr))
>>>>>>>>>> drm_warn(&xe->drm,
>>>>>>>>>> "Sysfs creation to read addr_range per tile failed\n");
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> 2.25.1
>> Hi all,
>>
>> So after all of this discussion, here are the final results after testing the
>> cleanup.
>>
>> Regardless of what we do in the init function, it looks like without the
>> two-step kobject_del and kobject_put cleanup in the sysfs_fini function, we
>> see an error on reload: the tile directory is not fully cleaned up and
>> duplicate creation is reported.
>>
>> The solution seems to be just to add a kobject_del before kobject_put in the
>> fini function.
>>
>> On the init side I only have warnings and simple returns.
>>
>> If the reviewers can reach a consensus on this, I can float the next version
>> accordingly.
>>
>> Thanks,
>>
>> Suja
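
The two-step cleanup described above can be illustrated with a small
userspace model. This is a sketch, not kernel code: struct model_kobj and
the model_* helpers are hypothetical stand-ins for the kobject API. It
assumes one possible explanation that the thread does not confirm: the new
"memory" child kobject holds a reference on its parent tile kobject, so a
single kobject_put() in fini does not drop the count to zero and the
directory stays registered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a kobject; NOT the kernel struct. */
struct model_kobj {
	int refcount;  /* models the embedded reference count */
	bool in_sysfs; /* models whether the directory is still registered */
	bool released; /* set once the last reference is dropped */
};

/* Models creating and registering the object with one initial reference. */
static void model_init_and_add(struct model_kobj *k)
{
	k->refcount = 1;
	k->in_sysfs = true;
	k->released = false;
}

/* Models kobject_get(): another user (e.g. a child) takes a reference. */
static void model_get(struct model_kobj *k)
{
	k->refcount++;
}

/* Models kobject_del(): unregister from sysfs, reference count untouched. */
static void model_del(struct model_kobj *k)
{
	k->in_sysfs = false;
}

/* Models kobject_put(): drop one reference, release on the last one. */
static void model_put(struct model_kobj *k)
{
	assert(k->refcount > 0);
	if (--k->refcount == 0) {
		k->in_sysfs = false; /* final release also unregisters */
		k->released = true;
	}
}

/* Cleanup with kobject_put only, as before the patch. */
static void model_fini_put_only(struct model_kobj *tile)
{
	model_put(tile);
}

/* Cleanup with kobject_del followed by kobject_put, as proposed above. */
static void model_fini_del_then_put(struct model_kobj *tile)
{
	model_del(tile);
	model_put(tile);
}
```

With an extra reference outstanding, the put-only path leaves the modeled
directory registered, which would match the duplicate-creation error on
reload. The del-then-put path unregisters it regardless of the count.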