[Intel-xe] [v2 1/2] drm/xe: Add a new memory directory under tile

Sundaresan, Sujaritha sujaritha.sundaresan at intel.com
Thu Dec 7 13:04:07 UTC 2023


On 12/7/2023 3:25 PM, Sundaresan, Sujaritha wrote:
>
> On 12/7/2023 2:00 PM, Upadhyay, Tejas wrote:
>>
>>> -----Original Message-----
>>> From: Sundaresan, Sujaritha <sujaritha.sundaresan at intel.com>
>>> Sent: Thursday, December 7, 2023 12:58 PM
>>> To: Tauro, Riana <riana.tauro at intel.com>; Upadhyay, Tejas
>>> <tejas.upadhyay at intel.com>; Gupta, Anshuman
>>> <anshuman.gupta at intel.com>; intel-xe at lists.freedesktop.org
>>> Cc: Vivi, Rodrigo <rodrigo.vivi at intel.com>
>>> Subject: Re: [Intel-xe] [v2 1/2] drm/xe: Add a new memory directory under tile
>>>
>>>
>>> On 12/7/2023 12:08 PM, Sundaresan, Sujaritha wrote:
>>>> On 12/7/2023 11:36 AM, Riana Tauro wrote:
>>>>>
>>>>> On 12/7/2023 10:51 AM, Sundaresan, Sujaritha wrote:
>>>>>> On 12/7/2023 10:42 AM, Upadhyay, Tejas wrote:
>>>>>>>> -----Original Message-----
>>>>>>>> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf
>>>>>>>> Of Sundaresan, Sujaritha
>>>>>>>> Sent: Wednesday, December 6, 2023 5:44 PM
>>>>>>>> To: Gupta, Anshuman <anshuman.gupta at intel.com>;
>>>>>>>> intel-xe at lists.freedesktop.org
>>>>>>>> Cc: Vivi, Rodrigo <rodrigo.vivi at intel.com>
>>>>>>>> Subject: Re: [Intel-xe] [v2 1/2] drm/xe: Add a new memory
>>>>>>>> directory under tile
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/6/2023 5:38 PM, Sundaresan, Sujaritha wrote:
>>>>>>>>> On 12/6/2023 5:23 PM, Gupta, Anshuman wrote:
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On
>>>>>>>>>>> Behalf Of Sujaritha Sundaresan
>>>>>>>>>>> Sent: Wednesday, December 6, 2023 10:18 AM
>>>>>>>>>>> To: intel-xe at lists.freedesktop.org
>>>>>>>>>>> Cc: Sundaresan, Sujaritha <sujaritha.sundaresan at intel.com>;
>>>>>>>>>>> Vivi, Rodrigo <rodrigo.vivi at intel.com>
>>>>>>>>>>> Subject: [Intel-xe] [v2 1/2] drm/xe: Add a new memory directory
>>>>>>>>>>> under tile
>>>>>>>>>>>
>>>>>>>>>>> Add a new memory directory under /device/tile<n> and move
>>>>>>>>>>> physical_vram_size attribute to the new directory.
>>>>>>>>>>>
>>>>>>>>>>> New hierarchy:
>>>>>>>>>>>
>>>>>>>>>>> /device/tile<n>/memory/physical_vram_size_bytes
>>>>>>>>>>>
>>>>>>>>>>> v2: Fix heading typo (Riana)
>>>>>>>>>>>        Fix cleanup error on unload/reload cycle
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Sujaritha Sundaresan
>>>>>>>>>>> <sujaritha.sundaresan at intel.com>
>>>>>>>>>>> ---
>>>>>>>>>>>     drivers/gpu/drm/xe/xe_tile_sysfs.c | 15 ++++++++++++---
>>>>>>>>>>>     1 file changed, 12 insertions(+), 3 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/gpu/drm/xe/xe_tile_sysfs.c
>>>>>>>>>>> b/drivers/gpu/drm/xe/xe_tile_sysfs.c
>>>>>>>>>>> index 16376607c68f..e8ce4d9270e6 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/xe/xe_tile_sysfs.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/xe/xe_tile_sysfs.c
>>>>>>>>>>> @@ -24,7 +24,8 @@ static ssize_t
>>>>>>>>>>>     physical_vram_size_bytes_show(struct device *kdev, struct device_attribute *attr,
>>>>>>>>>>>                       char *buf)
>>>>>>>>>>>     {
>>>>>>>>>>> -    struct xe_tile *tile = kobj_to_tile(&kdev->kobj);
>>>>>>>>>>> +    struct kobject *kobj = &kdev->kobj;
>>>>>>>>>>> +    struct xe_tile *tile = kobj_to_tile(kobj->parent);
>>>>>>>>>>>
>>>>>>>>>>>         return sysfs_emit(buf, "%llu\n", tile->mem.vram.actual_physical_size);
>>>>>>>>>>>     }
>>>>>>>>>>> @@ -38,7 +39,7 @@ static void tile_sysfs_fini(struct drm_device *drm, void *arg)
>>>>>>>>>>>     {
>>>>>>>>>>>         struct xe_tile *tile = arg;
>>>>>>>>>>>
>>>>>>>>>>> -    kobject_put(tile->sysfs);
>>>>>>>>>>> +    kobject_del(tile->sysfs);
>>>>>>>>>> Why kobject_del instead of kobject_put?
>>>>>>>>>> Thanks,
>>>>>>>>>> Anshuman Gupta.
>>>>>>>>> Hi Anshuman,
>>>>>>>>>
>>>>>>>>> Basically when sanity checking, after reload we see that we are
>>>>>>>>> not doing a proper cleanup.
>>>>>>>>>
>>>>>>>>> kobject_put will only decrement the ref count and possibly free
>>>>>>>>> the kobject.
>>>>>>>>>
>>>>>>>>> But that is not happening in this case. There is a duplicate
>>>>>>>>> remaining of the tile directory.
>>>>>>>>>
>>>>>>>>> This required a clean unregister of the parent from sysfs hence
>>>>>>>>> the use of kobject_del.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Suja
>>>>>>>> As a continuation of the above response:
>>>>>>>>
>>>>>>>> I can probably add a kobject_put call as well to ensure that we
>>>>>>>> are cleaning up the memory side of things too. Will add.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Suja
>>>>>>>>
>>>>>>>>>>>     }
>>>>>>>>>>>
>>>>>>>>>>>     void xe_tile_sysfs_init(struct xe_tile *tile)
>>>>>>>>>>> @@ -46,6 +47,7 @@ void xe_tile_sysfs_init(struct xe_tile *tile)
>>>>>>>>>>>         struct xe_device *xe = tile_to_xe(tile);
>>>>>>>>>>>         struct device *dev = xe->drm.dev;
>>>>>>>>>>>         struct kobj_tile *kt;
>>>>>>>>>>> +    struct kobject *kobj;
>>>>>>>>>>>         int err;
>>>>>>>>>>>
>>>>>>>>>>>         kt = kzalloc(sizeof(*kt), GFP_KERNEL);
>>>>>>>>>>> @@ -64,8 +66,15 @@ void xe_tile_sysfs_init(struct xe_tile *tile)
>>>>>>>>>>>
>>>>>>>>>>>         tile->sysfs = &kt->base;
>>>>>>>>>>>
>>>>>>>>>>> +    kobj = kobject_create_and_add("memory", tile->sysfs);
>>>>>>>>>>> +    if (!kobj) {
>>>>>>>>>>> +        kobject_put(kobj);
>>>>>>> Did you mean kobject_put(tile->sysfs) instead of
>>>>>>> kobject_put(kobj)? No kobj has been created by the time you
>>>>>>> reach here!
>>>>>>>
>>>>>>> Tejas
>>>>>> Yup this should be fixed.
>>>>> Hi Suja
>>>>>
>>>>> Removing tile won't be right, as there are other directories (gt#)
>>>>> dependent on it. Simple return should be good with a warn?
>>>>>
>>>>> Thanks
>>>>> Riana
>>>> Sure. We can probably have the original cleanup in fini.
>> If you just warn and return, you will never register the
>> tile_fini function, so it will never be called on driver
>> unload/reload. So either remove the tile using kobject_put() before
>> returning, or don’t return and check kobj before creating files under
>> it to avoid a crash.
>>
>> Tejas
>
> Let me test out everything and see which approach will work the best.
>
> Thanks,
>
> Suja

After a round of debugging with Tejas, I was able to fix the cleanup cycle.
I will send out v3 ASAP.

Thanks,

Suja

>
>>>>>> Thanks.
>>>>>>
>>>>>> Suja
>>>>>>
>>>>>>>>>>> +        drm_warn(&xe->drm, "%s failed, err: %d\n", __func__, -ENOMEM);
>>>>>>>>>>> +        return;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>>         if (IS_DGFX(xe) && xe->info.platform != XE_DG1 &&
>>>>>>>>>>> -        sysfs_create_file(tile->sysfs, physical_memsize_attr))
>>>>>>>>>>> +        sysfs_create_file(kobj, physical_memsize_attr))
>>>>>>>>>>>             drm_warn(&xe->drm,
>>>>>>>>>>>                  "Sysfs creation to read addr_range per tile failed\n");
>>>>>>>>>>>
>>>>>>>>>>> -- 
>>>>>>>>>>> 2.25.1
>>> Hi all,
>>>
>>> So after all of this discussion, here are the final results after testing
>>> the cleanup.
>>>
>>> Regardless of anything we do in the init function, it looks like without
>>> the two-step kobject_del and kobject_put cleanup in the sysfs_fini function
>>> we will see an error on reload, with the tile directory not being fully
>>> cleaned up and reporting duplicate creation.
>>>
>>> The solution seems to be just to add a kobject_del before kobject_put in
>>> the fini function.
>>>
>>> I only have warnings and simple returns on the init side.
>>>
>>> If the reviewers can reach a consensus on this, I can float the next
>>> version accordingly.
>>>
>>> Thanks,
>>>
>>> Suja
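
For anyone following the thread, here is a rough userspace sketch of why
kobject_put() alone can leave the tile directory behind once a child
("memory") kobject exists. It is a toy model, not kernel code: the toy_*
names and tile_dir_survives_unload() are hypothetical, and the only
behaviour it assumes is that adding a child takes a reference on its
parent, and that the directory goes away either on an explicit del or when
the last reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct kobject (hypothetical, userspace only). */
struct toy_kobj {
	int refcount;            /* models the kref count */
	bool in_sysfs;           /* true while the directory is visible */
	struct toy_kobj *parent;
};

static void toy_put(struct toy_kobj *k);

/* Models init + add: the new object pins its parent with a reference. */
static void toy_init_and_add(struct toy_kobj *k, struct toy_kobj *parent)
{
	k->refcount = 1;
	k->in_sysfs = true;
	k->parent = parent;
	if (parent)
		parent->refcount++;
}

/* Models an explicit del: unlink the directory, drop the parent ref. */
static void toy_del(struct toy_kobj *k)
{
	struct toy_kobj *parent = k->parent;

	if (!k->in_sysfs)
		return;
	k->in_sysfs = false;
	k->parent = NULL;
	if (parent)
		toy_put(parent);
}

/* Models put: the directory is only removed when the count hits zero. */
static void toy_put(struct toy_kobj *k)
{
	assert(k->refcount > 0);
	if (--k->refcount == 0)
		toy_del(k);      /* automatic cleanup on last reference */
}

/*
 * Simulate driver unload for a tile that has a "memory" child.
 * Returns true if the tile directory is still visible afterwards,
 * i.e. the next load would hit a duplicate directory.
 */
bool tile_dir_survives_unload(bool call_del_first)
{
	struct toy_kobj tile, memory;

	toy_init_and_add(&tile, NULL);
	toy_init_and_add(&memory, &tile);   /* tile refcount is now 2 */

	if (call_del_first)
		toy_del(&tile);             /* explicit unlink */
	toy_put(&tile);                     /* 2 -> 1, never reaches 0 */

	return tile.in_sysfs;
}
```

In this model, tile_dir_survives_unload(false) reports the directory as
still present, while tile_dir_survives_unload(true) reports it gone. The
child still holds its reference in both cases, so the model also suggests
the "memory" kobject needs its own put somewhere for the memory to be
released.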

