[Intel-xe] [v2 1/2] drm/xe: Add a new memory directory under tile

Upadhyay, Tejas tejas.upadhyay at intel.com
Thu Dec 7 08:30:17 UTC 2023



> -----Original Message-----
> From: Sundaresan, Sujaritha <sujaritha.sundaresan at intel.com>
> Sent: Thursday, December 7, 2023 12:58 PM
> To: Tauro, Riana <riana.tauro at intel.com>; Upadhyay, Tejas
> <tejas.upadhyay at intel.com>; Gupta, Anshuman
> <anshuman.gupta at intel.com>; intel-xe at lists.freedesktop.org
> Cc: Vivi, Rodrigo <rodrigo.vivi at intel.com>
> Subject: Re: [Intel-xe] [v2 1/2] drm/xe: Add a new memory directory under
> tile
> 
> 
> On 12/7/2023 12:08 PM, Sundaresan, Sujaritha wrote:
> >
> > On 12/7/2023 11:36 AM, Riana Tauro wrote:
> >>
> >>
> >> On 12/7/2023 10:51 AM, Sundaresan, Sujaritha wrote:
> >>>
> >>> On 12/7/2023 10:42 AM, Upadhyay, Tejas wrote:
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf
> >>>>> Of Sundaresan, Sujaritha
> >>>>> Sent: Wednesday, December 6, 2023 5:44 PM
> >>>>> To: Gupta, Anshuman <anshuman.gupta at intel.com>; intel-
> >>>>> xe at lists.freedesktop.org
> >>>>> Cc: Vivi, Rodrigo <rodrigo.vivi at intel.com>
> >>>>> Subject: Re: [Intel-xe] [v2 1/2] drm/xe: Add a new memory
> >>>>> directory under tile
> >>>>>
> >>>>>
> >>>>> On 12/6/2023 5:38 PM, Sundaresan, Sujaritha wrote:
> >>>>>> On 12/6/2023 5:23 PM, Gupta, Anshuman wrote:
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On
> >>>>>>>> Behalf Of Sujaritha Sundaresan
> >>>>>>>> Sent: Wednesday, December 6, 2023 10:18 AM
> >>>>>>>> To: intel-xe at lists.freedesktop.org
> >>>>>>>> Cc: Sundaresan, Sujaritha <sujaritha.sundaresan at intel.com>;
> >>>>>>>> Vivi, Rodrigo <rodrigo.vivi at intel.com>
> >>>>>>>> Subject: [Intel-xe] [v2 1/2] drm/xe: Add a new memory directory
> >>>>>>>> under tile
> >>>>>>>>
> >>>>>>>> Add a new memory directory under /device/tile<n> and move
> >>>>>>>> physical_vram_size attribute to the new directory.
> >>>>>>>>
> >>>>>>>> New hierarchy:
> >>>>>>>>
> >>>>>>>> /device/tile<n>/memory/physical_vram_size_bytes
> >>>>>>>>
> >>>>>>>> v2: Fix heading typo (Riana)
> >>>>>>>>       Fix cleanup error on unload/reload cycle
> >>>>>>>>
> >>>>>>>> Signed-off-by: Sujaritha Sundaresan
> >>>>>>>> <sujaritha.sundaresan at intel.com>
> >>>>>>>> ---
> >>>>>>>>    drivers/gpu/drm/xe/xe_tile_sysfs.c | 15 ++++++++++++---
> >>>>>>>>    1 file changed, 12 insertions(+), 3 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/gpu/drm/xe/xe_tile_sysfs.c
> >>>>>>>> b/drivers/gpu/drm/xe/xe_tile_sysfs.c
> >>>>>>>> index 16376607c68f..e8ce4d9270e6 100644
> >>>>>>>> --- a/drivers/gpu/drm/xe/xe_tile_sysfs.c
> >>>>>>>> +++ b/drivers/gpu/drm/xe/xe_tile_sysfs.c
> >>>>>>>> @@ -24,7 +24,8 @@ static ssize_t
> >>>>>>>>    physical_vram_size_bytes_show(struct device *kdev, struct
> >>>>>>>> device_attribute *attr,
> >>>>>>>>                      char *buf)
> >>>>>>>>    {
> >>>>>>>> -    struct xe_tile *tile = kobj_to_tile(&kdev->kobj);
> >>>>>>>> +    struct kobject *kobj = &kdev->kobj;
> >>>>>>>> +    struct xe_tile *tile = kobj_to_tile(kobj->parent);
> >>>>>>>>
> >>>>>>>>        return sysfs_emit(buf, "%llu\n",
> >>>>>>>> tile->mem.vram.actual_physical_size);
> >>>>>>>>    }
> >>>>>>>> @@ -38,7 +39,7 @@ static void tile_sysfs_fini(struct drm_device
> >>>>>>>> *drm, void
> >>>>>>>> *arg)  {
> >>>>>>>>        struct xe_tile *tile = arg;
> >>>>>>>>
> >>>>>>>> -    kobject_put(tile->sysfs);
> >>>>>>>> +    kobject_del(tile->sysfs);
> >>>>>>> Why kobject_del instead of kobject_put?
> >>>>>>> Thanks,
> >>>>>>> Anshuman Gupta.
> >>>>>> Hi Anshuman,
> >>>>>>
> >>>>>> Basically when sanity checking, after reload we see that we are
> >>>>>> not doing a proper cleanup.
> >>>>>>
> >>>>>> kobject_put will only decrement the ref count and possibly free
> >>>>>> the kobject.
> >>>>>>
> >>>>>> But that is not happening in this case. There is a duplicate
> >>>>>> remaining of the tile directory.
> >>>>>>
> >>>>>> This required a clean unregister of the parent from sysfs hence
> >>>>>> the use of kobject_del.
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Suja
> >>>>> As a continuation of the above response:
> >>>>>
> >>>>> I can probably add a kobject_put call as well, to ensure that we
> >>>>> are also cleaning up the memory side of things. Will add.
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Suja
> >>>>>
> >>>>>>>>    }
> >>>>>>>>
> >>>>>>>>    void xe_tile_sysfs_init(struct xe_tile *tile) @@ -46,6 +47,7
> >>>>>>>> @@ void xe_tile_sysfs_init(struct xe_tile *tile)
> >>>>>>>>        struct xe_device *xe = tile_to_xe(tile);
> >>>>>>>>        struct device *dev = xe->drm.dev;
> >>>>>>>>        struct kobj_tile *kt;
> >>>>>>>> +    struct kobject *kobj;
> >>>>>>>>        int err;
> >>>>>>>>
> >>>>>>>>        kt = kzalloc(sizeof(*kt), GFP_KERNEL); @@ -64,8 +66,15
> >>>>>>>> @@ void xe_tile_sysfs_init(struct xe_tile *tile)
> >>>>>>>>
> >>>>>>>>        tile->sysfs = &kt->base;
> >>>>>>>>
> >>>>>>>> +    kobj = kobject_create_and_add("memory", tile->sysfs);
> >>>>>>>> +    if (!kobj) {
> >>>>>>>> +        kobject_put(kobj);
> >>>> Do you mean to put kobject_put(tile->sysfs) instead of
> >>>> kobject_put(kobj) ? as there was no Kobj created by the time you
> >>>> reached here!
> >>>>
> >>>> Tejas
> >>>
> >>> Yup this should be fixed.
> >> Hi Suja
> >>
> >> Removing tile won't be right, as there are other directories (gt#)
> >> dependent on it. Simple return should be good with a warn?
> >>
> >> Thanks
> >> Riana
> > Sure. We can probably have the original cleanup in fini.

If you only warn and return, tile_sysfs_fini is never registered, so it will never be called on driver unload/reload. Either remove the tile with kobject_put(tile->sysfs) before returning, or don't return and instead check kobj before creating files under it, to avoid a crash.

Tejas
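
To make the second option above concrete, here is a minimal userspace sketch of the control flow only. It is not kernel code: mock_kobj, mock_tile, mock_create_memory_dir and mock_tile_sysfs_init are hypothetical stand-ins, and the fini_registered flag merely models "the fini action got registered".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel objects; not the real kobject API. */
struct mock_kobj {
	const char *name;
};

struct mock_tile {
	struct mock_kobj *sysfs;   /* models the tile<n> directory */
	struct mock_kobj *memory;  /* models the memory subdirectory, may be NULL */
	bool fini_registered;      /* models the fini action being registered */
	bool vram_file_created;    /* models the attribute file under memory/ */
};

static struct mock_kobj tile_dir = { "tile0" };
static struct mock_kobj memory_dir = { "memory" };

/* Models a create-and-add call that returns NULL on failure. */
static struct mock_kobj *mock_create_memory_dir(bool fail)
{
	return fail ? NULL : &memory_dir;
}

static void mock_tile_sysfs_init(struct mock_tile *tile, bool fail_memory)
{
	tile->sysfs = &tile_dir;

	tile->memory = mock_create_memory_dir(fail_memory);
	if (!tile->memory)
		fprintf(stderr, "warn: memory directory creation failed\n");

	/* Only create files under the subdirectory when it actually exists. */
	if (tile->memory)
		tile->vram_file_created = true;

	/* Reached on both paths, so cleanup still runs on unload/reload. */
	tile->fini_registered = true;
}
```

The point is just that the failure path warns and falls through rather than returning early, so the cleanup registration at the end of init is never skipped.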
 
> >>>
> >>> Thanks.
> >>>
> >>> Suja
> >>>
> >>>>
> >>>>>>>> +        drm_warn(&xe->drm, "%s failed, err: %d\n", __func__,
> >>>>>>>> +             -ENOMEM);
> >>>>>>>> +        return;
> >>>>>>>> +    }
> >>>>>>>> +
> >>>>>>>>        if (IS_DGFX(xe) && xe->info.platform != XE_DG1 &&
> >>>>>>>> -        sysfs_create_file(tile->sysfs, physical_memsize_attr))
> >>>>>>>> +        sysfs_create_file(kobj, physical_memsize_attr))
> >>>>>>>>            drm_warn(&xe->drm,
> >>>>>>>>                 "Sysfs creation to read addr_range per tile
> >>>>>>>> failed\n");
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> 2.25.1
> 
> Hi all,
> 
> So after all of this discussion, here are the final results after testing
> the cleanup.
> 
> Regardless of what we do in the init function, it looks like without the
> two-step kobject_del and kobject_put cleanup in the sysfs_fini function we
> will see an error on reload, with the tile directory not being fully
> cleaned up and reporting duplicate creation.
> 
> The solution seems to be just to add a kobject_del before kobject_put in
> the fini function.
> 
> On the init side I only have warnings and simple returns.
> 
> If there can be a consensus about this from the reviewers, I can float the
> next version accordingly.
> 
> Thanks,
> 
> Suja
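
One possible explanation for the behaviour described above, sketched as a toy userspace refcount model. The assumption here is that adding the "memory" child takes a reference on the tile kobject, so a single put on the tile does not drop its count to zero. toy_kobj, toy_add, toy_del and toy_put are hypothetical names for illustration, not the kernel's kobject implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these are not the kernel's kobject structures. */
struct toy_kobj {
	int refcount;
	bool in_sysfs;            /* directory still visible */
	struct toy_kobj *parent;
};

static void toy_put(struct toy_kobj *k);

/* Models a "del": unlink from sysfs and drop the reference on the parent. */
static void toy_del(struct toy_kobj *k)
{
	struct toy_kobj *parent = k->parent;

	if (!k->in_sysfs)
		return;
	k->in_sysfs = false;
	k->parent = NULL;
	if (parent)
		toy_put(parent);
}

/* Models a "put": drop one reference; the last one also unlinks. */
static void toy_put(struct toy_kobj *k)
{
	if (--k->refcount == 0)
		toy_del(k);
}

/* Models an "add": the new object pins its parent with a reference. */
static void toy_add(struct toy_kobj *k, struct toy_kobj *parent)
{
	k->refcount = 1;
	k->in_sysfs = true;
	k->parent = parent;
	if (parent)
		parent->refcount++;
}
```

In this model, a put alone on a tile that has a live child leaves the tile directory visible, while del followed by put removes the directory but still leaves one reference outstanding. If the real code behaves the same way, the memory kobject would also need its own put in fini for the tile to be freed.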


