[Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs
Andi Shyti
andi.shyti at linux.intel.com
Sun Apr 24 22:36:23 UTC 2022
Hi Andrzej and Ashutosh,
> > > > b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > > > index 937b2e1a305e..4c72b4f983a6 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > > > @@ -222,6 +222,9 @@ struct intel_gt {
> > > > } mocs;
> > > > struct intel_pxp pxp;
> > > > +
> > > > + /* gt/gtN sysfs */
> > > > + struct kobject sysfs_gtn;
> > > If you put kobject as a part of intel_gt what assures you that lifetime of
> > > kobject is shorter than intel_gt? Ie its refcounter is 0 on removal of
> > > intel_gt?
> > Because we are explicitly doing a kobject_put() in
> > intel_gt_sysfs_unregister(). Which is exactly what we are *not* doing in
> > the previous code.
> >
> > Let me explain a bit about the previous code (but feel free to skip since
> > the patch should speak for itself):
> > * Previously we kzalloc a 'struct kobj_gt'
> > * But we don't save a pointer to the 'struct kobj_gt' so we don't have the
> > pointer to the kobject to be able to do a kobject_put() on it later
> > * Therefore we need to store the pointer in 'struct intel_gt'
> > * But if we have to put the pointer in 'struct intel_gt' we might as well
> > put the kobject as part of 'struct intel_gt' and that also removes the
> > need to have a 'struct kobj_gt' (kobj_to_gt() can just use container_of()
> > to get gt from kobj).
> > * So I think this patch is simpler/cleaner than the original code if you take
> > the requirement for kobject_put() into account.
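For reference, the embedding described in the list above can be sketched in plain userspace C. This is only an illustration, not the driver code: the `*_model` structs are hypothetical stand-ins for `struct kobject` and `struct intel_gt`, and the `container_of()` macro here is a simplified version without the kernel's type checking.

```c
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, stripped-down model of struct kobject. */
struct kobject_model {
	int refcount;
};

/* Hypothetical model of struct intel_gt with the kobject embedded. */
struct intel_gt_model {
	int id;
	struct kobject_model sysfs_gtn;	/* gt/gtN sysfs */
};

/* Models kobj_to_gt(): recover the gt from a pointer to its embedded kobject. */
static struct intel_gt_model *kobj_to_gt(struct kobject_model *kobj)
{
	return container_of(kobj, struct intel_gt_model, sysfs_gtn);
}
```

With the kobject embedded, no separate allocation or back-pointer is needed, but the kobject's memory now lives and dies with the containing gt.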
This is my oversight. I intended to fix this and actually had some
fixes in progress, but because the patch took so long to get in, I
completely forgot about it (Sujaritha was actually the first to
point this out).
Thanks, Ashutosh, for taking this.
> I fully agree that previous code is incorrect but I am not convinced current
> code is correct.
> If some objects are kref-counted it means usually they can have multiple
> concurrent users and kobject_put does not work as traditional
> destructor/cleanup/unregister.
> So in this particular case after calling kobject_init_and_add sysfs core can
> get multiple references on the object. Later, during driver unregistration
> kobject_put is called, but if the object is still in use by sysfs core, the
> object will not be destroyed/released. If the driver unregistration
> continues memory will be freed, leaving sysfs-core (or other users) with
> dangling pointers. Unless there is some additional synchronization mechanism
> I am not aware of.
Thanks, Andrzej, for summarizing this; what you describe is actually
what happens. I had developed a similar solution and hit a wrong
pointer reference.
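The lifetime problem can be modelled in a few lines of userspace C. This is a simplified sketch under stated assumptions, not kernel code: `kobj_model` and its helpers are hypothetical stand-ins for `struct kobject`, `kobject_get()` and `kobject_put()`, and "sysfs core holding a reference" is modelled by extra get calls.

```c
#include <stdbool.h>

/* Hypothetical model of a refcounted kobject with a release callback. */
struct kobj_model {
	int refcount;
	bool released;
	void (*release)(struct kobj_model *kobj);
};

static void mark_released(struct kobj_model *kobj)
{
	kobj->released = true;
}

/* The initial reference belongs to whoever initialized the object. */
static void kobj_model_init(struct kobj_model *kobj)
{
	kobj->refcount = 1;
	kobj->released = false;
	kobj->release = mark_released;
}

static void kobj_model_get(struct kobj_model *kobj)
{
	kobj->refcount++;
}

/* release() runs only when the last reference is dropped. */
static void kobj_model_put(struct kobj_model *kobj)
{
	if (--kobj->refcount == 0)
		kobj->release(kobj);
}

/*
 * Returns true if the driver's single put was enough to release the
 * object, given extra_users references still held by other users.
 */
static bool driver_put_releases(int extra_users)
{
	struct kobj_model kobj;
	int i;

	kobj_model_init(&kobj);
	for (i = 0; i < extra_users; i++)
		kobj_model_get(&kobj);

	kobj_model_put(&kobj);	/* the put done at unregister time */
	return kobj.released;
}
```

If any other user still holds a reference, the put at unregister time does not trigger release; freeing the containing structure at that point leaves that user with a dangling pointer.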
Thanks,
Andi