[PATCH 0/3] Use implicit kref infra

Thu Sep 3 01:57:49 UTC 2020

> 2020年9月2日 22:50，Tuikov, Luben <Luben.Tuikov at amd.com> 写道：
> 
> On 2020-09-02 00:43, Pan, Xinhui wrote:
>> 
>> 
>>> 2020年9月2日 11:46，Tuikov, Luben <Luben.Tuikov at amd.com> 写道：
>>> 
>>> On 2020-09-01 21:42, Pan, Xinhui wrote:
>>>> If you take a look at the below function, you should not use driver's release to free adev. As dev is embedded in adev.
>>> 
>>> Do you mean "look at the function below", using "below" as an adverb?
>>> "below" is not an adjective.
>>> 
>>> I know dev is embedded in adev--I did that patchset.
>>> 
>>>> 
>>>> 809 static void drm_dev_release(struct kref *ref)
>>>> 810 {
>>>> 811         struct drm_device *dev = container_of(ref, struct drm_device, ref);
>>>> 812        
>>>> 813         if (dev->driver->release)
>>>> 814                 dev->driver->release(dev);
>>>> 815 
>>>> 816         drm_managed_release(dev);
>>>> 817 
>>>> 818         kfree(dev->managed.final_kfree);
>>>> 819 }
>>> 
>>> That's simple--this comes from change c6603c740e0e3
>>> and it should be reverted. Simple as that.
>>> 
>>> The version before this change was absolutely correct:
>>> 
>>> static void drm_dev_release(struct kref *ref)
>>> {
>>> 	if (dev->driver->release)
>>> 		dev->driver->release(dev);
>>> 	else
>>> 		drm_dev_fini(dev);
>>> }
>>> 
>>> Meaning, "the kref is now 0"--> if the driver
>>> has a release, call it, else use our own.
>>> But note that nothing can be assumed after this point,
>>> about the existence of "dev".
>>> 
>>> It is exactly because struct drm_device is statically
>>> embedded into a container, struct amdgpu_device,
>>> that this change above should be reverted.
>>> 
>>> This is very similar to how fops has open/release
>>> but no close. That is, the "release" is called
>>> only when the last kref is released, i.e. when
>>> kref goes from non-zero to zero.
>>> 
>>> This uses the kref infrastructure which has been
>>> around for about 20 years in the Linux kernel.
>>> 
>>> I suggest reading the comments
>>> in drm_dev.c mostly, "DOC: driver instance overview"
>>> starting at line 240 onwards. This is right above
>>> drm_put_dev(). There is actually an example of a driver
>>> in the comment. Also the comment to drm_dev_init().
>>> 
>>> Now, take a look at this:
>>> 
>>> /**
>>> * drm_dev_put - Drop reference of a DRM device
>>> * @dev: device to drop reference of or NULL
>>> *
>>> * This decreases the ref-count of @dev by one. The device is destroyed if the
>>> * ref-count drops to zero.
>>> */
>>> void drm_dev_put(struct drm_device *dev)
>>> {
>>>       if (dev)
>>>               kref_put(&dev->ref, drm_dev_release);
>>> }
>>> EXPORT_SYMBOL(drm_dev_put);
>>> 
>>> Two things:
>>> 
>>> 1. It is us, who kzalloc the amdgpu device, which contains
>>> the drm_device (you'll see this discussed in the reading
>>> material I pointed to above). We do this because we're
>>> probing the PCI device whether we'll work it it or not.
>>> 
>> 
>> that is true.
> 
> Of course it's true--good morning!
> 
>> My understanding of the drm core code is like something below.
> 
> Let me stop you right there--just read the documentation I pointed
> to you at.
> 
>> struct B { 
>> 	strcut A 
>> }
>> we initialize A firstly and initialize B in the end. But destroy B firstly and destory A in the end.
> 
> No!
> B, which is the amdgpu_device struct "exists" before A, which is the DRM struct.
> This is why DRM recommends to _embed_ it into the driver's own device struct,
> as the documentation I pointed you to at.
> 
I think you are misleading me here.  A pci dev as you said below can act as many roles, a drm dev, a tty dev, etc.
say, struct B{
struct A;
struct TTY;
struct printer;
...
}
but TTY or other members has nothing to do with our discussion.

B of course exists before A. but the code logic is not that. code below is really rare in drm world.
create_B()
{
	init B members
	return create_A()
}
So usually B have more work to do after it initialize A.
then code should like below
create_B()
{
	init B base members
	create_A()
	init B extended members
}

For release part.
release B extended member
release A
release B base member

a good design should not have the so-called extended and base members existing in the release process.
Now have a look at the drm core code.
it expects driver to do release process like below.
release B
cleanup work of A

as long as the cleanup work of A exists, we can not do a pure release of B.

So if you want to follow the ruls of kref, you have to rework the drm core code first. only after that, we can do a pure release of B.

What I am confused is that, kfer sits in drm dev. why adev must be destroyed too when drm dev is going to be destroyed.
adev is not equal to drm dev.
I think struct below is more fit for the logic.
struct adev {
	struct drm * ddev_p = &adev.ddev
	struct type *odev_p  = &adev.odev
	struct drm ddev
	struct type odev
}

> "undone" first, since the DRM layer may finish with a device, but
> the device may still exists with the driver and as well as with PCI.
> This is very VERY common, with kernels, devices, device abstractions,
> device layers: DRM dev <-- amdgpu dev <-- PCI dev.
> 
>> But yes, practice is more complex. 
>> if B has nothing to be destroyed. we can destory A directly, otherwise destroy B firstly.
> 
> I'm sorry, that doesn't make sense. There is no such thing as "destroy directly"
> and "otherwise"--this is absolutely not how this works.
> 
> A good architecture doesn't have if-then-else--it's just a pure single-branch path.

well, there is code below everywhere.
if (fops->release)
	fops->release
else
	default_release

>> 
>> in this case, we can do something below in our release()
>> //some cleanup work of B
>> drm_dev_fini(dev);//destroy A
>> kfree(adev)
>> 
>>> 2. Using the kref infrastructure, when the ref goes to 0,
>>> drm_dev_release is called. And here's the KEY:
>>> Because WE allocated the container, we should free it--after the release
>>> method is called, DRM cannot assume anything about the drm
>>> device or the container. The "release" method is final.
>>> 
>>> We allocate, we free. And we free only when the ref goes to 0.
>>> 
>>> DRM can, in due time, "free" itself of the DRM device and stop
>>> having knowledge of it--that's fine, but as long as the ref
>>> is not 0, the amdgpu device and thus the contained DRM device,
>>> cannot be freed.
>>> 
>>>> 
>>>> You have to make another change something like
>>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>>>> index 13068fdf4331..2aabd2b4c63b 100644
>>>> --- a/drivers/gpu/drm/drm_drv.c
>>>> +++ b/drivers/gpu/drm/drm_drv.c
>>>> @@ -815,7 +815,8 @@ static void drm_dev_release(struct kref *ref)
>>>> 
>>>>       drm_managed_release(dev);
>>>> 
>>>> -       kfree(dev->managed.final_kfree);
>>>> +       if (dev->driver->final_release)
>>>> +               dev->driver->final_release(dev);
>>>> }
>>> 
>>> No. What's this?
>>> There is no such thing as "final" release, nor is there a "partial" release.
>>> When the kref goes to 0, the device disappears. Simple.
>>> If someone is using it, they should kref-get it, and when they're
>>> done with it, they should kref-put it.
>> 
>> I just take an example here. add another release in the end. then no one could touch us. IOW, final_release.
> 
> No, that's horrible.
> 
>> 
>> 
>> A destroy B by a callback, then A destroy itself. It assumes B just free its own resource.
>> but that makes trouble if some resource of A is allocated by B.
>> Because B must take care of these common resource shared between A and B.
> 
> No, that's horrible.
> 
>> 
>> yes, that logical is more complex. So I think we can revert drm_dev_release to its previous version.
> 
> drm_dev_release() in its original form, was pure:
> 
> static void drm_dev_release(struct kref *ref)
> {
> 	if (dev->driver->release)
> 		dev->driver->release(dev);
> 	else
> 		drm_dev_fini(dev);
> }
> 
>> 
>>> 
>>> The whole point is that this is done implicitly, via the kref infrastructure.
>>> drm_dev_init() which we call in our PCI probe function, sets the kref to 1--all
>>> as per the documentation I pointed you to above.
>> 
>> I am not taking about kref. what we are discussing is all about the release sequence.
> 
> You need to understand how the kref infrastructure works in the kernel. I've said
> it a million times: it's implicit. The "release sequence" as you like to call it,
> is implicit in the kref infrastructure.
> 
>> 
>> 
>>> 
>>> Another point is that we can do some other stuff in the release
>>> function, notify someone, write some registers, free memory we use
>>> for that PCI device, etc.
>>> 
>>> If the "managed resources" infrastructure wants to stay, it should hook
>>> itself into drm_dev_fini() and into drm_dev_init() or drm_dev_register().
>>> It shouldn't have to be so out-of-place like in patch 2/3 of this series,
>>> where the drmm_add_final_kfree() is smack-dab in the middle of our PCI
>>> discovery function, surrounded on top and bottom by drm_dev_init()
>>> and drm_dev_register(). The "managed resources" infra should be non-invasive
>>> and drivers shouldn't have to change to use it--it should be invisible to them.
>>> Then our kref would just work.
>>> 
>> yep, that make sense. But you need more changes to fix this issue. this patchset is insufficient.
> 
> Or LESS. Less changes. Less is better. Basically revert and redo all this "managed resources".
> 
> Regards,
> Luben
> 
>> 
>> 
>>>> 
>>>> And in the final_release callback we free the dev. But that is a little complex now. so I prefer still using final_kfree.
>>>> Of course we can do some cleanup work in the driver's release callback. BUT no kfree.
>>> 
>>> No! No final_kfree. It's a hack.
>>> 
>>> Read the documentation in drm_drv.c I noted above--it lays out how this happens. Reading is required.
>>> 
>>> Regards,
>>> Luben
>>> 
>>> 
>>>> 
>>>> -----原始邮件-----
>>>> 发件人: "Tuikov, Luben" <Luben.Tuikov at amd.com>
>>>> 日期: 2020年9月2日 星期三 09:07
>>>> 收件人: "amd-gfx at lists.freedesktop.org" <amd-gfx at lists.freedesktop.org>, "dri-devel at lists.freedesktop.org" <dri-devel at lists.freedesktop.org>
>>>> 抄送: "Deucher, Alexander" <Alexander.Deucher at amd.com>, Daniel Vetter <daniel at ffwll.ch>, "Pan, Xinhui" <Xinhui.Pan at amd.com>, "Tuikov, Luben" <Luben.Tuikov at amd.com>
>>>> 主题: [PATCH 0/3] Use implicit kref infra
>>>> 
>>>>   Use the implicit kref infrastructure to free the container
>>>>   struct amdgpu_device, container of struct drm_device.
>>>> 
>>>>   First, in drm_dev_register(), do not indiscriminately warn
>>>>   when a DRM driver hasn't opted for managed.final_kfree,
>>>>   but instead check if the driver has provided its own
>>>>   "release" function callback in the DRM driver structure.
>>>>   If that is the case, no warning.
>>>> 
>>>>   Remove drmm_add_final_kfree(). We take care of that, in the
>>>>   kref "release" callback when all refs are down to 0, via
>>>>   drm_dev_put(), i.e. the free is implicit.
>>>> 
>>>>   Remove superfluous NULL check, since the DRM device to be
>>>>   suspended always exists, so long as the underlying PCI and
>>>>   DRM devices exist.
>>>> 
>>>>   Luben Tuikov (3):
>>>>     drm: No warn for drivers who provide release
>>>>     drm/amdgpu: Remove drmm final free
>>>>     drm/amdgpu: Remove superfluous NULL check
>>>> 
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 2 --
>>>>    drivers/gpu/drm/drm_drv.c                  | 3 ++-
>>>>    3 files changed, 2 insertions(+), 6 deletions(-)
>>>> 
>>>>   -- 
>>>>   2.28.0.394.ge197136389
>>>> 
>>>> 
>>>> 
>>> 
>> 
>