<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
</head>
<body>
<p style="font-family:Arial;font-size:10pt;color:#0078D7;margin:15pt;" align="Left">
[AMD Official Use Only - Internal Distribution Only]<br>
</p>
<br>
<div>
<div dir="auto" style="direction: ltr; margin: 0; padding: 0; font-family: sans-serif; font-size: 11pt; color: black; ">
<br>
</div>
<div dir="auto" style="direction: ltr; margin: 0; padding: 0; font-family: sans-serif; font-size: 11pt; color: black; ">
The correct thing to do this is to<br>
</div>
<div dir="auto" style="direction: ltr; margin: 0; padding: 0; font-family: sans-serif; font-size: 11pt; color: black; ">
_leave the amdgpu_driver_release()_ alone,<br>
</div>
<div dir="auto" style="direction: ltr; margin: 0; padding: 0; font-family: sans-serif; font-size: 11pt; color: black; ">
remove "drmm_add_final_kfree()" and qualify<br>
</div>
<div dir="auto" style="direction: ltr; margin: 0; padding: 0; font-family: sans-serif; font-size: 11pt; color: black; ">
the WARN_ON() in drm_dev_register() by<br>
</div>
<div dir="auto" style="direction: ltr; margin: 0; padding: 0; font-family: sans-serif; font-size: 11pt; color: black; ">
the existence of drm_driver.release() (i.e. non-NULL).<br>
<br>
</div>
<div dir="auto" style="direction: ltr; margin: 0; padding: 0; font-family: sans-serif; font-size: 11pt; color: black; ">
Re: this drm driver release callback is more like a release notify. It is called in the beginning of the total release sequence. As you have made drm device a member of adev. So you can not free adev in the driver release callback.<br>
<br>
</div>
<div dir="auto" style="direction: ltr; margin: 0; padding: 0; font-family: sans-serif; font-size: 11pt; color: black; ">
Maybe change the sequence of release, say, put drm driver release in the end of total release sequence.
<br>
</div>
<div dir="auto" style="direction: ltr; margin: 0; padding: 0; font-family: sans-serif; font-size: 11pt; color: black; ">
Or still use the final_kfree to free adev and our release callback just do some other cleanup work.</div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Tuikov, Luben <Luben.Tuikov@amd.com><br>
<b>Sent:</b> Wednesday, September 2, 2020 4:35:32 AM<br>
<b>To:</b> Alex Deucher <alexdeucher@gmail.com>; Pan, Xinhui <Xinhui.Pan@amd.com>; Daniel Vetter <daniel.vetter@ffwll.ch><br>
<b>Cc:</b> amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>; Deucher, Alexander <Alexander.Deucher@amd.com><br>
<b>Subject:</b> Re: [PATCH] drm/amdgpu: Fix a redundant kfree</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText">On 2020-09-01 10:12 a.m., Alex Deucher wrote:<br>
> On Tue, Sep 1, 2020 at 3:46 AM Pan, Xinhui <Xinhui.Pan@amd.com> wrote:<br>
>><br>
>> [AMD Official Use Only - Internal Distribution Only]<br>
>><br>
>> drm_dev_alloc() alloc *dev* and set managed.final_kfree to dev to free<br>
>> itself.<br>
>> Now from commit 5cdd68498918("drm/amdgpu: Embed drm_device into<br>
>> amdgpu_device (v3)") we alloc *adev* and ddev is just a member of it.<br>
>> So drm_dev_release try to free a wrong pointer then.<br>
>><br>
>> Also driver's release trys to free adev, but drm_dev_release will<br>
>> access dev after call drvier's release.<br>
>><br>
>> To fix it, remove driver's release and set managed.final_kfree to adev.<br>
> <br>
> I've got to admit, the documentation around drm_dev_init is hard to<br>
> follow. We aren't really using the drmm stuff, but you have to use<br>
> drmm_add_final_kfree to avoid a warning. The logic seems to make<br>
> sense though.<br>
> Acked-by: Alex Deucher <alexancer.deucher@amd.com><br>
<br>
The logic in patch 3/3 uses the kref infrastructure<br>
as described in drm_drv.c's comment on what the DRM<br>
usage is, i.e. taking advantage of the kref infrastructure.<br>
<br>
In amdgpu_pci_probe() we call drm_dev_init() which takes<br>
a ref of 1 on the kref in the DRM device structure,<br>
and from then on, only when the kref transitions<br>
from non-zero to 0, do we free the container of<br>
DRM device, and this is beautifully shown in the<br>
kernel stack below (please take a look at the kernel<br>
stack below).<br>
<br>
Using a kref is very powerful as it is implicit:<br>
when the kref transitions from non-zero to 0,<br>
then call the release method.<br>
<br>
Furthermore, we own the release method, and we<br>
like that, as it is pure, as well as,<br>
there may be more things we'd like to do in the future<br>
before we free the amdgpu device: maybe free memory we're<br>
using specifically for that PCI device, maybe write<br>
some registers, maybe notify someone or something, etc.<br>
<br>
On another note, adding "drmm_add_final_kfree()" in the middle<br>
of amdgpu_pci_probe() seems hackish--it's neither part<br>
of drm_dev_init() nor of drm_dev_register(). We really<br>
don't need it, since we rely on the kref infrastructure<br>
to tell us when to free the device, and if you'd look<br>
at the beautiful stack below, it knows exactly when that is,<br>
i.e. when to free it.<br>
<br>
The correct thing to do this is to<br>
_leave the amdgpu_driver_release()_ alone,<br>
remove "drmm_add_final_kfree()" and qualify<br>
the WARN_ON() in drm_dev_register() by<br>
the existence of drm_driver.release() (i.e. non-NULL).<br>
<br>
I'll submit a sequence of patches to fix this right.<br>
<br>
Regards,<br>
Luben<br>
<br>
> <br>
>><br>
>> [ 36.269348] BUG: unable to handle page fault for address: ffffa0c279940028<br>
>> [ 36.276841] #PF: supervisor read access in kernel mode<br>
>> [ 36.282434] #PF: error_code(0x0000) - not-present page<br>
>> [ 36.288053] PGD 676601067 P4D 676601067 PUD 86a414067 PMD 86a247067 PTE 800ffff8066bf060<br>
>> [ 36.296868] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI<br>
>> [ 36.302409] CPU: 4 PID: 1375 Comm: bash Tainted: G O 5.9.0-rc2+ #46<br>
>> [ 36.310670] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 1401 11/26/2019<br>
>> [ 36.320725] RIP: 0010:drm_managed_release+0x25/0x110 [drm]<br>
>> [ 36.326741] Code: 80 00 00 00 00 0f 1f 44 00 00 55 48 c7 c2 5a 9f 41 c0 be 00 02 00 00 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 08 <48> 8b 7f 18 e8 c2 10 ff ff 4d 8b 74 24 20 49 8d 44 24 5<br>
>> [ 36.347217] RSP: 0018:ffffb9424141fce0 EFLAGS: 00010282<br>
>> [ 36.352931] RAX: 0000000000000006 RBX: ffffa0c279940010 RCX: 0000000000000006<br>
>> [ 36.360718] RDX: ffffffffc0419f5a RSI: 0000000000000200 RDI: ffffa0c279940010<br>
>> [ 36.368503] RBP: ffffb9424141fd10 R08: 0000000000000001 R09: 0000000000000001<br>
>> [ 36.376304] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa0c279940010<br>
>> [ 36.384070] R13: ffffffffc0e2a000 R14: ffffa0c26924e220 R15: fffffffffffffff2<br>
>> [ 36.391845] FS: 00007fc4a277b740(0000) GS:ffffa0c288e00000(0000) knlGS:0000000000000000<br>
>> [ 36.400669] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033<br>
>> [ 36.406937] CR2: ffffa0c279940028 CR3: 0000000792304006 CR4: 00000000003706e0<br>
>> [ 36.414732] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000<br>
>> [ 36.422550] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400<br>
>> [ 36.430354] Call Trace:<br>
>> [ 36.433044] drm_dev_put.part.0+0x40/0x60 [drm]<br>
>> [ 36.438017] drm_dev_put+0x13/0x20 [drm]<br>
>> [ 36.442398] amdgpu_pci_remove+0x56/0x60 [amdgpu]<br>
>> [ 36.447528] pci_device_remove+0x3e/0xb0<br>
>> [ 36.451807] device_release_driver_internal+0xff/0x1d0<br>
>> [ 36.457416] device_release_driver+0x12/0x20<br>
>> [ 36.462094] pci_stop_bus_device+0x70/0xa0<br>
>> [ 36.466588] pci_stop_and_remove_bus_device_locked+0x1b/0x30<br>
>> [ 36.472786] remove_store+0x7b/0x90<br>
>> [ 36.476614] dev_attr_store+0x17/0x30<br>
>> [ 36.480646] sysfs_kf_write+0x4b/0x60<br>
>> [ 36.484655] kernfs_fop_write+0xe8/0x1d0<br>
>> [ 36.488952] vfs_write+0xf5/0x230<br>
>> [ 36.492562] ksys_write+0x70/0xf0<br>
>> [ 36.496206] __x64_sys_write+0x1a/0x20<br>
>> [ 36.500292] do_syscall_64+0x38/0x90<br>
>> [ 36.504219] entry_SYSCALL_64_after_hwframe+0x44/0xa9<br>
>><br>
>> Signed-off-by: xinhui pan <xinhui.pan@amd.com><br>
>> ---<br>
>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 10 +---------<br>
>> 1 file changed, 1 insertion(+), 9 deletions(-)<br>
>><br>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c<br>
>> index c12e9acd421d..52fc0c6c8538 100644<br>
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c<br>
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c<br>
>> @@ -1165,7 +1165,7 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,<br>
>> if (ret)<br>
>> goto err_free;<br>
>><br>
>> -drmm_add_final_kfree(ddev, ddev);<br>
>> +drmm_add_final_kfree(ddev, adev);<br>
>><br>
>> if (!supports_atomic)<br>
>> ddev->driver_features &= ~DRIVER_ATOMIC;<br>
>> @@ -1221,13 +1221,6 @@ amdgpu_pci_remove(struct pci_dev *pdev)<br>
>> drm_dev_put(dev);<br>
>> }<br>
>><br>
>> -static void amdgpu_driver_release(struct drm_device *ddev)<br>
>> -{<br>
>> -struct amdgpu_device *adev = drm_to_adev(ddev);<br>
>> -<br>
>> -kfree(adev);<br>
>> -}<br>
>> -<br>
>> static void<br>
>> amdgpu_pci_shutdown(struct pci_dev *pdev)<br>
>> {<br>
>> @@ -1522,7 +1515,6 @@ static struct drm_driver kms_driver = {<br>
>> .open = amdgpu_driver_open_kms,<br>
>> .postclose = amdgpu_driver_postclose_kms,<br>
>> .lastclose = amdgpu_driver_lastclose_kms,<br>
>> -.release = amdgpu_driver_release,<br>
>> .irq_handler = amdgpu_irq_handler,<br>
>> .ioctls = amdgpu_ioctls_kms,<br>
>> .gem_free_object_unlocked = amdgpu_gem_object_free,<br>
>> --<br>
>> 2.25.1<br>
>><br>
>> _______________________________________________<br>
>> amd-gfx mailing list<br>
>> amd-gfx@lists.freedesktop.org<br>
>> <a href="https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7CLuben.Tuikov%40amd.com%7C5233b0caaac449417a7208d84e810de0%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637345663485193534&sdata=A%2FzJpoQmUhGBGts5mQfp%2BlVu1FL5NVbe5qfRelb2Og8%3D&reserved=0">
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7CLuben.Tuikov%40amd.com%7C5233b0caaac449417a7208d84e810de0%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637345663485193534&sdata=A%2FzJpoQmUhGBGts5mQfp%2BlVu1FL5NVbe5qfRelb2Og8%3D&reserved=0</a><br>
<br>
</div>
</span></font></div>
</div>
</body>
</html>