<html>
<head>
<base href="https://bugs.freedesktop.org/">
</head>
<body>
<p>
<div>
<b><a class="bz_bug_link
bz_status_NEW "
title="NEW - Cannot unbind GPU from AMDGPU"
href="https://bugs.freedesktop.org/show_bug.cgi?id=97500#c6">Comment # 6</a>
on <a class="bz_bug_link
bz_status_NEW "
title="NEW - Cannot unbind GPU from AMDGPU"
href="https://bugs.freedesktop.org/show_bug.cgi?id=97500">bug 97500</a>
from <span class="vcard"><a class="email" href="mailto:notasas@gmail.com" title="Grazvydas Ignotas <notasas@gmail.com>"> <span class="fn">Grazvydas Ignotas</span></a>
</span></b>
<pre>Created <span class=""><a href="attachment.cgi?id=126782" name="attach_126782" title="dmesg of powerplay crash">attachment 126782</a></span>
dmesg of powerplay crash
I've sent some patches with fixes, but there seem to be multiple other issues.
One of the problems is that struct amdgpu_i2c_chan contains a struct drm_dp_aux.
amdgpu_i2c_fini() frees the amdgpu_i2c_chan while that drm_dp_aux is still in
use, which causes memory corruption. I don't know how to solve this, perhaps
somebody knows this code better?
A hack can be used to trade this corruption for a leak:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
index 34bab61..8beaee0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
@@ -221,6 +221,8 @@ void amdgpu_i2c_destroy(struct amdgpu_i2c_chan *i2c)
 	if (!i2c)
 		return;
 	i2c_del_adapter(&i2c->adapter);
+	if (i2c->has_aux)
+		return;
 	kfree(i2c);
 }
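
To make the trade-off concrete, here is a minimal userspace sketch of what the
hack does. It is not driver code: fake_aux, fake_i2c_chan, chan_create(),
chan_destroy(), live_chans and demo_leak_instead_of_uaf() are hypothetical
stand-ins for illustration, and plain calloc/free replaces the kernel
allocator. The destroy path skips the free whenever the embedded aux may still
be referenced, so a later access through that pointer stays valid, at the cost
of never releasing the channel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct drm_dp_aux. */
struct fake_aux {
	int transfers;
};

/* Hypothetical stand-in for struct amdgpu_i2c_chan. */
struct fake_i2c_chan {
	bool has_aux;
	struct fake_aux aux;	/* embedded, so it shares the channel's lifetime */
};

static int live_chans;		/* channels allocated and not yet freed */

static struct fake_i2c_chan *chan_create(bool has_aux)
{
	struct fake_i2c_chan *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	c->has_aux = has_aux;
	live_chans++;
	return c;
}

/* Mirrors the hack: do not free a channel whose aux may still be in use. */
static void chan_destroy(struct fake_i2c_chan *c)
{
	if (!c)
		return;
	if (c->has_aux)
		return;		/* leaked on purpose */
	free(c);
	live_chans--;
}

/* Returns how many channels are still allocated after both destroy calls. */
int demo_leak_instead_of_uaf(void)
{
	struct fake_i2c_chan *plain = chan_create(false);
	struct fake_i2c_chan *with_aux = chan_create(true);
	struct fake_aux *aux_user;
	int leaked;

	assert(plain && with_aux);
	aux_user = &with_aux->aux;	/* something still holds the aux */

	chan_destroy(plain);		/* freed normally */
	chan_destroy(with_aux);		/* skipped because has_aux is set */

	aux_user->transfers++;		/* valid only because the free was skipped */
	leaked = live_chans;

	free(with_aux);			/* tidy up so the demo itself does not leak */
	live_chans--;
	return leaked;
}
```

demo_leak_instead_of_uaf() reports one channel still allocated after both
destroy calls, which is the leak the hack accepts in exchange for avoiding the
use after free.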
---
Another one is a TTM leak, which can also be seen in this attachment.
CONFIG_DMA_API_DEBUG reports:
WARNING: CPU: 3 PID: 1666 at lib/dma-debug.c:976 dma_debug_device_change+0x1ca/0x240
pci 0000:01:00.0: DMA-API: device driver has pending DMA allocations while released from device [count=202]
One of leaked entries details: [device address=0x00000003dcfe9000] [size=4096 bytes] [mapped with DMA_BIDIRECTIONAL] [mapped as coherent]
Mapped at:
[<ffffffff8163d941>] debug_dma_alloc_coherent+0x41/0x110
[<ffffffffa0728d84>] ttm_dma_populate+0xb64/0x1150 [ttm]
[<ffffffffa0b770ac>] amdgpu_ttm_tt_populate+0x35c/0x510 [amdgpu]
[<ffffffffa0719141>] ttm_tt_bind+0x71/0xd0 [ttm]
[<ffffffffa071c9d8>] ttm_bo_handle_move_mem+0xa08/0xaa0 [ttm]
---
The next one is a powerplay crash at
drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c:3336:
dpm_table->sclk_table.count is 0, so the array access ends up badly. This could
be related to the "DPM is already running right now, no need to enable DPM!"
message; the full dmesg is attached.
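
A defensive check along the following lines would at least avoid the bad
access. This is only a sketch under assumptions: it assumes the failing code
reads the last entry of the table (index count - 1), which has not been checked
against line 3336, and level_table, MAX_LEVELS and highest_level() are
hypothetical names, not the smu7 structures. It does not explain why the table
is empty in the first place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LEVELS 8	/* hypothetical capacity, chosen for the sketch */

/* Hypothetical stand-in for a DPM level table such as sclk_table. */
struct level_table {
	uint32_t count;
	uint32_t value[MAX_LEVELS];
};

/*
 * Stores the last (highest) level in *out and returns 0.
 * Returns -1 instead of indexing when the table is empty or count is
 * larger than the array, so count - 1 can never wrap around.
 */
int highest_level(const struct level_table *t, uint32_t *out)
{
	if (t == NULL || out == NULL)
		return -1;
	if (t->count == 0 || t->count > MAX_LEVELS)
		return -1;
	*out = t->value[t->count - 1];
	return 0;
}
```

With count == 0 the helper returns -1 and leaves *out untouched, so the caller
can bail out or fall back instead of reading outside the array.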
I won't have time to work on this for a while, but maybe somebody else does.</pre>
</div>
</p>
</body>
</html>