[Bug 214029] [bisected] [NAVI] Several memory leaks in amdgpu and ttm

bugzilla-daemon at bugzilla.kernel.org
Wed Sep 22 22:05:07 UTC 2021


https://bugzilla.kernel.org/show_bug.cgi?id=214029

--- Comment #14 from Erhard F. (erhard_f at mailbox.org) ---
Created attachment 298927
  --> https://bugzilla.kernel.org/attachment.cgi?id=298927&action=edit
kernel dmesg (kernel 5.14.6, AMD Opteron 6386 SE)

This does not seem to be Navi-specific after all, as the leaks also occur with
the Radeon R7 360 in my Opteron box.

[...]
unreferenced object 0xffff8afeddd0c2c0 (size 176):
  comm "Web Content", pid 1830253, jiffies 4302445561 (age 2701.157s)
  hex dump (first 32 bytes):
    50 c3 d0 dd fe 8a ff ff 80 51 3a c0 ff ff ff ff  P........Q:.....
    0f 89 14 e9 f1 16 00 00 48 fe b6 09 41 a7 ff ff  ........H...A...
  backtrace:
    [<ffffffffc03a347d>] drm_sched_fence_create+0x1d/0xb0 [gpu_sched]
    [<ffffffffc03a20d0>] drm_sched_job_init+0x58/0xa0 [gpu_sched]
    [<ffffffffc10fb711>] amdgpu_job_submit+0x21/0xe0 [amdgpu]
    [<ffffffffc0feef6a>] amdgpu_copy_buffer+0x1ea/0x290 [amdgpu]
    [<ffffffffc0fef292>] amdgpu_ttm_copy_mem_to_mem+0x282/0x5b0 [amdgpu]
    [<ffffffffc0fefad8>] amdgpu_bo_move+0x130/0x7d8 [amdgpu]
    [<ffffffffc0609e49>] ttm_bo_handle_move_mem+0x89/0x178 [ttm]
    [<ffffffffc060b1ba>] ttm_bo_validate+0xba/0x140 [ttm]
    [<ffffffffc0ff13ae>] amdgpu_bo_fault_reserve_notify+0xb6/0x160 [amdgpu]
    [<ffffffffc0ff62f8>] amdgpu_gem_fault+0x78/0x100 [amdgpu]
    [<ffffffff9b166941>] __do_fault+0x31/0xe8
    [<ffffffff9b16dc4a>] __handle_mm_fault+0xe1a/0x1290
    [<ffffffff9b16e175>] handle_mm_fault+0xb5/0x218
    [<ffffffff9b6ca347>] exc_page_fault+0x177/0x5d0
    [<ffffffff9b800acb>] asm_exc_page_fault+0x1b/0x20
unreferenced object 0xffff8b01f00bd0c0 (size 72):
  comm "sdma0", pid 403, jiffies 4302445561 (age 2701.157s)
  hex dump (first 32 bytes):
    e0 c7 64 13 ff 8a ff ff 00 1c 30 c1 ff ff ff ff  ..d.......0.....
    65 59 16 e9 f1 16 00 00 58 28 b9 86 03 8b ff ff  eY......X(......
  backtrace:
    [<ffffffffc0febecb>] amdgpu_fence_emit+0x2b/0x1f0 [amdgpu]
    [<ffffffffc100945b>] amdgpu_ib_schedule+0x2e3/0x4e8 [amdgpu]
    [<ffffffffc10fb34b>] amdgpu_job_run+0x8b/0x1e8 [amdgpu]
    [<ffffffffc03a2ad7>] drm_sched_main+0x1b7/0x3d8 [gpu_sched]
    [<ffffffff9b05f9e2>] kthread+0x122/0x140
    [<ffffffff9b001102>] ret_from_fork+0x22/0x30
unreferenced object 0xffff8b02ec1796c0 (size 176):
  comm "Renderer", pid 108402, jiffies 4302694486 (age 1871.424s)
  hex dump (first 32 bytes):
    50 97 17 ec 02 8b ff ff 80 51 3a c0 ff ff ff ff  P........Q:.....
    4f 9c 02 1a b3 17 00 00 48 fe b6 09 41 a7 ff ff  O.......H...A...
  backtrace:
    [<ffffffffc03a347d>] drm_sched_fence_create+0x1d/0xb0 [gpu_sched]
    [<ffffffffc03a20d0>] drm_sched_job_init+0x58/0xa0 [gpu_sched]
    [<ffffffffc10fb711>] amdgpu_job_submit+0x21/0xe0 [amdgpu]
    [<ffffffffc0feef6a>] amdgpu_copy_buffer+0x1ea/0x290 [amdgpu]
    [<ffffffffc0fef292>] amdgpu_ttm_copy_mem_to_mem+0x282/0x5b0 [amdgpu]
    [<ffffffffc0fefad8>] amdgpu_bo_move+0x130/0x7d8 [amdgpu]
    [<ffffffffc0609e49>] ttm_bo_handle_move_mem+0x89/0x178 [ttm]
    [<ffffffffc060b1ba>] ttm_bo_validate+0xba/0x140 [ttm]
    [<ffffffffc0ff13ae>] amdgpu_bo_fault_reserve_notify+0xb6/0x160 [amdgpu]
    [<ffffffffc0ff62f8>] amdgpu_gem_fault+0x78/0x100 [amdgpu]
    [<ffffffff9b166941>] __do_fault+0x31/0xe8
    [<ffffffff9b16dc4a>] __handle_mm_fault+0xe1a/0x1290
    [<ffffffff9b16e175>] handle_mm_fault+0xb5/0x218
    [<ffffffff9b6ca347>] exc_page_fault+0x177/0x5d0
    [<ffffffff9b800acb>] asm_exc_page_fault+0x1b/0x20
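
Reports like the ones above get long quickly, so it can help to group them by
allocation site before comparing machines. The following is a minimal sketch,
not part of the original report: it assumes the text has the same layout as the
kmemleak records quoted here, and the names parse_kmemleak, summarize and
SAMPLE are hypothetical helpers introduced only for illustration. It treats the
first backtrace frame of each record as the allocation site, which is an
assumption about how the trace is ordered.

```python
import re
from collections import Counter

# Matches the first line of a kmemleak record, e.g.
# "unreferenced object 0xffff8afeddd0c2c0 (size 176):"
HEADER = re.compile(r"^unreferenced object (0x[0-9a-f]+) \(size (\d+)\):")

# Matches a backtrace frame, e.g.
# "    [<ffffffffc03a347d>] drm_sched_fence_create+0x1d/0xb0 [gpu_sched]"
# The trailing "[module]" part is optional (absent for built-in symbols).
FRAME = re.compile(
    r"^\s*\[<[0-9a-f]+>\] (\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)


def parse_kmemleak(text):
    """Return one dict per record: address, size and (symbol, module) frames."""
    records = []
    current = None
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            current = {"addr": m.group(1), "size": int(m.group(2)), "frames": []}
            records.append(current)
            continue
        m = FRAME.match(line)
        if m and current is not None:
            current["frames"].append((m.group(1), m.group(2)))
    return records


def summarize(records):
    """Count leaked objects and bytes per first backtrace frame."""
    counts = Counter()
    sizes = Counter()
    for rec in records:
        if rec["frames"]:
            site = rec["frames"][0][0]
            counts[site] += 1
            sizes[site] += rec["size"]
    return counts, sizes


# Shortened excerpt of the records quoted above (hex dumps and most frames cut).
SAMPLE = """\
unreferenced object 0xffff8afeddd0c2c0 (size 176):
  comm "Web Content", pid 1830253, jiffies 4302445561 (age 2701.157s)
  backtrace:
    [<ffffffffc03a347d>] drm_sched_fence_create+0x1d/0xb0 [gpu_sched]
    [<ffffffffc03a20d0>] drm_sched_job_init+0x58/0xa0 [gpu_sched]
unreferenced object 0xffff8b01f00bd0c0 (size 72):
  comm "sdma0", pid 403, jiffies 4302445561 (age 2701.157s)
  backtrace:
    [<ffffffffc0febecb>] amdgpu_fence_emit+0x2b/0x1f0 [amdgpu]
    [<ffffffff9b05f9e2>] kthread+0x122/0x140
"""

if __name__ == "__main__":
    counts, sizes = summarize(parse_kmemleak(SAMPLE))
    for site, n in counts.most_common():
        print(f"{site}: {n} object(s), {sizes[site]} bytes")
```

Feeding it the full dmesg attachment instead of SAMPLE would show whether the
leaked objects cluster around the same few call sites on both machines.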


 # lspci -s 01:00.0 -v
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Tobago PRO [Radeon R7 360 / R9 360 OEM] (rev 81) (prog-if 00 [VGA controller])
        Subsystem: PC Partner Limited / Sapphire Technology Tobago PRO [Radeon R7 360 / R9 360 OEM]
        Flags: bus master, fast devsel, latency 0, IRQ 47, IOMMU group 11
        Memory at d0000000 (64-bit, prefetchable) [size=256M]
        Memory at cf800000 (64-bit, prefetchable) [size=8M]
        I/O ports at c000 [size=256]
        Memory at fdc80000 (32-bit, non-prefetchable) [size=256K]
        Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
        Capabilities: [58] Express Legacy Endpoint, MSI 00
        Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [150] Advanced Error Reporting
        Capabilities: [200] Physical Resizable BAR
        Capabilities: [270] Secondary PCI Express
        Capabilities: [2b0] Address Translation Service (ATS)
        Capabilities: [2c0] Page Request Interface (PRI)
        Capabilities: [2d0] Process Address Space ID (PASID)
        Kernel driver in use: amdgpu
        Kernel modules: radeon, amdgpu

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

