[PATCH 0/2] Fix a couple of amdgpu de-initialization failures
John Brooks
john at fastquake.com
Sat Jul 1 17:13:00 UTC 2017
These patches fix problems that occur when attempting to unload the amdgpu
module on my R9 290. It now unloads without any (obvious) errors.
Unfortunately, I hit a snag when loading the module again after unloading it;
hw_init of the cik_sdma block fails its ring test:
[ 150.849380] [drm] amdgpu kernel modesetting enabled.
[ 150.849471] [drm] initializing kernel modesetting (HAWAII 0x1002:0x67B1 0x1043:0x0470 0x00).
[ 150.849483] [drm] register mmio base: 0xF7E00000
[ 150.849483] [drm] register mmio size: 262144
[ 150.849488] [drm] probing gen 2 caps for device 8086:c01 = 261ad03/e
[ 150.849489] [drm] probing mlw for device 8086:c01 = 261ad03
[ 150.980046] [drm] BIOS signature incorrect 0 0
[ 150.980050] amdgpu 0000:01:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
[ 150.980081] ATOM BIOS: 113-AD63300-102
[ 150.980087] [drm] GPU post is not needed
[ 150.980215] [drm] vm size is 64 GB, block size is 13-bit
[ 150.980220] amdgpu 0000:01:00.0: VRAM: 4096M 0x0000000000000000 - 0x00000000FFFFFFFF (4096M used)
[ 150.980221] amdgpu 0000:01:00.0: GTT: 4096M 0x0000000100000000 - 0x00000001FFFFFFFF
[ 150.980224] [drm] Detected VRAM RAM=4096M, BAR=256M
[ 150.980224] [drm] RAM width 512bits GDDR5
[ 150.980324] [TTM] Zone kernel: Available graphics memory: 10276058 kiB
[ 150.980324] [TTM] Zone dma32: Available graphics memory: 2097152 kiB
[ 150.980325] [TTM] Initializing pool allocator
[ 150.980337] [TTM] Initializing DMA pool allocator
[ 150.980351] [drm] amdgpu: 4096M of VRAM memory ready
[ 150.980352] [drm] amdgpu: 4096M of GTT memory ready.
[ 150.980368] [drm] GART: num cpu pages 1048576, num gpu pages 1048576
[ 151.101593] [drm] PCIE GART of 4096M enabled (table at 0x0000000000040000).
[ 151.101611] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 151.101611] [drm] Driver supports precise vblank timestamp query.
[ 151.101633] amdgpu 0000:01:00.0: amdgpu: using MSI.
[ 151.101643] [drm] amdgpu: irq initialized.
[ 151.105091] [drm] Internal thermal controller with fan control
[ 151.110444] [drm] Invalid PCC GPIO: 13!
[ 151.110445] [drm] amdgpu: dpm initialized
[ 151.110610] [drm] AMDGPU Display Connectors
[ 151.110611] [drm] Connector 0:
[ 151.110611] [drm] DP-1
[ 151.110612] [drm] HPD2
[ 151.110613] [drm] DDC: 0x194c 0x194c 0x194d 0x194d 0x194e 0x194e 0x194f 0x194f
[ 151.110613] [drm] Encoders:
[ 151.110614] [drm] DFP1: INTERNAL_UNIPHY2
[ 151.110614] [drm] Connector 1:
[ 151.110614] [drm] HDMI-A-1
[ 151.110615] [drm] HPD3
[ 151.110615] [drm] DDC: 0x1954 0x1954 0x1955 0x1955 0x1956 0x1956 0x1957 0x1957
[ 151.110615] [drm] Encoders:
[ 151.110616] [drm] DFP2: INTERNAL_UNIPHY2
[ 151.110627] [drm] Connector 2:
[ 151.110628] [drm] DVI-D-1
[ 151.110628] [drm] HPD1
[ 151.110628] [drm] DDC: 0x1958 0x1958 0x1959 0x1959 0x195a 0x195a 0x195b 0x195b
[ 151.110629] [drm] Encoders:
[ 151.110629] [drm] DFP3: INTERNAL_UNIPHY1
[ 151.110629] [drm] Connector 3:
[ 151.110629] [drm] DVI-D-2
[ 151.110630] [drm] HPD6
[ 151.110630] [drm] DDC: 0x1960 0x1960 0x1961 0x1961 0x1962 0x1962 0x1963 0x1963
[ 151.110630] [drm] Encoders:
[ 151.110631] [drm] DFP4: INTERNAL_UNIPHY
[ 151.111727] amdgpu 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000100000010, cpu addr 0xffff8805029f4010
[ 151.111798] amdgpu 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000100000020, cpu addr 0xffff8805029f4020
[ 151.111860] amdgpu 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000100000030, cpu addr 0xffff8805029f4030
[ 151.111914] amdgpu 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000100000040, cpu addr 0xffff8805029f4040
[ 151.111943] amdgpu 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000100000050, cpu addr 0xffff8805029f4050
[ 151.111958] amdgpu 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000100000060, cpu addr 0xffff8805029f4060
[ 151.111971] amdgpu 0000:01:00.0: fence driver on ring 6 use gpu addr 0x0000000100000070, cpu addr 0xffff8805029f4070
[ 151.111986] amdgpu 0000:01:00.0: fence driver on ring 7 use gpu addr 0x0000000100000080, cpu addr 0xffff8805029f4080
[ 151.112007] amdgpu 0000:01:00.0: fence driver on ring 8 use gpu addr 0x0000000100000090, cpu addr 0xffff8805029f4090
[ 151.112297] amdgpu 0000:01:00.0: fence driver on ring 9 use gpu addr 0x00000001000000a0, cpu addr 0xffff8805029f40a0
[ 151.112321] amdgpu 0000:01:00.0: fence driver on ring 10 use gpu addr 0x00000001000000b0, cpu addr 0xffff8805029f40b0
[ 151.113021] [drm] Found UVD firmware Version: 1.64 Family ID: 9
[ 151.113316] amdgpu 0000:01:00.0: fence driver on ring 11 use gpu addr 0x000000000088bd30, cpu addr 0xffffc90009a38d30
[ 151.113762] [drm] Found VCE firmware Version: 50.10 Binary ID: 2
[ 151.113815] amdgpu 0000:01:00.0: fence driver on ring 12 use gpu addr 0x00000001000000d0, cpu addr 0xffff8805029f40d0
[ 151.113853] amdgpu 0000:01:00.0: fence driver on ring 13 use gpu addr 0x00000001000000e0, cpu addr 0xffff8805029f40e0
[ 151.113901] [drm] PCIE gen 3 link speeds already enabled
[ 151.124088] [drm] ring test on 0 succeeded in 16 usecs
[ 152.323771] [drm] ring test on 1 succeeded in 675 usecs
[ 152.323794] [drm] ring test on 2 succeeded in 14 usecs
[ 152.323818] [drm] ring test on 3 succeeded in 15 usecs
[ 152.323841] [drm] ring test on 4 succeeded in 15 usecs
[ 152.323865] [drm] ring test on 5 succeeded in 15 usecs
[ 152.323890] [drm] ring test on 6 succeeded in 16 usecs
[ 152.323914] [drm] ring test on 7 succeeded in 15 usecs
[ 152.323938] [drm] ring test on 8 succeeded in 15 usecs
[ 152.428850] [drm:cik_sdma_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 9 test failed (0xCAFEDEAD)
[ 152.428869] [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP block <cik_sdma> failed -22
[ 152.428872] amdgpu 0000:01:00.0: amdgpu_init failed
[ 153.481236] [TTM] Finalizing pool allocator
[ 153.481239] [TTM] Finalizing DMA pool allocator
[ 153.481316] [TTM] Zone kernel: Used memory at exit: 0 kiB
[ 153.481318] [TTM] Zone dma32: Used memory at exit: 0 kiB
[ 153.481320] [drm] amdgpu: ttm finalized
[ 153.481328] amdgpu 0000:01:00.0: Fatal error during GPU init
[ 153.481333] [drm] amdgpu: finishing device.
[ 153.481334] [TTM] Memory type 2 has not been initialized
[ 153.494261] amdgpu: probe of 0000:01:00.0 failed with error -22
A subsequent attempt to load the module produced similar results, except the
ring test failed on ring 1 as well as 9, so I suppose it's intermittent. Maybe
someone with access to the hardware interface specifications can figure out
why. But at least it unloads now.
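
To compare which rings fail from one reload attempt to the next, the ring test
results can be pulled out of the kernel log with a short script. This is only a
rough sketch and not part of the patches: the function name
summarize_ring_tests is made up for illustration, and the patterns assume the
exact message formats shown in the log above.

```python
import re

# Assumed format of a passing test, taken from the log above:
#   [drm] ring test on 0 succeeded in 16 usecs
OK_RE = re.compile(r"ring test on (\d+) succeeded in (\d+) usecs")

# Assumed format of a failing test, taken from the log above:
#   *ERROR* amdgpu: ring 9 test failed (0xCAFEDEAD)
FAIL_RE = re.compile(r"\*ERROR\* amdgpu: ring (\d+) test failed \((0x[0-9A-Fa-f]+)\)")


def summarize_ring_tests(log_text):
    """Return (passed, failed) dicts keyed by ring number.

    passed maps ring -> duration in microseconds,
    failed maps ring -> the value reported in the error message.
    """
    passed = {}
    failed = {}
    for line in log_text.splitlines():
        ok = OK_RE.search(line)
        if ok:
            passed[int(ok.group(1))] = int(ok.group(2))
            continue
        bad = FAIL_RE.search(line)
        if bad:
            failed[int(bad.group(1))] = bad.group(2)
    return passed, failed
```

Running this over the saved dmesg output of each attempt and comparing the
failed dicts should make it easier to see whether the same rings fail every
time or whether the set changes between loads.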
--
John Brooks (Frogging101)