bug report with amdgpu drm-next-6.2
Dennis Gilmore
dennis at ausil.us
Sat Nov 5 22:54:51 UTC 2022
I have an Ampere Altra machine that I have put a Radeon 6500 XT card
on seeing the pull request for drm-next-6.2
https://lore.kernel.org/dri-devel/20221104205827.6008-1-alexander.deucher@amd.com/
I grabbed the tree and built it. On boot I am getting
[ 23.877255] [drm] amdgpu kernel modesetting enabled.
[ 23.883774] amdgpu: CRAT table disabled by module option
[ 23.889530] amdgpu: IO link not available for non x86 platforms
[ 23.890039] ixgbe 0005:01:00.1 enP5p1s0f1: renamed from eth1
[ 23.895485] amdgpu: Virtual CRAT table created for CPU
[ 23.907170] amdgpu: Topology: Add CPU node
[ 23.913989] amdgpu 000d:03:00.0: Adding to iommu group 23
[ 23.923394] [drm] initializing kernel modesetting (BEIGE_GOBY
0x1002:0x743F 0x1EAE:0x6401 0xC1).
[ 23.932333] [drm] register mmio base: 0x50000000
[ 23.937023] [drm] register mmio size: 1048576
[ 23.948575] [drm] add ip block number 0 <nv_common>
[ 23.953520] [drm] add ip block number 1 <gmc_v10_0>
[ 23.958431] [drm] add ip block number 2 <navi10_ih>
[ 23.963367] [drm] add ip block number 3 <psp>
[ 23.967754] [drm] add ip block number 4 <smu>
[ 23.972142] [drm] add ip block number 5 <dm>
[ 23.976462] [drm] add ip block number 6 <gfx_v10_0>
[ 23.981373] [drm] add ip block number 7 <sdma_v5_2>
[ 23.986302] [drm] add ip block number 8 <vcn_v3_0>
[ 23.991265] amdgpu 000d:03:00.0: amdgpu: Fetched VBIOS from VFCT
[ 23.997342] amdgpu: ATOM BIOS: 113-24X46SHB1-D02
[ 24.002011] [drm] VCN(0) decode is enabled in VM mode
[ 24.007118] amdgpu 000d:03:00.0: amdgpu: Trusted Memory Zone (TMZ)
feature disabled as experimental (default)
[ 24.017125] amdgpu 000d:03:00.0: amdgpu: PCIE atomic ops is not supported
[ 24.026314] [drm] vm size is 262144 GB, 4 levels, block size is
9-bit, fragment size is 9-bit
[ 24.034973] amdgpu 000d:03:00.0: BAR 2: releasing [mem
0x340010000000-0x3400101fffff 64bit pref]
[ 24.043843] amdgpu 000d:03:00.0: BAR 0: releasing [mem
0x340000000000-0x34000fffffff 64bit pref]
[ 24.054541] pcieport 000d:02:00.0: BAR 15: releasing [mem
0x340000000000-0x340017ffffff 64bit pref]
[ 24.063994] pcieport 000d:01:00.0: BAR 15: releasing [mem
0x340000000000-0x340017ffffff 64bit pref]
[ 24.073130] pcieport 000d:00:01.0: BAR 15: releasing [mem
0x340000000000-0x340017ffffff 64bit pref]
[ 24.082245] pcieport 000d:00:01.0: bridge window [io
0x1000-0x0fff] to [bus 01-03] add_size 1000
[ 24.091333] pcieport 000d:00:01.0: BAR 15: assigned [mem
0x340000000000-0x34017fffffff 64bit pref]
[ 24.100395] pcieport 000d:00:01.0: BAR 13: no space for [io size 0x1000]
[ 24.107255] pcieport 000d:00:01.0: BAR 13: failed to assign [io size 0x1000]
[ 24.114472] pcieport 000d:00:01.0: BAR 13: no space for [io size 0x1000]
[ 24.121308] pcieport 000d:00:01.0: BAR 13: failed to assign [io size 0x1000]
[ 24.128523] pcieport 000d:01:00.0: BAR 15: assigned [mem
0x340000000000-0x34017fffffff 64bit pref]
[ 24.137599] pcieport 000d:02:00.0: BAR 15: assigned [mem
0x340000000000-0x34017fffffff 64bit pref]
[ 24.146660] amdgpu 000d:03:00.0: BAR 0: assigned [mem
0x340000000000-0x3400ffffffff 64bit pref]
[ 24.155457] amdgpu 000d:03:00.0: BAR 2: assigned [mem
0x340100000000-0x3401001fffff 64bit pref]
[ 24.164244] pcieport 000d:00:01.0: PCI bridge to [bus 01-03]
[ 24.169950] pcieport 000d:00:01.0: bridge window [mem
0x50000000-0x502fffff]
[ 24.177242] pcieport 000d:00:01.0: bridge window [mem
0x340000000000-0x34017fffffff 64bit pref]
[ 24.187449] pcieport 000d:01:00.0: PCI bridge to [bus 02-03]
[ 24.194289] pcieport 000d:01:00.0: bridge window [mem
0x50000000-0x501fffff]
[ 24.204225] pcieport 000d:01:00.0: bridge window [mem
0x340000000000-0x34017fffffff 64bit pref]
[ 24.214216] pcieport 000d:02:00.0: PCI bridge to [bus 03]
[ 24.220666] pcieport 000d:02:00.0: bridge window [mem
0x50000000-0x501fffff]
[ 24.228944] pcieport 000d:02:00.0: bridge window [mem
0x340000000000-0x34017fffffff 64bit pref]
[ 24.238884] amdgpu 000d:03:00.0: amdgpu: VRAM: 4080M
0x0000008000000000 - 0x00000080FEFFFFFF (4080M used)
[ 24.249475] amdgpu 000d:03:00.0: amdgpu: GART: 512M
0x0000000000000000 - 0x000000001FFFFFFF
[ 24.258839] amdgpu 000d:03:00.0: amdgpu: AGP: 267894784M
0x0000008400000000 - 0x0000FFFFFFFFFFFF
[ 24.268646] [drm] Detected VRAM RAM=4080M, BAR=4096M
[ 24.274620] [drm] RAM width 64bits GDDR6
[ 24.326191] [drm] amdgpu: 4080M of VRAM memory ready
[ 24.332137] [drm] amdgpu: 31878M of GTT memory ready.
[ 24.338485] [drm] GART: num cpu pages 131072, num gpu pages 131072
[ 24.346101] [drm] PCIE GART of 512M enabled (table at 0x0000008000300000).
[ 24.431009] amdgpu 000d:03:00.0: amdgpu: PSP runtime database doesn't exist
[ 24.439084] amdgpu 000d:03:00.0: amdgpu: PSP runtime database doesn't exist
[ 24.681141] ixgbe 0005:01:00.0: registered PHC device on enP5p1s0f0
[ 24.888758] ixgbe 0005:01:00.0 enP5p1s0f0: detected SFP+: 5
[ 25.042801] ixgbe 0005:01:00.0 enP5p1s0f0: NIC Link is Up 10 Gbps,
Flow Control: RX/TX
[ 25.516179] ixgbe 0005:01:00.1: registered PHC device on enP5p1s0f1
[ 25.718747] ixgbe 0005:01:00.1 enP5p1s0f1: detected SFP+: 6
[ 25.872810] ixgbe 0005:01:00.1 enP5p1s0f1: NIC Link is Up 10 Gbps,
Flow Control: RX/TX
[ 26.422200] IPv6: ADDRCONF(NETDEV_CHANGE): enP5p1s0f0: link becomes ready
[ 26.501022] IPv6: ADDRCONF(NETDEV_CHANGE): enP5p1s0f1: link becomes ready
[ 26.554181] br0: port 1(enP5p1s0f1) entered blocking state
[ 26.560834] br0: port 1(enP5p1s0f1) entered disabled state
[ 26.561733] amdgpu 000d:03:00.0: amdgpu: STB initialized to 2048 entries
[ 26.580331] device enP5p1s0f1 entered promiscuous mode
[ 26.586637] audit: type=1700 audit(1667665332.890:21):
dev=enP5p1s0f1 prom=256 old_prom=0 auid=4294967295 uid=0 gid=0
ses=4294967295
[ 26.590559] br0: port 1(enP5p1s0f1) entered blocking state
[ 26.606149] br0: port 1(enP5p1s0f1) entered listening state
[ 26.623013] audit: type=1300 audit(1667665332.890:21):
arch=c00000b7 syscall=211 success=yes exit=40 a0=d a1=fffff4030ef0
a2=0 a3=0 items=0 ppid=1 pid=1022 auid=4294967295 uid=0 gid=0 euid=0
suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295
comm="NetworkManager" exe="/usr/sbin/NetworkManager" subj=kernel
key=(null)
[ 26.634704] [drm] Loading DMUB firmware via PSP: version=0x02020013
[ 26.652070] audit: type=1327 audit(1667665332.890:21):
proctitle=2F7573722F7362696E2F
4E6574776F726B4D616E6167657200
2D2D6465627567
[ 26.787231] [drm] use_doorbell being set to: [true]
[ 26.874022] [drm] Found VCN firmware Version ENC: 1.24 DEC: 2 VEP:
0 Revision: 0
[ 26.882692] amdgpu 000d:03:00.0: amdgpu: Will use PSP to load VCN firmware
[ 27.005497] [drm] reserve 0xa00000 from 0x8001000000 for PSP TMR
[ 27.166769] amdgpu 000d:03:00.0: amdgpu: RAS: optional ras ta ucode
is not available
[ 27.213836] amdgpu 000d:03:00.0: amdgpu: SECUREDISPLAY:
securedisplay ta ucode is not available
[ 27.223838] amdgpu 000d:03:00.0: amdgpu: smu driver if version =
0x0000000d, smu fw if version = 0x0000000f, smu fw program = 0,
version = 0x00491c00 (73.28.0)
[ 27.239284] amdgpu 000d:03:00.0: amdgpu: SMU driver if version not matched
[ 27.247346] amdgpu 000d:03:00.0: amdgpu: use vbios provided pptable
[ 27.306402] amdgpu 000d:03:00.0: amdgpu: SMU is initialized successfully!
[ 27.316825] [drm] Display Core initialized with v3.2.207!
[ 27.324625] [drm] DMUB hardware initialized: version=0x02020013
[ 27.751598] [drm] kiq ring mec 2 pipe 1 q 0
[ 28.087911] amdgpu 000d:03:00.0: [drm:amdgpu_ring_test_helper
[amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
[ 28.099959] [drm:amdgpu_gfx_enable_kcq [amdgpu]] *ERROR* KCQ enable failed
[ 28.108490] [drm:amdgpu_device_ip_init [amdgpu]] *ERROR* hw_init of
IP block <gfx_v10_0> failed -110
[ 28.119290] amdgpu 000d:03:00.0: amdgpu: amdgpu_device_ip_init failed
[ 28.126875] amdgpu 000d:03:00.0: amdgpu: Fatal error during GPU init
[ 28.134482] amdgpu 000d:03:00.0: amdgpu: amdgpu: finishing device.
[ 28.144167] [drm] DSC precompute is not needed.
[ 28.203259] amdgpu 000d:03:00.0: amdgpu: free PSP TMR buffer
[ 29.522686] amdgpu: probe of 000d:03:00.0 failed with error -110
[ 29.533000] INFO: trying to register non-static key.
[ 29.539074] The code is fine but needs lockdep annotation, or maybe
[ 29.546370] you didn't initialize this object before use?
[ 29.552748] turning off the locking correctness validator.
[ 29.559182] CPU: 27 PID: 926 Comm: systemd-udevd Not tainted
6.1.0-0.rc1.20221018gitbb1a1146467a.16.fc38.aarch64 #1
[ 29.570673] Hardware name: ADLINK AVA Developer Platform/AVA
Developer Platform, BIOS TianoCore 2.04.100.07 (SYS: 2.06.20220308)
09/08/2022
[ 29.584253] Call trace:
[ 29.587659] dump_backtrace+0xe8/0x140
[ 29.592382] show_stack+0x20/0x50
[ 29.596699] dump_stack_lvl+0x88/0xb4
[ 29.601332] dump_stack+0x18/0x34
[ 29.605588] register_lock_class+0x470/0x4a0
[ 29.610800] __lock_acquire+0x68/0x9c0
[ 29.615509] lock_acquire.part.0+0xe0/0x214
[ 29.620674] lock_acquire+0xa8/0x20c
[ 29.625190] _raw_spin_lock+0x60/0xc4
[ 29.629761] drm_sched_fini+0x94/0xc0 [gpu_sched]
[ 29.635387] amdgpu_fence_driver_sw_fini+0x120/0x124 [amdgpu]
[ 29.642652] amdgpu_device_fini_sw+0x28/0x230 [amdgpu]
[ 29.649262] amdgpu_driver_release_kms+0x24/0x40 [amdgpu]
[ 29.656079] drm_dev_put.part.0+0x88/0xc0
[ 29.661004] devm_drm_dev_init_release+0x1c/0x30
[ 29.666601] devm_action_release+0x1c/0x2c
[ 29.671629] devres_release_all+0xb0/0x110
[ 29.676632] device_unbind_cleanup+0x20/0x70
[ 29.681798] really_probe+0x208/0x3e0
[ 29.686416] __driver_probe_device+0x84/0x190
[ 29.691718] driver_probe_device+0x44/0x120
[ 29.696839] __driver_attach+0x104/0x200
[ 29.701640] bus_for_each_dev+0x6c/0xac
[ 29.706336] driver_attach+0x2c/0x40
[ 29.710863] bus_add_driver+0x184/0x240
[ 29.715612] driver_register+0x80/0x13c
[ 29.720365] __pci_register_driver+0x68/0x80
[ 29.725501] amdgpu_init+0x78/0x1000 [amdgpu]
[ 29.731239] do_one_initcall+0x94/0x45c
[ 29.736003] do_init_module+0x50/0x204
[ 29.740657] load_module+0x9b8/0xb20
[ 29.745092] __do_sys_init_module+0x128/0x144
[ 29.750270] __arm64_sys_init_module+0x24/0x30
[ 29.755550] invoke_syscall+0x78/0x100
[ 29.760137] el0_svc_common.constprop.0+0x104/0x124
[ 29.765827] do_el0_svc+0x34/0x4c
[ 29.769887] el0_svc+0x50/0x140
[ 29.773762] el0t_64_sync_handler+0xf4/0x120
[ 29.778757] el0t_64_sync+0x190/0x194
[ 29.783216] Unable to handle kernel NULL pointer dereference at
virtual address 00000000000000d0
[ 29.792821] Mem abort info:
[ 29.796349] ESR = 0x0000000096000044
[ 29.800819] EC = 0x25: DABT (current EL), IL = 32 bits
[ 29.806926] SET = 0, FnV = 0
[ 29.810724] EA = 0, S1PTW = 0
[ 29.814598] FSC = 0x04: level 0 translation fault
[ 29.820167] Data abort info:
[ 29.823735] ISV = 0, ISS = 0x00000044
[ 29.828251] CM = 0, WnR = 1
[ 29.831909] user pgtable: 4k pages, 48-bit VAs, pgdp=000008002bb15000
[ 29.839085] [00000000000000d0] pgd=0000000000000000, p4d=0000000000000000
[ 29.846578] Internal error: Oops: 0000000096000044 [#1] SMP
[ 29.852825] Modules linked in: amdgpu(+) bridge raid1 stp llc video
gpu_sched drm_buddy crct10dif_ce polyval_ce polyval_generic ghash_ce
sbsa_gwdt drm_display_helper cec nvme ixgbe nvme_core igb ast
nvme_common drm_vram_helper drm_ttm_helper mdio ttm xgene_hwmon
gpio_dwapb onboard_usb_hub scsi_dh_rdac scsi_dh_emc scsi_dh_alua
ip6_tables ip_tables dm_multipath i2c_dev fuse
[ 29.887379] CPU: 27 PID: 926 Comm: systemd-udevd Not tainted
6.1.0-0.rc1.20221018gitbb1a1146467a.16.fc38.aarch64 #1
[ 29.898528] Hardware name: ADLINK AVA Developer Platform/AVA
Developer Platform, BIOS TianoCore 2.04.100.07 (SYS: 2.06.20220308)
09/08/2022
[ 29.911770] pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 29.919454] pc : drm_sched_fini+0xa0/0xc0 [gpu_sched]
[ 29.925225] lr : drm_sched_fini+0x94/0xc0 [gpu_sched]
[ 29.930979] sp : ffff80000cfc36c0
[ 29.934981] x29: ffff80000cfc36c0 x28: 0000000000000000 x27: 0000000000000000
[ 29.942812] x26: ffff80000cfc3c20 x25: 0000000000000000 x24: ffff07ffbb108258
[ 29.950635] x23: ffff07ffbb10c5f0 x22: ffff07ffbb10c740 x21: ffff07ffbb10c5e8
[ 29.958455] x20: 0000000000000001 x19: ffff07ffbb10c790 x18: 0000000000000002
[ 29.966263] x17: 0000000000000001 x16: 0000000000000004 x15: 0000000000000000
[ 29.974052] x14: 0000000000000000 x13: 0000000000000020 x12: 0000000000000000
[ 29.981825] x11: 00000000ffffbfff x10: ffff080f77ed9580 x9 : ffffd6c28065447c
[ 29.989600] x8 : ffff07ffbb10c758 x7 : c0000000ffffbfff x6 : 000000000015ffa8
[ 29.997381] x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffffd6c282744008
[ 30.005171] x2 : ffff314cb1337000 x1 : 0000000000000000 x0 : 0000000000000000
[ 30.005175] Call trace:
[ 30.005177] drm_sched_fini+0xa0/0xc0 [gpu_sched]
[ 30.005189] amdgpu_fence_driver_sw_fini+0x120/0x124 [amdgpu]
[ 30.032331] amdgpu_device_fini_sw+0x28/0x230 [amdgpu]
[ 30.040987] amdgpu_driver_release_kms+0x24/0x40 [amdgpu]
[ 30.047567] drm_dev_put.part.0+0x88/0xc0
[ 30.052257] devm_drm_dev_init_release+0x1c/0x30
[ 30.057552] devm_action_release+0x1c/0x2c
[ 30.062322] devres_release_all+0xb0/0x110
[ 30.067089] device_unbind_cleanup+0x20/0x70
[ 30.072027] really_probe+0x208/0x3e0
[ 30.076358] __driver_probe_device+0x84/0x190
[ 30.081383] driver_probe_device+0x44/0x120
[ 30.086235] __driver_attach+0x104/0x200
[ 30.090825] bus_for_each_dev+0x6c/0xac
[ 30.095329] driver_attach+0x2c/0x40
[ 30.099569] bus_add_driver+0x184/0x240
[ 30.104071] driver_register+0x80/0x13c
[ 30.108570] __pci_register_driver+0x68/0x80
[ 30.113508] amdgpu_init+0x78/0x1000 [amdgpu]
[ 30.119042] do_one_initcall+0x94/0x45c
[ 30.123548] do_init_module+0x50/0x204
[ 30.127967] load_module+0x9b8/0xb20
[ 30.132208] __do_sys_init_module+0x128/0x144
[ 30.137230] __arm64_sys_init_module+0x24/0x30
[ 30.142341] invoke_syscall+0x78/0x100
[ 30.146755] el0_svc_common.constprop.0+0x104/0x124
[ 30.152301] do_el0_svc+0x34/0x4c
[ 30.156283] el0_svc+0x50/0x140
[ 30.160091] el0t_64_sync_handler+0xf4/0x120
[ 30.165028] el0t_64_sync+0x190/0x194
[ 30.169353] Code: 94000de1 f9400261 eb13003f 540000a0 (39034034)
[ 30.176117] ---[ end trace 0000000000000000 ]---
More information about the dri-devel
mailing list