powerplay change breaks driver
Tom St Denis
tom.stdenis at amd.com
Mon Sep 25 18:26:24 UTC 2017
To narrow things down: it's likely something in the CZ (Carrizo) code paths,
as it still crashes with the Polaris10 removed.
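
For what it's worth, the oops in the log quoted below (RIP and CR2 both null, error code 0010, "Bad RIP value") is consistent with a call through a NULL function pointer during teardown after the failed init. Here is a minimal sketch of the kind of guard that avoids that class of crash when a backend function table hangs off a manager struct. All names in it (struct backend_funcs, struct manager, manager_init, manager_fini) are hypothetical and for illustration only; this is not the actual powerplay or DC code, and I have not confirmed which pointer is NULL here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical backend function table; names are illustrative only. */
struct backend_funcs {
	int (*init)(void *ctx);
	void (*fini)(void *ctx);
};

/* Hypothetical manager that owns a pointer to the backend table. */
struct manager {
	const struct backend_funcs *funcs;
	void *ctx;
};

/* Run the init hook only if both the table and the hook are set. */
static int manager_init(struct manager *m)
{
	if (m == NULL || m->funcs == NULL || m->funcs->init == NULL)
		return -EINVAL;
	return m->funcs->init(m->ctx);
}

/*
 * Teardown must tolerate a partially initialised manager, since it can
 * run on the error path after init has already failed.
 */
static void manager_fini(struct manager *m)
{
	if (m != NULL && m->funcs != NULL && m->funcs->fini != NULL)
		m->funcs->fini(m->ctx);
}
```

With guards like these, an unpopulated table makes init return an error and turns teardown into a no-op instead of jumping to address 0.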
Tom
On 25/09/17 01:55 PM, Tom St Denis wrote:
> This change
>
> commit f96306921d5e346ebc82c7c51ae6e0b736e5b425
> Author: Rex Zhu <Rex.Zhu at amd.com>
> Date: Wed Sep 20 14:44:55 2017 +0800
>
> drm/amd/powerplay: refine powerplay code.
>
> delete struct smumgr, put smu backend function table
> in struct hwmgr
>
> Change-Id: I7b73ef062b147b4e7199105a3c101f6c8038cc57
> Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
> Signed-off-by: Rex Zhu <Rex.Zhu at amd.com>
>
>
> Results in these dmesg error messages on my Carrizo + Polaris10 setup:
>
> [ 24.237785] [drm] amdgpu kernel modesetting enabled.
> [ 24.237814] checking generic (c0000000 7e9000) vs hw (e0000000 10000000)
> [ 24.237864] amdgpu 0000:00:01.0: enabling device (0006 -> 0007)
> [ 24.238366] [drm] initializing kernel modesetting (CARRIZO
> 0x1002:0x9874 0x1002:0x1E10 0xE1).
> [ 24.238394] [drm] register mmio base: 0xD1300000
> [ 24.238394] [drm] register mmio size: 262144
> [ 24.238463] ACPI Error: [\_SB_.ALIB] Namespace lookup failure,
> AE_NOT_FOUND (20170531/psargs-364)
> [ 24.238497] ACPI Error: Method parse/execution failed
> \_SB.PCI0.VGA.ATC0, AE_NOT_FOUND (20170531/psparse-550)
> [ 24.238528] ACPI Error: Method parse/execution failed
> \_SB.PCI0.VGA.ATCS, AE_NOT_FOUND (20170531/psparse-550)
> [ 24.238558] [drm] UVD is enabled in physical mode
> [ 24.238561] [drm] VCE enabled in physical mode
> [ 24.250365] ATOM BIOS: 109-C95010-001
> [ 24.250381] [drm] GPU post is not needed
> [ 24.250407] [drm] vm size is 64 GB, block size is 13-bit, fragment
> size is 9-bit
> [ 24.250412] amdgpu 0000:00:01.0: VRAM: 512M 0x000000F400000000 -
> 0x000000F41FFFFFFF (512M used)
> [ 24.250413] amdgpu 0000:00:01.0: GTT: 1024M 0x0000000000000000 -
> 0x000000003FFFFFFF
> [ 24.250420] [drm] Detected VRAM RAM=512M, BAR=512M
> [ 24.250421] [drm] RAM width 64bits UNKNOWN
> [ 24.250795] [TTM] Zone kernel: Available graphics memory: 3846244 kiB
> [ 24.250797] [TTM] Zone dma32: Available graphics memory: 2097152 kiB
> [ 24.250797] [TTM] Initializing pool allocator
> [ 24.250801] [TTM] Initializing DMA pool allocator
> [ 24.250844] [drm] amdgpu: 512M of VRAM memory ready
> [ 24.250845] [drm] amdgpu: 3072M of GTT memory ready.
> [ 24.250860] [drm] GART: num cpu pages 262144, num gpu pages 262144
> [ 24.250970] [drm] PCIE GART of 1024M enabled (table at
> 0x000000F400040000).
> [ 24.251017] amdgpu 0000:00:01.0: amdgpu: using MSI.
> [ 24.251034] [drm] amdgpu: irq initialized.
> [ 24.251037] amdgpu: [powerplay] amdgpu: powerplay sw initialized
> [ 24.254140] [drm] Chained IB support enabled!
> [ 24.257056] amdgpu 0000:00:01.0: fence driver on ring 0 use gpu addr
> 0x0000000000400080, cpu addr 0xffffc9000105d080
> [ 24.257196] amdgpu 0000:00:01.0: fence driver on ring 1 use gpu addr
> 0x0000000000400100, cpu addr 0xffffc9000105d100
> [ 24.257922] amdgpu 0000:00:01.0: fence driver on ring 2 use gpu addr
> 0x0000000000400180, cpu addr 0xffffc9000105d180
> [ 24.258053] amdgpu 0000:00:01.0: fence driver on ring 3 use gpu addr
> 0x0000000000400200, cpu addr 0xffffc9000105d200
> [ 24.258115] amdgpu 0000:00:01.0: fence driver on ring 4 use gpu addr
> 0x0000000000400280, cpu addr 0xffffc9000105d280
> [ 24.258146] amdgpu 0000:00:01.0: fence driver on ring 5 use gpu addr
> 0x0000000000400300, cpu addr 0xffffc9000105d300
> [ 24.258353] amdgpu 0000:00:01.0: fence driver on ring 6 use gpu addr
> 0x0000000000400380, cpu addr 0xffffc9000105d380
> [ 24.258426] amdgpu 0000:00:01.0: fence driver on ring 7 use gpu addr
> 0x0000000000400400, cpu addr 0xffffc9000105d400
> [ 24.258484] amdgpu 0000:00:01.0: fence driver on ring 8 use gpu addr
> 0x0000000000400480, cpu addr 0xffffc9000105d480
> [ 24.258528] amdgpu 0000:00:01.0: fence driver on ring 9 use gpu addr
> 0x0000000000400520, cpu addr 0xffffc9000105d520
> [ 24.260159] amdgpu 0000:00:01.0: fence driver on ring 10 use gpu addr
> 0x00000000004005a0, cpu addr 0xffffc9000105d5a0
> [ 24.260508] amdgpu 0000:00:01.0: fence driver on ring 11 use gpu addr
> 0x0000000000400620, cpu addr 0xffffc9000105d620
> [ 24.261591] [drm] Found UVD firmware Version: 1.91 Family ID: 11
> [ 24.262451] amdgpu 0000:00:01.0: fence driver on ring 12 use gpu addr
> 0x000000f400296560, cpu addr 0xffffc90003442560
> [ 24.263350] [drm] Found VCE firmware Version: 52.4 Binary ID: 3
> [ 24.263819] amdgpu 0000:00:01.0: fence driver on ring 13 use gpu addr
> 0x0000000000400720, cpu addr 0xffffc9000105d720
> [ 24.263921] amdgpu 0000:00:01.0: fence driver on ring 14 use gpu addr
> 0x00000000004007a0, cpu addr 0xffffc9000105d7a0
> [ 24.264438] amdgpu: [powerplay] Fail to get clock table from SMU!
> [ 24.264440] amdgpu: [powerplay] amdgpu: powerplay initialization failed
> [ 24.264467] [drm] DAL is enabled
> [ 24.264835] [drm] DC: create_links: connectors_num: physical:3,
> virtual:0
> [ 24.264839] [drm] Connector[0] description:signal 32
> [ 24.264842] [drm] Using channel: CHANNEL_ID_DDC1 [1]
> [ 24.264851] [drm] Connector[1] description:signal 4
> [ 24.264853] [drm] Using channel: CHANNEL_ID_DDC2 [2]
> [ 24.264860] [drm] Connector[2] description:signal 4
> [ 24.264862] [drm] Using channel: CHANNEL_ID_DDC3 [3]
> [ 24.564284] [drm:hwss_wait_for_blank_complete [amdgpu]] *ERROR* DC:
> failed to blank crtc!
> [ 24.564329] [drm] Display Core initialized
> [ 24.564332] [drm] amdgpu: freesync_module init done ffff88021048afe0.
> [ 24.564564] [drm] link=0, dc_sink_in= (null) is now
> Disconnected
> [ 24.564565] [drm] DCHPD: connector_id=0: dc_sink didn't change.
> [ 24.564624] [drm] link=1, dc_sink_in= (null) is now
> Disconnected
> [ 24.564624] [drm] DCHPD: connector_id=1: dc_sink didn't change.
> [ 24.564738] [drm] link=2, dc_sink_in= (null) is now
> Disconnected
> [ 24.564739] [drm] DCHPD: connector_id=2: dc_sink didn't change.
> [ 24.564751] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [ 24.564752] [drm] Driver supports precise vblank timestamp query.
> [ 24.564752] [drm] KMS initialized.
> [ 24.566110] [drm] ring test on 0 succeeded in 13 usecs
> [ 24.755765] [drm:gfx_v8_0_kiq_resume [amdgpu]] *ERROR* KCQ enable
> failed (scratch(0xC040)=0xCAFEDEAD)
> [ 24.755819] [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP
> block <gfx_v8_0> failed -22
> [ 24.755839] amdgpu 0000:00:01.0: amdgpu_init failed
> [ 24.756271] BUG: unable to handle kernel NULL pointer dereference at
> (null)
> [ 24.756302] IP: (null)
> [ 24.756312] PGD 2134b3067
> [ 24.756312] P4D 2134b3067
> [ 24.756320] PUD 0
>
> [ 24.756340] Oops: 0010 [#1] SMP
> [ 24.756349] Modules linked in: amdgpu(+) chash ttm ax88179_178a
> usbnet xhci_pci xhci_hcd efivarfs
> [ 24.756380] CPU: 3 PID: 3021 Comm: modprobe Not tainted 4.13.0-rc5+ #33
> [ 24.756396] Hardware name: AMD Myrtle/Myrtle, BIOS TMY1100A 03/23/2016
> [ 24.756413] task: ffff8802132744c0 task.stack: ffffc90000fd0000
> [ 24.756427] RIP: 0010: (null)
> [ 24.756437] RSP: 0018:ffffc90000fd3908 EFLAGS: 00010202
> [ 24.756450] RAX: ffff88021048a460 RBX: ffff8802100258a0 RCX:
> 000000018020000d
> [ 24.756466] RDX: 000000018020000e RSI: 0000000000005c02 RDI:
> ffff88021048a5a0
> [ 24.756482] RBP: ffffc90000fd3928 R08: ffff880210f9e580 R09:
> 000000018020000d
> [ 24.756499] R10: ffffc90000fd3948 R11: ffffea0008525e00 R12:
> 0000000000005c02
> [ 24.756516] R13: ffff88021365b690 R14: ffff880211db0040 R15:
> ffff880211db2f30
> [ 24.756534] FS: 00007ffa8be38700(0000) GS:ffff88021ed80000(0000)
> knlGS:0000000000000000
> [ 24.756554] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 24.756569] CR2: 0000000000000000 CR3: 0000000210030000 CR4:
> 00000000001406e0
> [ 24.756586] Call Trace:
> [ 24.756745] ? destroy+0x31/0x100 [amdgpu]
> [ 24.756822] dal_i2caux_destruct+0x5d/0x90 [amdgpu]
> [ 24.756875] destroy+0x15/0x30 [amdgpu]
> [ 24.756925] dal_i2caux_destroy+0x1b/0x30 [amdgpu]
> [ 24.756977] destruct+0x90/0x140 [amdgpu]
> [ 24.757028] dc_destroy+0x10/0x30 [amdgpu]
> [ 24.757083] amdgpu_dm_fini+0x62/0x70 [amdgpu]
> [ 24.757137] dm_hw_fini+0x1d/0x30 [amdgpu]
> [ 24.757183] amdgpu_fini+0xe8/0x330 [amdgpu]
> [ 24.757229] amdgpu_device_init+0xe5a/0x1560 [amdgpu]
> [ 24.757245] ? kmalloc_order_trace+0x29/0xd0
> [ 24.757290] ? amdgpu_driver_load_kms+0x53/0x200 [amdgpu]
> [ 24.757338] amdgpu_driver_load_kms+0x78/0x200 [amdgpu]
> [ 24.757353] drm_dev_register+0x141/0x1d0
> [ 24.757393] amdgpu_pci_probe+0x113/0x140 [amdgpu]
> [ 24.757406] local_pci_probe+0x40/0xa0
> [ 24.757416] pci_device_probe+0xaa/0x130
> [ 24.757426] driver_probe_device+0x23e/0x2d0
> [ 24.757437] __driver_attach+0x96/0xa0
> [ 24.757446] ? driver_probe_device+0x2d0/0x2d0
> [ 24.757457] bus_for_each_dev+0x5b/0x90
> [ 24.757467] driver_attach+0x19/0x20
> [ 24.757476] bus_add_driver+0x11c/0x220
> [ 24.757485] driver_register+0x5b/0xd0
> [ 24.757495] __pci_register_driver+0x47/0x50
> [ 24.757532] amdgpu_init+0x88/0x9b [amdgpu]
> [ 24.757544] ? 0xffffffffa030a000
> [ 24.757554] do_one_initcall+0x3e/0x160
> [ 24.757566] ? __vunmap+0x7c/0xb0
> [ 24.757577] ? kfree+0x147/0x160
> [ 24.757587] ? kmem_cache_alloc_trace+0x33/0x150
> [ 24.757602] do_init_module+0x5a/0x1f1
> [ 24.757614] load_module+0x2329/0x28d0
> [ 24.758259] ? kernel_read_file+0x19e/0x1c0
> [ 24.758898] SYSC_finit_module+0xba/0xc0
> [ 24.759524] ? SYSC_finit_module+0xba/0xc0
> [ 24.760206] SyS_finit_module+0x9/0x10
> [ 24.760835] entry_SYSCALL_64_fastpath+0x13/0x94
> [ 24.761450] RIP: 0033:0x7ffa8b310219
> [ 24.762137] RSP: 002b:00007ffe64b86b18 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000139
> [ 24.762851] RAX: ffffffffffffffda RBX: 00000055ee325090 RCX:
> 00007ffa8b310219
> [ 24.763487] RDX: 0000000000000000 RSI: 00000055edf2d2a6 RDI:
> 0000000000000005
> [ 24.764116] RBP: 00000055ee326f50 R08: 0000000000000000 R09:
> 0000000000000000
> [ 24.764716] R10: 0000000000000005 R11: 0000000000000246 R12:
> 00000055ee3252f0
> [ 24.765298] R13: 00007ffe64b86ad8 R14: 00007ffe64b86ae0 R15:
> 0000000000000000
> [ 24.765878] Code: Bad RIP value.
> [ 24.766464] RIP: (null) RSP: ffffc90000fd3908
> [ 24.767036] CR2: 0000000000000000
> [ 24.767717] ---[ end trace 636f871b29b747e7 ]---