[PATCH V4 17/17] drm/amd/pm: unified lock protections in amdgpu_dpm.c

Christian König christian.koenig at amd.com
Fri Apr 1 08:56:22 UTC 2022


Hi Arthur,

Apart from blacklisting amdgpu, I generally advise SSHing into the
affected system from another computer when you hit a problem like this.
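
A minimal sketch of how the kernel log could be streamed over SSH to the
second computer, so that it survives even if the affected machine hangs
or its display goes blank. The helper name capture_kernel_log, the
user@am64 target and the log file name are hypothetical; it also assumes
that sshd is running on the affected system and that sudo works there
without a password prompt.

```shell
# Hypothetical helper: stream the kernel log of a remote machine to a local file.
# Usage: capture_kernel_log user@host [logfile]
capture_kernel_log() {
    # "dmesg --follow" keeps printing new kernel messages as they arrive;
    # tee shows them locally and keeps a copy on the machine that stays alive.
    ssh "$1" 'sudo dmesg --follow' | tee "${2:-amdgpu-dmesg.log}"
}

# Example (not run here):
#   capture_kernel_log user@am64 amdgpu-dmesg.log
```

With the capture running in one terminal, the module can then be loaded
from a second SSH session, e.g. with the command from the report below
(modprobe amdgpu si_support=1 gpu_recovery=1).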

In addition to what Evan said, I suggest that you enable lockdep
(CONFIG_PROVE_LOCKING) in your kernel configuration. This will yield
warnings in your system log in case of deadlocks or accidentally
forgetting to unlock something.
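
A small sketch for checking whether a given kernel was built with lockdep
checking. It assumes that CONFIG_PROVE_LOCKING is the option that actually
turns the checks on (CONFIG_LOCKDEP_SUPPORT only states that the
architecture can support lockdep), and that the distribution installs the
kernel config under /boot; the helper name check_lockdep_config is
hypothetical.

```shell
# Hypothetical helper: report whether a kernel config file enables lockdep checking.
# Usage: check_lockdep_config /path/to/config
check_lockdep_config() {
    if grep -q '^CONFIG_PROVE_LOCKING=y' "$1"; then
        echo "lockdep enabled"
    else
        echo "lockdep not enabled: rebuild with CONFIG_PROVE_LOCKING=y"
        return 1
    fi
}

# Example (not run here; the config location is an assumption):
#   check_lockdep_config "/boot/config-$(uname -r)"
```

Once such a kernel is running, lockdep reports show up in the kernel log
and can be filtered with something like
dmesg | grep -iE 'locking dependency|recursive locking|unlock balance'
(the exact wording of the reports may differ between kernel versions).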

Regards,
Christian.

On 01.04.22 at 10:49, Arthur Marsh wrote:
> Hi Evan, this is what was logged (filtering for drm and amdgpu) when I
> blacklisted amdgpu then manually did:
>
> modprobe amdgpu si_support=1 gpu_recovery=1
>
> Apr  1 18:31:14 am64 kernel: [    0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.17.0+ root=UUID=39706f53-7c27-4310-b22a-36c7b042d1a1 ro amdgpu.audio=1 amdgpu.si_support=1 radeon.si_support=0 page_owner=on amdgpu.gpu_recovery=1 udev.log-priority=info rd.udev.log-priority=info
> Apr  1 18:31:14 am64 kernel: [    0.059624] Kernel command line: BOOT_IMAGE=/vmlinuz-5.17.0+ root=UUID=39706f53-7c27-4310-b22a-36c7b042d1a1 ro amdgpu.audio=1 amdgpu.si_support=1 radeon.si_support=0 page_owner=on amdgpu.gpu_recovery=1 udev.log-priority=info rd.udev.log-priority=info
>
> Apr  1 18:33:43 am64 kernel: [  245.724485] ACPI: bus type drm_connector registered
> Apr  1 18:33:44 am64 kernel: [  245.945020] [drm] amdgpu kernel modesetting enabled.
> Apr  1 18:33:44 am64 kernel: [  245.945140] amdgpu 0000:01:00.0: vgaarb: deactivate vga console
> Apr  1 18:33:44 am64 kernel: [  245.946413] [drm] initializing kernel modesetting (VERDE 0x1002:0x682B 0x1458:0x22CA 0x87).
> Apr  1 18:33:44 am64 kernel: [  245.946423] amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
> Apr  1 18:33:44 am64 kernel: [  245.946448] [drm] register mmio base: 0xFE8C0000
> Apr  1 18:33:44 am64 kernel: [  245.946451] [drm] register mmio size: 262144
> Apr  1 18:33:44 am64 kernel: [  245.946642] [drm] add ip block number 0 <si_common>
> Apr  1 18:33:44 am64 kernel: [  245.946657] [drm] add ip block number 1 <gmc_v6_0>
> Apr  1 18:33:44 am64 kernel: [  245.946660] [drm] add ip block number 2 <si_ih>
> Apr  1 18:33:44 am64 kernel: [  245.946663] [drm] add ip block number 3 <gfx_v6_0>
> Apr  1 18:33:44 am64 kernel: [  245.946666] [drm] add ip block number 4 <si_dma>
> Apr  1 18:33:44 am64 kernel: [  245.946668] [drm] add ip block number 5 <si_dpm>
> Apr  1 18:33:44 am64 kernel: [  245.946671] [drm] add ip block number 6 <dce_v6_0>
> Apr  1 18:33:44 am64 kernel: [  245.946674] [drm] add ip block number 7 <uvd_v3_1>
> Apr  1 18:33:44 am64 kernel: [  245.990113] [drm] BIOS signature incorrect 20 7
> Apr  1 18:33:44 am64 kernel: [  245.990146] amdgpu 0000:01:00.0: No more image in the PCI ROM
> Apr  1 18:33:44 am64 kernel: [  245.991510] amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
> Apr  1 18:33:44 am64 kernel: [  245.991516] amdgpu: ATOM BIOS: xxx-xxx-xxx
> Apr  1 18:33:44 am64 kernel: [  245.991539] amdgpu 0000:01:00.0: amdgpu: PCIE atomic ops is not supported
> Apr  1 18:33:44 am64 kernel: [  245.991841] [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
> Apr  1 18:33:44 am64 kernel: [  246.045705] amdgpu 0000:01:00.0: amdgpu: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
> Apr  1 18:33:44 am64 kernel: [  246.045719] amdgpu 0000:01:00.0: amdgpu: GART: 1024M 0x000000FF00000000 - 0x000000FF3FFFFFFF
> Apr  1 18:33:44 am64 kernel: [  246.045736] [drm] Detected VRAM RAM=2048M, BAR=256M
> Apr  1 18:33:44 am64 kernel: [  246.045739] [drm] RAM width 128bits DDR3
> Apr  1 18:33:44 am64 kernel: [  246.045825] [drm] amdgpu: 2048M of VRAM memory ready
> Apr  1 18:33:44 am64 kernel: [  246.045829] [drm] amdgpu: 3072M of GTT memory ready.
> Apr  1 18:33:44 am64 kernel: [  246.045854] [drm] GART: num cpu pages 262144, num gpu pages 262144
> Apr  1 18:33:44 am64 kernel: [  246.046180] amdgpu 0000:01:00.0: amdgpu: PCIE GART of 1024M enabled (table at 0x000000F400900000).
> Apr  1 18:33:44 am64 kernel: [  246.084159] [drm] Internal thermal controller with fan control
> Apr  1 18:33:44 am64 kernel: [  246.084180] [drm] amdgpu: dpm initialized
> Apr  1 18:33:44 am64 kernel: [  246.084264] [drm] AMDGPU Display Connectors
> Apr  1 18:33:44 am64 kernel: [  246.084268] [drm] Connector 0:
> Apr  1 18:33:44 am64 kernel: [  246.084270] [drm]   HDMI-A-1
> Apr  1 18:33:44 am64 kernel: [  246.084272] [drm]   HPD1
> Apr  1 18:33:44 am64 kernel: [  246.084274] [drm]   DDC: 0x194c 0x194c 0x194d 0x194d 0x194e 0x194e 0x194f 0x194f
> Apr  1 18:33:44 am64 kernel: [  246.084279] [drm]   Encoders:
> Apr  1 18:33:44 am64 kernel: [  246.084281] [drm]     DFP1: INTERNAL_UNIPHY
> Apr  1 18:33:44 am64 kernel: [  246.084283] [drm] Connector 1:
> Apr  1 18:33:44 am64 kernel: [  246.084285] [drm]   DVI-D-1
> Apr  1 18:33:44 am64 kernel: [  246.084287] [drm]   HPD2
> Apr  1 18:33:44 am64 kernel: [  246.084289] [drm]   DDC: 0x1950 0x1950 0x1951 0x1951 0x1952 0x1952 0x1953 0x1953
> Apr  1 18:33:44 am64 kernel: [  246.084293] [drm]   Encoders:
> Apr  1 18:33:44 am64 kernel: [  246.084295] [drm]     DFP2: INTERNAL_UNIPHY
> Apr  1 18:33:44 am64 kernel: [  246.084297] [drm] Connector 2:
> Apr  1 18:33:44 am64 kernel: [  246.084299] [drm]   VGA-1
> Apr  1 18:33:44 am64 kernel: [  246.084301] [drm]   DDC: 0x1970 0x1970 0x1971 0x1971 0x1972 0x1972 0x1973 0x1973
> Apr  1 18:33:44 am64 kernel: [  246.084305] [drm]   Encoders:
> Apr  1 18:33:44 am64 kernel: [  246.084307] [drm]     CRT1: INTERNAL_KLDSCP_DAC1
> Apr  1 18:33:44 am64 kernel: [  246.135615] [drm] Found UVD firmware Version: 64.0 Family ID: 13
> Apr  1 18:33:44 am64 kernel: [  246.137371] [drm] PCIE gen 2 link speeds already enabled
> Apr  1 18:33:44 am64 kernel: [  246.674277] [drm] UVD initialized successfully.
> Apr  1 18:33:44 am64 kernel: [  246.674849] amdgpu 0000:01:00.0: amdgpu: SE 1, SH per SE 2, CU per SH 5, active_cu_number 8
> Apr  1 18:33:45 am64 kernel: [  247.008964] [drm] Initialized amdgpu 3.46.0 20150101 for 0000:01:00.0 on minor 0
> Apr  1 18:33:45 am64 kernel: [  247.068412] fbcon: amdgpudrmfb (fb0) is primary device
>
> The monitor still went blank, but the magic SysRq sync and reboot worked,
> so the log above could be captured; nothing was logged after its last line.
>
> Regards,
>
> Arthur Marsh.


