After Vega 56/64 GPU hang I unable reboot system
Mikhail Gavrilov
mikhail.v.gavrilov at gmail.com
Thu Dec 20 16:07:43 UTC 2018
On Thu, 20 Dec 2018 at 19:19, StDenis, Tom <Tom.StDenis at amd.com> wrote:
>
> Ya I was right. With a plain build I can access the files just fine.
>
>
>
> I did manage to get into a weird shell where I couldn't cat
> amdgpu_gca_config from bash though after a reboot (had updates pending)
> it works fine.
>
> If you can't cat those files then neither can umr.
>
> So NOTABUG :-)
>
I am very happy for you. But what about me?
I have no idea how to make these files available on my system.
And of course I tried rebooting and running cat amdgpu_gca_config again
several times, but every time without success.
I also noticed that not every file under /sys/kernel/debug/dri/0/* is
blocked for reading.
I was able to dump the contents of some files into debugfs.txt (see attachments).
List of files available for reading:
amdgpu_evict_gtt
amdgpu_evict_vram
amdgpu_fence_info
amdgpu_firmware_info
amdgpu_gds_mm
amdgpu_gem_info
amdgpu_gpu_recover
amdgpu_gtt_mm
amdgpu_gws_mm
amdgpu_oa_mm
amdgpu_pm_info
amdgpu_sa_info
amdgpu_test_ib
amdgpu_vbios
amdgpu_vram_mm
clients
framebuffer
gem_names
internal_clients
name
state
ttm_page_pool
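
For anyone who wants to build the same list without reading it off the head output, here is a minimal sketch. It only opens each entry and closes it again, because the failures above already happen at open time and reading some debugfs entries may have side effects. The function name classify_entries and the group names are my own, not part of any tool.

```python
import errno
import os


def classify_entries(directory):
    """Group the entries of a directory by what happens when each is opened."""
    groups = {"readable": [], "not_permitted": [], "directory": [], "other_error": []}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isdir(path):
            groups["directory"].append(name)
            continue
        try:
            # Open read-only and close at once; no data is read.
            fd = os.open(path, os.O_RDONLY)
            os.close(fd)
            groups["readable"].append(name)
        except OSError as exc:
            # EPERM is "Operation not permitted", EACCES is "Permission denied".
            if exc.errno in (errno.EPERM, errno.EACCES):
                groups["not_permitted"].append(name)
            else:
                groups["other_error"].append(name)
    return groups
```

Called as root with classify_entries("/sys/kernel/debug/dri/0"), the "readable" group should correspond to the list above and "not_permitted" to the files that fail with "Operation not permitted".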
Could some kernel options restrict access to files in debugfs (for
example to amdgpu_gca_config)?
If so, which options should I pay attention to?
I have no more ideas. I tried everything.
--
Best Regards,
Mike Gavrilov.
-------------- next part --------------
# head /sys/kernel/debug/dri/0/*
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_dm_dtn_log' for reading: Operation not permitted
==> /sys/kernel/debug/dri/0/amdgpu_evict_gtt <==
(0)
==> /sys/kernel/debug/dri/0/amdgpu_evict_vram <==
(0)
==> /sys/kernel/debug/dri/0/amdgpu_fence_info <==
--- ring 0 (gfx) ---
Last signaled fence 0x00000216
Last emitted 0x00000216
Last preempted 0x00000000
Last reset 0x00000000
Last both 0x00000000
--- ring 1 (comp_1.0.0) ---
Last signaled fence 0x00000009
Last emitted 0x00000009
--- ring 2 (comp_1.1.0) ---
==> /sys/kernel/debug/dri/0/amdgpu_firmware_info <==
VCE feature version: 0, firmware version: 0x37030400
UVD feature version: 0, firmware version: 0x01571100
MC feature version: 0, firmware version: 0x00000000
ME feature version: 40, firmware version: 0x00000099
PFP feature version: 40, firmware version: 0x000000ae
CE feature version: 40, firmware version: 0x0000004d
RLC feature version: 0, firmware version: 0x00000058
RLC SRLC feature version: 0, firmware version: 0x00000000
RLC SRLG feature version: 0, firmware version: 0x00000000
RLC SRLS feature version: 0, firmware version: 0x00000000
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_gca_config' for reading: Operation not permitted
==> /sys/kernel/debug/dri/0/amdgpu_gds_mm <==
0x0000000000000000-0x0000000000001000: 4096: used
0x0000000000001000-0x0000000000010000: 61440: free
total: 65536, used 4096 free 61440
==> /sys/kernel/debug/dri/0/amdgpu_gem_info <==
pid 2219 command Xwayland:
0x00000001: 131072 byte CPU CPU_ACCESS_REQUIRED CPU_GTT_USWC
0x00000002: 4096 byte CPU CPU_ACCESS_REQUIRED
0x00000003: 131072 byte CPU CPU_ACCESS_REQUIRED CPU_GTT_USWC
0x00000004: 131072 byte CPU CPU_GTT_USWC
0x00000005: 131072 byte CPU CPU_ACCESS_REQUIRED CPU_GTT_USWC
pid 2219 command Xwayland:
0x00000001: 131072 byte CPU CPU_ACCESS_REQUIRED CPU_GTT_USWC
0x00000002: 4096 byte CPU CPU_ACCESS_REQUIRED
0x00000003: 131072 byte CPU CPU_ACCESS_REQUIRED CPU_GTT_USWC
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_gpr' for reading: Operation not permitted
==> /sys/kernel/debug/dri/0/amdgpu_gpu_recover <==
gpu recover
==> /sys/kernel/debug/dri/0/amdgpu_gtt_mm <==
0x0000000000000400-0x0000000000000401: 1: used
0x0000000000000401-0x0000000000000405: 4: used
0x0000000000000405-0x0000000000000447: 66: used
0x0000000000000447-0x0000000000000449: 2: used
0x0000000000000449-0x000000000000044b: 2: used
0x000000000000044b-0x000000000000044d: 2: used
0x000000000000044d-0x000000000000044f: 2: used
0x000000000000044f-0x0000000000000451: 2: used
0x0000000000000451-0x0000000000000453: 2: used
0x0000000000000453-0x0000000000000455: 2: used
==> /sys/kernel/debug/dri/0/amdgpu_gws_mm <==
0x0000000000000000-0x0000000000000004: 4: used
0x0000000000000004-0x0000000000000040: 60: free
total: 64, used 4 free 60
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_iomem' for reading: Operation not permitted
==> /sys/kernel/debug/dri/0/amdgpu_oa_mm <==
0x0000000000000000-0x0000000000000004: 4: used
0x0000000000000004-0x0000000000000010: 12: free
total: 16, used 4 free 12
==> /sys/kernel/debug/dri/0/amdgpu_pm_info <==
Clock Gating Flags Mask: 0x888200
Graphics Medium Grain Clock Gating: Off
Graphics Medium Grain memory Light Sleep: Off
Graphics Coarse Grain Clock Gating: Off
Graphics Coarse Grain memory Light Sleep: Off
Graphics Coarse Grain Tree Shader Clock Gating: Off
Graphics Coarse Grain Tree Shader Light Sleep: Off
Graphics Command Processor Light Sleep: Off
Graphics Run List Controller Light Sleep: Off
Graphics 3D Coarse Grain Clock Gating: Off
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_regs' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_regs_didt' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_regs_pcie' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_regs_smc' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_comp_1.0.0' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_comp_1.0.1' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_comp_1.1.0' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_comp_1.1.1' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_comp_1.2.0' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_comp_1.2.1' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_comp_1.3.0' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_comp_1.3.1' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_gfx' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_kiq_2.1.0' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_sdma0' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_sdma1' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_uvd<0>' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_uvd_enc0<0>' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_uvd_enc1<0>' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_vce0' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_vce1' for reading: Operation not permitted
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_ring_vce2' for reading: Operation not permitted
==> /sys/kernel/debug/dri/0/amdgpu_sa_info <==
[0x0000bca000 0x0000bcb020] size 4128 protected by 0x00000863 on context 47
[0x0000bcb100 0x0000bcb120] size 32 protected by 0x00000864 on context 47
[0x0000bcb200 0x0000bcc220] size 4128 protected by 0x00000865 on context 47
[0x0000bcc300 0x0000bcc320] size 32 protected by 0x00000866 on context 47
[0x0000bcc400 0x0000bcd420] size 4128 protected by 0x00000867 on context 47
[0x0000bcd500 0x0000bcd520] size 32 protected by 0x00000868 on context 47
[0x0000bcd600 0x0000bce620] size 4128 protected by 0x00000869 on context 47
[0x0000bce700 0x0000bce720] size 32 protected by 0x0000086a on context 47
[0x0000bce800 0x0000bcf820] size 4128 protected by 0x0000086b on context 47
[0x0000bcf900 0x0000bcf920] size 32 protected by 0x0000086c on context 47
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_sensors' for reading: Operation not permitted
==> /sys/kernel/debug/dri/0/amdgpu_test_ib <==
run ib test:
ib ring tests passed.
==> /sys/kernel/debug/dri/0/amdgpu_vbios <==
[binary VBIOS image data, not printable; readable strings in the first lines:]
761295520
10/26/17 23:36
xxx-xxx-xxx VEGA10 PCI_EXPRESS HBM2
GV-RXVEGA64GAMING OC-8GD/F1/062E
(C) 1988-2010, Advanced Micro Devices, Inc. ATOMBIOSBK-AMD VER016.001.001.000.000000 RVG64GO.F1 1475497 400795 GBT_VEGA10_D05001_MBA_A1_HBM_8GB_GAOC\config.h
AMD ATOMBIOS
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_vram' for reading: Operation not permitted
==> /sys/kernel/debug/dri/0/amdgpu_vram_mm <==
0x0000000000000000-0x0000000000000900: 2304: used
0x0000000000000900-0x0000000000000a00: 256: used
0x0000000000000a00-0x0000000000000a01: 1: used
0x0000000000000a01-0x0000000000000a02: 1: used
0x0000000000000a02-0x0000000000000a03: 1: used
0x0000000000000a03-0x0000000000000a04: 1: used
0x0000000000000a04-0x0000000000000a05: 1: used
0x0000000000000a05-0x0000000000000a06: 1: used
0x0000000000000a06-0x0000000000000a1f: 25: used
0x0000000000000a1f-0x0000000000000a20: 1: used
head: cannot open '/sys/kernel/debug/dri/0/amdgpu_wave' for reading: Operation not permitted
==> /sys/kernel/debug/dri/0/clients <==
command pid dev master a uid magic
systemd-logind 998 0 y y 0 0
Xwayland 2219 0 n y 1000 1
Xwayland 2219 0 n y 1000 2
Xwayland 2219 0 n y 1000 3
Xwayland 2219 0 n y 1000 4
Xwayland 2219 0 n y 1000 5
Xwayland 2219 0 n y 1000 6
Xwayland 2219 0 n y 1000 7
Xwayland 2219 0 n y 1000 8
==> /sys/kernel/debug/dri/0/crtc-0 <==
head: error reading '/sys/kernel/debug/dri/0/crtc-0': Is a directory
==> /sys/kernel/debug/dri/0/crtc-1 <==
head: error reading '/sys/kernel/debug/dri/0/crtc-1': Is a directory
==> /sys/kernel/debug/dri/0/crtc-2 <==
head: error reading '/sys/kernel/debug/dri/0/crtc-2': Is a directory
==> /sys/kernel/debug/dri/0/crtc-3 <==
head: error reading '/sys/kernel/debug/dri/0/crtc-3': Is a directory
==> /sys/kernel/debug/dri/0/crtc-4 <==
head: error reading '/sys/kernel/debug/dri/0/crtc-4': Is a directory
==> /sys/kernel/debug/dri/0/crtc-5 <==
head: error reading '/sys/kernel/debug/dri/0/crtc-5': Is a directory
==> /sys/kernel/debug/dri/0/DP-1 <==
head: error reading '/sys/kernel/debug/dri/0/DP-1': Is a directory
==> /sys/kernel/debug/dri/0/DP-2 <==
head: error reading '/sys/kernel/debug/dri/0/DP-2': Is a directory
==> /sys/kernel/debug/dri/0/DP-3 <==
head: error reading '/sys/kernel/debug/dri/0/DP-3': Is a directory
==> /sys/kernel/debug/dri/0/framebuffer <==
framebuffer[59]:
allocated by = gnome-shell
refcount=2
format=XR24 little-endian (0x34325258)
modifier=0x0
size=3840x2160
layers:
size[0]=3840x2160
pitch[0]=15360
offset[0]=0
==> /sys/kernel/debug/dri/0/gem_names <==
name size handles refcount
==> /sys/kernel/debug/dri/0/HDMI-A-1 <==
head: error reading '/sys/kernel/debug/dri/0/HDMI-A-1': Is a directory
==> /sys/kernel/debug/dri/0/HDMI-A-2 <==
head: error reading '/sys/kernel/debug/dri/0/HDMI-A-2': Is a directory
==> /sys/kernel/debug/dri/0/HDMI-A-3 <==
head: error reading '/sys/kernel/debug/dri/0/HDMI-A-3': Is a directory
==> /sys/kernel/debug/dri/0/internal_clients <==
==> /sys/kernel/debug/dri/0/name <==
amdgpu dev=0000:0b:00.0 unique=0000:0b:00.0
==> /sys/kernel/debug/dri/0/state <==
plane[37]: plane-0
crtc=(null)
fb=0
crtc-pos=0x0+0+0
src-pos=0.000000x0.000000+0.000000+0.000000
rotation=1
normalized-zpos=0
color-encoding=ITU-R BT.601 YCbCr
color-range=YCbCr limited range
plane[38]: plane-1
==> /sys/kernel/debug/dri/0/ttm_page_pool <==
pool refills pages freed size
wc 8 0 353
uc 0 0 0
wc dma 0 0 0
uc dma 0 0 0
wc huge 0 0 0
uc huge 0 0 0
head: cannot open '2' for reading: No such file or directory
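
The attached output can also be summarized after the fact. The sketch below is only an illustration: it assumes the three line shapes seen in this log (the "==> path <==" header, "cannot open ... Operation not permitted", and "error reading ... Is a directory"), and the name summarize_head_output is hypothetical.

```python
import re

# Line shapes as they appear in the head output above.
HEADER = re.compile(r"^==> (?P<path>.+) <==$")
DENIED = re.compile(
    r"^head: cannot open '(?P<path>.+)' for reading: Operation not permitted$"
)
IS_DIR = re.compile(r"^head: error reading '(?P<path>.+)': Is a directory$")


def summarize_head_output(text):
    """Split the entries named in a `head dir/*` log into three groups."""
    headers, denied, directories = [], [], []
    buckets = ((HEADER, headers), (DENIED, denied), (IS_DIR, directories))
    for line in text.splitlines():
        line = line.strip()
        for pattern, bucket in buckets:
            match = pattern.match(line)
            if match:
                # Keep only the file name, not the full debugfs path.
                bucket.append(match.group("path").rsplit("/", 1)[-1])
                break
    # head prints a header for directories too, so drop those from the readable set.
    readable = [name for name in headers if name not in directories]
    return {"readable": readable, "not_permitted": denied, "directory": directories}
```

Feeding it the contents of debugfs.txt should give the same readable list as in the message body, with the crtc-*, DP-* and HDMI-A-* entries reported as directories.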