After Vega 56/64 GPU hang I am unable to reboot the system

StDenis, Tom Tom.StDenis at amd.com
Wed Dec 19 20:51:52 UTC 2018


No gfx ring? You can specify a ring name for --waves; that should be in the docs.

It's not in the web docs, but it is in the help text:

https://cgit.freedesktop.org/amd/umr/tree/src/app/main.c#n643

I'll fix the web docs when I'm in next.

Tom

On December 19, 2018 3:21:25 PM EST, "Grodzovsky, Andrey" <Andrey.Grodzovsky at amd.com> wrote:

+Tom

Andrey


On 12/19/2018 01:35 PM, Mikhail Gavrilov wrote:
On Tue, 18 Dec 2018 at 00:08, Grodzovsky, Andrey
<Andrey.Grodzovsky at amd.com> wrote:
 Please install UMR and dump the gfx ring contents and waves after the hang
 happens.

 UMR is at https://cgit.freedesktop.org/amd/umr/
 Waves dump:
 sudo umr -O verbose,halt_waves -wa
 GFX ring dump:
 sudo umr -O verbose,follow -R gfx[.]

 Andrey

 Thanks for the response.

 What options should I specify on the kernel command line?

 On my setup, `umr` prints `Could not open ring debugfs file` and then
 crashes with a segmentation fault, even though I am sure debugfs is enabled.

 $ sudo umr -O verbose,halt_waves -wa
 Cannot seek to MMIO address: Bad file descriptor
 [ERROR]: Could not open ring debugfs fileSegmentation fault


 # ls /sys/kernel/debug/dri/0/
   amdgpu_dm_dtn_log        amdgpu_ring_comp_1.1.0     amdgpu_vram_mm
   amdgpu_evict_gtt         amdgpu_ring_comp_1.1.1     amdgpu_wave
   amdgpu_evict_vram        amdgpu_ring_comp_1.2.0     clients
   amdgpu_fence_info        amdgpu_ring_comp_1.2.1     crtc-0
   amdgpu_firmware_info     amdgpu_ring_comp_1.3.0     crtc-1
   amdgpu_gca_config        amdgpu_ring_comp_1.3.1     crtc-2
   amdgpu_gds_mm            amdgpu_ring_gfx            crtc-3
   amdgpu_gem_info          amdgpu_ring_kiq_2.1.0      crtc-4
   amdgpu_gpr               amdgpu_ring_sdma0          crtc-5
   amdgpu_gpu_recover       amdgpu_ring_sdma1          DP-1
   amdgpu_gtt_mm           'amdgpu_ring_uvd<0>'        DP-2
   amdgpu_gws_mm           'amdgpu_ring_uvd_enc0<0>'   DP-3
   amdgpu_iomem            'amdgpu_ring_uvd_enc1<0>'   framebuffer
   amdgpu_oa_mm             amdgpu_ring_vce0           gem_names
   amdgpu_pm_info           amdgpu_ring_vce1           HDMI-A-1
   amdgpu_regs              amdgpu_ring_vce2           HDMI-A-2
   amdgpu_regs_didt         amdgpu_sa_info             HDMI-A-3
   amdgpu_regs_pcie         amdgpu_sensors             internal_clients
   amdgpu_regs_smc          amdgpu_test_ib             name
   amdgpu_ring_comp_1.0.0   amdgpu_vbios               state
   amdgpu_ring_comp_1.0.1   amdgpu_vram                ttm_page_pool
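
 The ring names available on a given system can be read off a listing like
 the one above. The following is a minimal sketch, not part of umr: the
 helper name `list_rings` is hypothetical, and it assumes that ring debugfs
 files are named `amdgpu_ring_<name>` as they appear in this listing.

```shell
#!/bin/sh
# Hypothetical helper: print the ring names found in a DRI debugfs directory.
# Assumption: each ring is exposed as a file named amdgpu_ring_<name>.
list_rings() {
    dir="${1:-/sys/kernel/debug/dri/0}"
    for f in "$dir"/amdgpu_ring_*; do
        # Skip the unexpanded pattern when no ring file matches.
        [ -e "$f" ] || continue
        name="${f##*/}"                        # strip the directory part
        printf '%s\n' "${name#amdgpu_ring_}"   # strip the common prefix
    done
}
```

 Reading debugfs normally requires root, so the function would have to be
 run from a root shell, e.g. `list_rings /sys/kernel/debug/dri/0`.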




 --
 Best Regards,
 Mike Gavrilov.
