After Vega 56/64 GPU hang I am unable to reboot system

Mikhail Gavrilov mikhail.v.gavrilov at gmail.com
Wed Dec 19 18:35:54 UTC 2018


On Tue, 18 Dec 2018 at 00:08, Grodzovsky, Andrey
<Andrey.Grodzovsky at amd.com> wrote:
>
> Please install UMR and dump gfx ring content and waves after the hang is
> happening.
>
> UMR at - https://cgit.freedesktop.org/amd/umr/
> Waves dump
> sudo umr -O verbose,halt_waves -wa
> GFX ring dump
> sudo umr -O verbose,follow -R gfx[.]
>
> Andrey
>

Thanks for the response.

What options should I specify on the kernel command line?

On my setup `umr` prints the message `Could not open ring debugfs
file` and then crashes with a segmentation fault, even though I am
sure that debugfs is enabled.

$ sudo umr -O verbose,halt_waves -wa
Cannot seek to MMIO address: Bad file descriptor
[ERROR]: Could not open ring debugfs fileSegmentation fault


# ls /sys/kernel/debug/dri/0/
 amdgpu_dm_dtn_log        amdgpu_ring_comp_1.1.0     amdgpu_vram_mm
 amdgpu_evict_gtt         amdgpu_ring_comp_1.1.1     amdgpu_wave
 amdgpu_evict_vram        amdgpu_ring_comp_1.2.0     clients
 amdgpu_fence_info        amdgpu_ring_comp_1.2.1     crtc-0
 amdgpu_firmware_info     amdgpu_ring_comp_1.3.0     crtc-1
 amdgpu_gca_config        amdgpu_ring_comp_1.3.1     crtc-2
 amdgpu_gds_mm            amdgpu_ring_gfx            crtc-3
 amdgpu_gem_info          amdgpu_ring_kiq_2.1.0      crtc-4
 amdgpu_gpr               amdgpu_ring_sdma0          crtc-5
 amdgpu_gpu_recover       amdgpu_ring_sdma1          DP-1
 amdgpu_gtt_mm           'amdgpu_ring_uvd<0>'        DP-2
 amdgpu_gws_mm           'amdgpu_ring_uvd_enc0<0>'   DP-3
 amdgpu_iomem            'amdgpu_ring_uvd_enc1<0>'   framebuffer
 amdgpu_oa_mm             amdgpu_ring_vce0           gem_names
 amdgpu_pm_info           amdgpu_ring_vce1           HDMI-A-1
 amdgpu_regs              amdgpu_ring_vce2           HDMI-A-2
 amdgpu_regs_didt         amdgpu_sa_info             HDMI-A-3
 amdgpu_regs_pcie         amdgpu_sensors             internal_clients
 amdgpu_regs_smc          amdgpu_test_ib             name
 amdgpu_ring_comp_1.0.0   amdgpu_vbios               state
 amdgpu_ring_comp_1.0.1   amdgpu_vram                ttm_page_pool
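
To rule out a permissions or mount problem, the ring and wave files could
be checked directly. This is only a minimal sketch: the function name
`check_rings` is hypothetical, the default path assumes the card is DRI
instance 0 as in the listing above, and the two file names are taken from
that listing. It does not tell us which path `umr` itself tries to open.

```sh
#!/bin/sh
# Hypothetical helper: report whether the amdgpu debugfs files that the
# ring and wave dumps presumably rely on exist and are readable.
# The default directory is an assumption (DRI instance 0).
check_rings() {
    dri_dir="${1:-/sys/kernel/debug/dri/0}"

    # If the directory is absent, debugfs is probably not mounted
    # or the instance number is different.
    if [ ! -d "$dri_dir" ]; then
        echo "missing: $dri_dir (is debugfs mounted?)"
        return 1
    fi

    status=0
    for ring in amdgpu_ring_gfx amdgpu_wave; do
        if [ -r "$dri_dir/$ring" ]; then
            echo "ok: $ring"
        else
            echo "not readable: $ring"
            status=1
        fi
    done
    return "$status"
}
```

Run as root after sourcing the file, for example
`check_rings /sys/kernel/debug/dri/0`. If both files are reported as ok,
the failure is more likely inside `umr` than in the debugfs setup.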




--
Best Regards,
Mike Gavrilov.

