After Vega 56/64 GPU hang I unable reboot system

StDenis, Tom Tom.StDenis at amd.com
Thu Dec 20 14:06:47 UTC 2018


On 2018-12-20 6:45 a.m., Mikhail Gavrilov wrote:
> On Thu, 20 Dec 2018 at 16:17, StDenis, Tom <Tom.StDenis at amd.com> wrote:
>>
>> Well yup the kernel is not letting you open the files:
>>
>>
>> As sudo/root you should be able to open these files with umr.  What
>> happens if you just open a shell as root and run it?
>>
> 
> [root at localhost ~]# touch /sys/kernel/debug/dri/0/amdgpu_ring_gfx
> [root at localhost ~]# cat /sys/kernel/debug/dri/0/amdgpu_ring_gfx
> cat: /sys/kernel/debug/dri/0/amdgpu_ring_gfx: Operation not permitted
> [root at localhost ~]# ls -laZ /sys/kernel/debug/dri/0/amdgpu_ring_gfx
> -r--r--r--. 1 root root system_u:object_r:debugfs_t:s0 8204 Dec 20
> 16:31 /sys/kernel/debug/dri/0/amdgpu_ring_gfx
> [root at localhost ~]# getenforce
> Permissive
> [root at localhost ~]# /home/mikhail/packaging-work/umr/build/src/app/umr
> -O verbose,halt_waves -wa
> Cannot seek to MMIO address: Bad file descriptor
> [ERROR]: Could not open ring debugfs fileSegmentation fault (core dumped)
> 
> I have already tried launching `umr` as the root user, but the kernel
> still doesn't let me open `amdgpu_ring_gfx`.
> 
> What other kernel options should I check?
> 
> I have also attached my current kernel config to this message.

I can replicate this by doing:

chmod u+s umr
sudo ./umr -R gfx[.]

You need to remove the u+s bit. With it set, you are literally not running umr as root!

:-)
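
A minimal sketch of how one might check for this condition before removing the bit. It assumes the umr binary is owned by a regular user, in which case the setuid bit would make the process take that owner's effective UID even when started via sudo. The helper name `describe_setuid` is hypothetical and not part of umr.

```shell
#!/bin/sh
# Hypothetical helper (not part of umr): report whether a file carries
# the setuid bit and, if so, which user owns it.
describe_setuid() {
    f=$1
    # `test -u` is true when the file exists and has its set-user-ID bit set.
    if [ -u "$f" ]; then
        # The third column of `ls -ld` is the owning user.
        owner=$(ls -ld "$f" | awk '{print $3}')
        echo "setuid: runs as $owner"
    else
        echo "no setuid: runs as the invoking user"
    fi
}
```

Usage might look like this, with `./umr` standing in for wherever the binary was built:

    describe_setuid ./umr
    chmod u-s ./umr
    sudo ./umr -R gfx[.]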

Tom


> 
> --
> Best Regards,
> Mike Gavrilov.
> 


