[Mesa-dev] [Bug 108900] [KBL-G][Vulkan] Non-recoverable GPU hangs with GfxBench v5 Aztec Ruins Vulkan test

bugzilla-daemon at freedesktop.org
Thu Mar 7 15:31:17 UTC 2019


https://bugs.freedesktop.org/show_bug.cgi?id=108900

--- Comment #9 from Eero Tamminen <eero.t.tamminen at intel.com> ---
Created attachment 143572
  --> https://bugs.freedesktop.org/attachment.cgi?id=143572&action=edit
Hang trace

(In reply to Samuel Pitoiset from comment #8)
> Again, without the demo it is hard to fix.

While GfxBench v5 / Aztec Ruins still seems to be proprietary on desktop Linux
(available for free only on Windows & Android), the (recoverable) Manhattan
hangs in bug 108898 can be reproduced with the public GfxBench v4 version.


> Can you try 'export RADV_DEBUG=nodcc,nohiz,zerovram,nofastclears' ?
> 
> If it still hangs

Yes, it still hangs, just less verbosely.

dmesg:
[  546.116535] amdgpu 0000:01:00.0: GPU fault detected: 146 0x0fa0880c for process testfw_app pid 1859 thread testfw_app pid 1860
[  546.116538] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x001001F4
[  546.116539] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0808800C
[  546.116541] amdgpu 0000:01:00.0: VM fault (0x0c, vmid 4, pasid 32772) at page 1049076, read from 'TC4' (0x54433400) (136)
[  556.201073] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=11253, emitted seq=11254
[  556.201101] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process testfw_app pid 1859 thread testfw_app pid 1860
[  556.201104] amdgpu 0000:01:00.0: GPU reset begin!
[  556.616910] cp is busy, skip halt cp
[  556.805398] rlc is busy, skip halt rlc
[  556.806410] amdgpu 0000:01:00.0: GPU pci config reset
[  556.818925] amdgpu 0000:01:00.0: GPU reset succeeded, trying to resume
[  556.818962] [drm] PCIE GART of 256M enabled (table at 0x000000F4007E9000).
[  556.818991] [drm:amdgpu_device_gpu_recover [amdgpu]] *ERROR* VRAM is lost!
[  556.896623] [drm] UVD and UVD ENC initialized successfully.
[  556.997551] [drm] VCE initialized successfully.
[  557.007168] [drm] recover vram bo from shadow start
[  557.012867] [drm] recover vram bo from shadow done
[  557.012869] [drm] Skip scheduling IBs!
[  557.012956] amdgpu 0000:01:00.0: GPU reset(2) succeeded!
[  557.013063] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
...
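
For anyone reading the VM fault lines above, the summary line appears to be a
decoded view of the two raw registers printed just before it. The sketch below
reproduces that decoding from the values in this log. The bit positions are
assumptions inferred from these numbers (they match the printed vmid,
protection bits, client id and read/write direction), not something taken
from the driver source, and the function names are made up for illustration.

```python
# Values copied from the dmesg excerpt above.
FAULT_ADDR = 0x001001F4    # VM_CONTEXT1_PROTECTION_FAULT_ADDR
FAULT_STATUS = 0x0808800C  # VM_CONTEXT1_PROTECTION_FAULT_STATUS
CLIENT_TAG = 0x54433400    # value printed next to 'TC4'


def decode_status(status):
    """Split the fault status word into fields.

    All bit positions here are ASSUMPTIONS inferred from this one log.
    """
    return {
        "protections": status & 0xFF,          # assumed bits 7:0
        "client_id": (status >> 12) & 0xFF,    # assumed bits 19:12
        "is_write": bool((status >> 24) & 1),  # assumed bit 24
        "vmid": (status >> 25) & 0xF,          # assumed bits 28:25
    }


def decode_client_tag(tag):
    """Interpret a 32-bit value as big-endian ASCII, dropping trailing NULs."""
    return tag.to_bytes(4, "big").rstrip(b"\x00").decode("ascii")


if __name__ == "__main__":
    print("page:", FAULT_ADDR)
    print("status fields:", decode_status(FAULT_STATUS))
    print("client:", decode_client_tag(CLIENT_TAG))
```

With these assumptions the fault address register holds the page number as
printed (0x001001F4 is 1049076 in decimal), and the status word yields
protections 0x0c, vmid 4, client id 136 and a read access.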

Application:
--------------
Warm up Generate SH shader...

Workgroup size: 8
compile deferred_irradiance_volumes/m_envprobe_generate_sh_compute.shader...
done

amdgpu: radv_amdgpu_cs_query_fence_status failed.
glVkError: 2 line: 4329 func: Finish
amdgpu: radv_amdgpu_cs_query_fence_status failed.
glVkError: 2 line: 4219 func: BeginCommandBuffer
amdgpu: The CS has been rejected, see dmesg for more information.
vk: error: failed to submit CS 0
--------------


> generating a hang report might help

> export RADV_TRACE_FILE=$HOME/hang.trace
> export RADV_DEBUG=allbos,vmfaults,zerovram,syncshaders

Hang trace attached.
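
For repeated runs, the two exports quoted above could be wrapped in a small
launcher so the same settings are applied each time. This is only a sketch:
the helper name is hypothetical, and the command line for testfw_app is not
given in this report, so the actual launch is left commented out.

```python
import os
import subprocess  # only needed if the commented-out launch below is enabled


def radv_hang_env(trace_file, debug_flags, base_env=None):
    """Return a copy of the environment with the RADV hang-report variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["RADV_TRACE_FILE"] = trace_file
    env["RADV_DEBUG"] = ",".join(debug_flags)
    return env


if __name__ == "__main__":
    env = radv_hang_env(
        os.path.expanduser("~/hang.trace"),
        ["allbos", "vmfaults", "zerovram", "syncshaders"],
    )
    print(env["RADV_TRACE_FILE"], env["RADV_DEBUG"])
    # Hypothetical launch; the real arguments are not known from this report:
    # subprocess.run(["./testfw_app"], env=env)
```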
