[Mesa-dev] [Bug 108900] GPU hangs with GfxBench v5 Aztec Ruins Vulkan test

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Thu Nov 29 13:11:35 UTC 2018


https://bugs.freedesktop.org/show_bug.cgi?id=108900

            Bug ID: 108900
           Summary: GPU hangs with GfxBench v5 Aztec Ruins Vulkan test
           Product: Mesa
           Version: git
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Drivers/Vulkan/radeon
          Assignee: mesa-dev at lists.freedesktop.org
          Reporter: eero.t.tamminen at intel.com
        QA Contact: mesa-dev at lists.freedesktop.org

Setup:
- FullHD monitor (through HDMI KVM)
- HadesCanyon KBL i7-8809G ([AMD/ATI] Vega [Radeon RX Vega M] (rev c0))
- Ubuntu 18.04
- drm-tip git kernel v4.20-rc4 (i.e. kernel.org v4.20-rc4 kernel + latest drm
code from yesterday)
- Mesa git (c120dbfe4d)
- X server git version
- Proprietary GfxBench v5-GOLD2:  http://gfxbench.com

Test-case:
* bin/testfw_app --gfx vulkan --gl_api vulkan --width 1920 --height 1080
--fullscreen 1 --test_id vulkan_5_normal

Expected outcome:
* Works without GPU hangs, like the Aztec Ruins GL version and Sascha Willems'
Vulkan tests do

Actual outcome:
* Right after the test starts, the following appears in dmesg:
-----
[ 3057.480868] amdgpu 0000:01:00.0: GPU fault detected: 146 0x0fa0880c for
process testfw_app pid 2995 thread testfw_app pid 2997
[ 3057.480870] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x001001F4
[ 3057.480871] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x0C08800C
[ 3057.480873] amdgpu 0000:01:00.0: VM fault (0x0c, vmid 6, pasid 32772) at
page 1049076, read from 'TC4' (0x54433400) (136)
[ 3057.480879] amdgpu 0000:01:00.0: GPU fault detected: 146 0x0fa0840c for
process testfw_app pid 2995 thread testfw_app pid 2997
[ 3057.480880] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x001001FD
[ 3057.480881] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x0C08400C
[ 3057.480883] amdgpu 0000:01:00.0: VM fault (0x0c, vmid 6, pasid 32772) at
page 1049085, read from 'TC5' (0x54433500) (132)
[ 3057.480944] amdgpu 0000:01:00.0: GPU fault detected: 146 0x0fa9080c for
process testfw_app pid 2995 thread testfw_app pid 2997
[ 3057.480945] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x00000000
[ 3057.480946] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x0C18802C
[ 3057.480947] amdgpu 0000:01:00.0: VM fault (0x2c, vmid 6, pasid 32772) at
page 0, read from 'TC0' (0x54433000) (392)
[ 3067.564630] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout,
signaled seq=53811, emitted seq=53814
[ 3067.564633] [drm] GPU recovery disabled.
-----
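
The numbers in the fault lines above can be cross-checked with a short Python
sketch. The bit positions used for the status register (protections in bits
0-7, memory client id in bits 12-20, read/write flag in bit 24, vmid in bits
25-28) are an assumption: they reproduce the values printed in this log, but
should be verified against the kernel's register headers before relying on
them. The function names are hypothetical helpers, not part of any driver or
tool.

```python
def decode_fault_status(status):
    """Split a VM_CONTEXT1_PROTECTION_FAULT_STATUS value into fields.

    The bit layout is an assumption that matches the log in this report.
    """
    return {
        "protections": status & 0xFF,         # assumed bits 0-7
        "mc_id": (status >> 12) & 0x1FF,      # assumed bits 12-20
        "write": bool((status >> 24) & 0x1),  # assumed bit 24 (0 = read)
        "vmid": (status >> 25) & 0xF,         # assumed bits 25-28
    }


def decode_client_tag(tag):
    """Read a 32-bit client tag as big-endian ASCII, dropping NUL padding."""
    return tag.to_bytes(4, "big").rstrip(b"\x00").decode("ascii")


# (fault address, fault status, client tag) copied from the dmesg lines above
faults = [
    (0x001001F4, 0x0C08800C, 0x54433400),
    (0x001001FD, 0x0C08400C, 0x54433500),
    (0x00000000, 0x0C18802C, 0x54433000),
]

for addr, status, tag in faults:
    fields = decode_fault_status(status)
    access = "write" if fields["write"] else "read"
    print(
        f"page {addr}, {access} from '{decode_client_tag(tag)}', "
        f"protections 0x{fields['protections']:02x}, "
        f"vmid {fields['vmid']}, mc_id {fields['mc_id']}"
    )
```

Under these assumptions the page number in each "VM fault" line is the
FAULT_ADDR register value printed in decimal, and the quoted client name is
the ASCII reading of the hex tag that follows it.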

After this, no other GPU operations seem to work properly.  There are also
other things that don't work properly in automated testing at this point, but
I'm not sure whether they're related.

I don't know whether this is a regression, as I only checked it now.  There
are also some issues with this particular test on Intel (see e.g. bug 104634,
bug 105276), so the problem could be in common code.  I don't know whether
this is related to GL bug 108898 on the same device.
