Deadlocks with multiple applications on AMD RX460

Luís Mendes luis.p.mendes at gmail.com
Tue Dec 5 11:55:42 UTC 2017


Hi,

I don't know where to start, as there are multiple issues, but perhaps I
can begin with the issue in the amdgpu driver in linux-4.15-rc2.

There is a patch that the AMD RX460 needs in order to initialize
properly. It is present in both development branches,
~agd5f/amd-staging-drm-next (where it landed before commit
85d09ce5e5039644487e9508d6359f9f4cf64427) and ~agd5f/drm-next-4.16-wip
(at commit 968b8381ac1857ae26aa16d988b4d07018e56098), but it is in
neither vanilla linux-4.15-rc1 nor linux-4.15-rc2. On those kernels I get
"amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)" and cannot
make use of the card. I have not yet been able to identify the relevant
patch, although I have tried cherry-picking quite a few patches onto
vanilla 4.15-rc2.
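
One way to narrow the search is to list only the commits that touch the
AMD DRM code between the vanilla tag and the staging commit quoted above.
The following is a minimal sketch, not a tested procedure: the helper name
candidate_commits_cmd is hypothetical, and the tag name v4.15-rc2 and the
path drivers/gpu/drm/amd are assumptions about the local kernel tree. It
only builds the git command and does not run it.

```python
import shlex

# Assumptions: the local clone has the tag v4.15-rc2 and has fetched the
# amd-staging-drm-next commit quoted in the message.
BASE = "v4.15-rc2"
STAGING = "85d09ce5e5039644487e9508d6359f9f4cf64427"
AMD_PATH = "drivers/gpu/drm/amd"  # assumed location of the amdgpu sources


def candidate_commits_cmd(base=BASE, tip=STAGING, path=AMD_PATH):
    """Build a git command listing non-merge commits in base..tip under path."""
    return ["git", "log", "--oneline", "--no-merges", "--reverse",
            f"{base}..{tip}", "--", path]


if __name__ == "__main__":
    # Print the command for copy/paste; pass the list to subprocess.run to execute it.
    print(shlex.join(candidate_commits_cmd()))
```

The resulting list is oldest first, so candidate patches can be
cherry-picked onto vanilla 4.15-rc2 in the order they were applied.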

Secondly, when using the git kernels referenced above
(~agd5f/amd-staging-drm-next and ~agd5f/drm-next-4.16-wip) I get random
deadlocks in applications, namely glmark2, kodi and firefox, which make X
freeze while playing a video or rendering images. Sometimes the mouse
cursor still moves, but the UI is completely unresponsive.
What seems to be common is that the deadlock happens in an ioctl call
into the kernel, namely amdgpu_ioctl_wait_cs, with an unlimited timeout of
2^64-1. Sometimes the deadlock takes close to 20 minutes to appear, other
times it appears almost immediately after starting the application, but it
always happens within the 20 minute time frame.
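
For reference, the timeout value can be checked with a little arithmetic.
This sketch assumes the value is in nanoseconds (as the timeout_ns argument
name in the traces suggests) and uses a 365.25-day year. The constant name
TIMEOUT_INFINITE is illustrative, chosen to mirror libdrm's
AMDGPU_TIMEOUT_INFINITE.

```python
TIMEOUT_INFINITE = 2**64 - 1  # illustrative name; all 64 bits set

timeout_ns = 18446744073709551615  # value seen in the stack traces
assert timeout_ns == TIMEOUT_INFINITE == 0xFFFFFFFFFFFFFFFF

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year
years = timeout_ns / 1e9 / SECONDS_PER_YEAR
print(f"{years:.1f} years")
```

A wait with this timeout effectively never expires on its own, so the
caller only returns if the fence signals.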

With kernel ~agd5f/drm-next-4.16-wip (at commit
968b8381ac1857ae26aa16d988b4d07018e56098) there is an immediate and
systematic deadlock which always happens in glmark2 at the start of the
terrain test.

Stack traces follow below.

Regards,
Luís Mendes
Hardware and Software engineer

The stack trace for glmark2 after the deadlock is:
#0  0xb6b9b246 in ioctl () at ../sysdeps/unix/syscall-template.S:84
#1  0xb6943dc6 in drmIoctl (fd=6, request=3223348297, arg=arg@entry=0xbea08e88)
    at ../xf86drm.c:191
#2  0xb5f4f128 in amdgpu_ioctl_wait_cs (context=<optimized out>,
    context=<optimized out>, busy=<synthetic pointer>, flags=<optimized out>,
    timeout_ns=18446744073709551615, handle=5, ring=<optimized out>,
    ip_instance=<optimized out>, ip=<optimized out>)
    at ../../amdgpu/amdgpu_cs.c:408
#3  amdgpu_cs_query_fence_status (fence=<optimized out>,
    timeout_ns=<optimized out>, flags=1, expired=0xbea08ef0)
    at ../../amdgpu/amdgpu_cs.c:437
...
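
The request value in frame #1 can be decoded to confirm which ioctl is
blocking. This is a minimal sketch, assuming the platform uses the generic
Linux ioctl number layout from asm-generic/ioctl.h; the helper name
decode_ioctl is hypothetical, and the two constants are copied from the DRM
UAPI headers (drm.h and amdgpu_drm.h).

```python
# Generic Linux ioctl layout (asm-generic/ioctl.h):
# bits 0-7 nr, bits 8-15 type, bits 16-29 size, bits 30-31 direction.
def decode_ioctl(request):
    return {
        "nr": request & 0xFF,
        "type": chr((request >> 8) & 0xFF),
        "size": (request >> 16) & 0x3FFF,
        "dir": (request >> 30) & 0x3,  # 1 = write, 2 = read, 3 = read|write
    }


DRM_COMMAND_BASE = 0x40    # from drm.h
DRM_AMDGPU_WAIT_CS = 0x09  # from amdgpu_drm.h

fields = decode_ioctl(3223348297)  # request value from frame #1
print(hex(3223348297), fields)
print(fields["nr"] - DRM_COMMAND_BASE == DRM_AMDGPU_WAIT_CS)
```

Under these assumptions the request is 0xc0206449: a read/write ioctl of
type 'd' (DRM) with a 32-byte argument and driver command 0x09, which
matches the amdgpu_ioctl_wait_cs frame above it.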


The stack trace for kodi is:
#0  0xb4891246 in ioctl () at ../sysdeps/unix/syscall-template.S:84
#1  0xb60ecdc6 in drmIoctl (fd=22, request=3223348297,
    arg=arg@entry=0xbeb04230) at ../xf86drm.c:191
#2  0xababe128 in amdgpu_ioctl_wait_cs (context=<optimized out>,
    context=<optimized out>, busy=<synthetic pointer>, flags=<optimized out>,
    timeout_ns=18446744073709551615, handle=135520, ring=<optimized out>,
    ip_instance=<optimized out>, ip=<optimized out>)
    at ../../amdgpu/amdgpu_cs.c:408
#3  amdgpu_cs_query_fence_status (fence=fence@entry=0x3681f28,
    timeout_ns=<optimized out>, flags=1, expired=expired@entry=0xbeb04298)
    at ../../amdgpu/amdgpu_cs.c:437
#4  0xac05dffe in amdgpu_fence_wait (fence=0x3681f18, timeout=<optimized out>,
    absolute=false)
    at ../../../../../../src/gallium/winsys/amdgpu/drm/amdgpu_cs.c:187
#5  0xac05e072 in amdgpu_fence_wait_rel_timeout (rws=<optimized out>,
    fence=<optimized out>, timeout=<optimized out>)
    at ../../../../../../src/gallium/winsys/amdgpu/drm/amdgpu_cs.c:209
#6  0xabfaf066 in si_fence_finish (screen=<optimized out>, ctx=0x0,
    fence=0x310d690, timeout=18446744073709551615)
    at ../../../../../src/gallium/drivers/radeonsi/si_fence.c:288
#7  0xabd86cc6 in dri_flush (cPriv=<optimized out>, dPriv=<optimized out>,
    flags=<optimized out>, reason=<optimized out>)
    at ../../../../../src/gallium/state_trackers/dri/dri_drawable.c:563
#8  0xb6af4416 in dri2Flush (psc=psc@entry=0x12c0f98, ctx=<optimized out>,
    draw=draw@entry=0x14682d8, flags=flags@entry=3, throttle_reason=
    __DRI2_THROTTLE_SWAPBUFFER) at ../../../src/glx/dri2_glx.c:559
#9  0xb6af46ce in dri2SwapBuffers (pdraw=0x14682d8,
    target_msc=<optimized out>, divisor=0, remainder=0, flush=1)
    at ../../../src/glx/dri2_glx.c:851
#10 0xb6ad68e8 in glXSwapBuffers (dpy=<optimized out>,
    drawable=<optimized out>) at ../../../src/glx/glxcmds.c:839
#11 0x00dc1730 in CGLContextGLX::SwapBuffers (this=0x13df640)
    at GLContextGLX.cpp:267
#12 0x00dc06ca in CWinSystemX11GLContext::PresentRenderImpl (this=0x126c480,
    rendered=<optimized out>) at WinSystemX11GLContext.cpp:50
#13 0x00f80404 in CRenderSystemGL::PresentRender (this=0x126c558,
    rendered=<optimized out>, videoLayer=<optimized out>)
    at RenderSystemGL.cpp:299
#14 0x008ac4a6 in CGraphicContext::Flip (this=this@entry=0x12658c8,
    rendered=rendered@entry=true, videoLayer=<optimized out>)
    at GraphicContext.cpp:981
#15 0x00cc7140 in CApplication::Render (this=0x1266dd0) at Application.cpp:1958
#16 0x00d3bae4 in CXBApplicationEx::Run (this=0x1266dd0, playlist=...)
    at XBApplicationEx.cpp:142
#17 0x00b5662c in XBMC_Run (renderGUI=<optimized out>, playlist=...)
    at xbmc.cpp:89
#18 0x007d2f20 in main (argc=1, argv=0xbeb04904) at main.cpp:79

