[Bug 108625] AMDGPU - Can't even get Xorg to start - Kernel driver hangs with ring buffer timeout on ARM64

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Mon Nov 5 15:32:20 UTC 2018


https://bugs.freedesktop.org/show_bug.cgi?id=108625

--- Comment #12 from Alex Deucher <alexdeucher at gmail.com> ---
(In reply to Carsten Haitzler from comment #10)
> so wouldn't that make it a necessity then if its even glamor needing it? i
> guess i can turn off glamor accel but realistically gl is a necessity so the
> problem needs to be addressed sooner or later.
> 

If you are just starting a bare X server, you usually don't hit the glamor
paths as extensively as you would with a full desktop environment.

> the ring gfx timeout smells to me of "not a mesa bug" in that an ioctl going
> to the drm driver never returns when doing a simple query. it hangs, thus
> something lower down that is having a bad day, if something as simple as
> querying a fence causes a hang... :)
> 
> what is this ring gfx thing exactly (seems to be some command queue) and why
> would it be timing out? all the way back at seq 10/11 ... like right at the
> start of its use? it's almost like some interrupt or in memory semaphore
> thing mapped from the card is messing up? i'm looking for something to look
> into more specifically.

Each engine on the GPU (gfx, compute, video decode, encode, dma, etc.) has a
ring buffer used to feed it.  The work sent to the engines is managed by a
software scheduler in the kernel.  The kernel driver tests the rings as part
of the driver init sequence.  The driver won't come up if the ring tests fail,
so the rings are working at least until you start X.  Presumably X submits
(via glamor) some work to the GPU which causes the GPU to hang.  The fence
never signals because the GPU never finishes processing the job due to the
hang.
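
To make the ring/fence relationship concrete, here is a toy model of it.  This
is only a sketch for illustration and not the amdgpu code: the names ToyRing,
submit, process, wait_fence and ring_test are all made up, the "engine" is
just a Python loop, and a job that returns False stands in for a command that
hangs the GPU.

```python
from collections import deque


class ToyRing:
    """Toy model of one engine's ring; hypothetical, not the amdgpu driver."""

    def __init__(self, size=8):
        self.size = size
        self.jobs = deque()      # submitted but unfinished work: (seq, job)
        self.emitted_seq = 0     # last fence sequence number handed out
        self.signaled_seq = 0    # last fence sequence number the engine completed
        self.hung = False

    def submit(self, job):
        """Queue a job and return the fence sequence number that tracks it."""
        if len(self.jobs) >= self.size:
            raise RuntimeError("ring full")
        self.emitted_seq += 1
        self.jobs.append((self.emitted_seq, job))
        return self.emitted_seq

    def process(self):
        """Engine side: run queued jobs in order until one of them hangs."""
        while self.jobs and not self.hung:
            seq, job = self.jobs[0]
            if not job():        # a job returning False simulates a hang
                self.hung = True
                return
            self.jobs.popleft()
            self.signaled_seq = seq

    def wait_fence(self, seq, max_polls=3):
        """Driver side: poll the fence; False means it timed out unsignaled."""
        for _ in range(max_polls):
            self.process()
            if self.signaled_seq >= seq:
                return True
        return False


def ring_test(ring):
    """Init-time check: submit a trivial job and wait for its fence."""
    return ring.wait_fence(ring.submit(lambda: True))


ring = ToyRing()
init_ok = ring_test(ring)              # trivial job completes, fence signals

bad = ring.submit(lambda: False)       # stands in for the job that hangs the GPU
later = ring.submit(lambda: True)      # harmless job queued behind it

bad_signaled = ring.wait_fence(bad)    # times out: the engine is stuck
later_signaled = ring.wait_fence(later)  # also times out, stuck behind the hang
```

In this model the init-time ring test passes, and the timeout only shows up
once a later job wedges the engine.  Everything queued behind that job times
out as well, even though those jobs are harmless.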

Another, simpler test would be to boot up to a console (no X) and then try
running some of the libdrm amdgpu tests.  They are really simple (copying data
around and verifying it using different engines, allocating and freeing
memory, etc.).
https://cgit.freedesktop.org/mesa/drm/tree/tests/amdgpu
See if some of the simple copy or write tests work.
