[Bug 105425] 3D & games produce periodic GPU crashes (Radeon R7 370)

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Tue Apr 10 09:13:14 UTC 2018


https://bugs.freedesktop.org/show_bug.cgi?id=105425

--- Comment #20 from iive at yahoo.com ---
(In reply to MirceaKitsune from comment #16)
> I have moved on to testing the various kernel parameters available for my
> driver and card. As was pointed out by malcolmlewis on the openSUSE forums,
> they can be listed with the following commands:
> 
> modinfo amdgpu
> systool -vm amdgpu
> 
> I tested nearly half of them today, almost none made any difference. There
> were however a few settings that appeared to influence the frequency of the
> freeze. The most notable one of all seems to be the following:
> 
> amdgpu.moverate=4
> 
> With no parameters changed, the freeze now occurs roughly once per 30
> minutes in Xonotic. With that move rate limited to 4MB/s however, I
> seemingly reduced it to only 90 minutes! The FPS will constantly drop and
> recover, but that makes sense as this setting explicitly limits the buffer
> migration rate.
> 
> I may test other variables in the days to come, but for now I'm hoping this
> offers at least some clue to get things started. My feeling is that the
> video card may be slowly loaded with information until something fills up,
> or perhaps some events throw too much data in at once and it reaches a
> bottleneck?
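
For anyone following along, here is a minimal sketch of how the current amdgpu parameter values can be read back from sysfs, to confirm that a setting such as amdgpu.moverate=4 actually took effect. The helper name show_module_params is made up for illustration, and I am assuming the parameters are exposed under /sys/module/amdgpu/parameters; some of them may only be readable by root.

```shell
# Hypothetical helper: print name=value for every parameter file in a
# module's sysfs parameter directory (the default assumes amdgpu is loaded).
show_module_params() {
    dir="${1:-/sys/module/amdgpu/parameters}"
    for f in "$dir"/*; do
        # Skip entries that are missing or not readable by this user.
        [ -r "$f" ] || continue
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# Example: check only the buffer migration rate (run as root if needed).
# show_module_params | grep '^moverate='
```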

You are making progress.

I just want to give you a few tips.

1. You are always using 3D acceleration. The glamor driver that Xorg uses for
2D (DDX) acceleration draws with EGL and shaders. If you run a compositing
manager (KDE has one), it might put more load on it.
You might try "AccelMethod" "None" in xorg.conf, just to check whether it makes
any difference. I hope that won't disable OpenGL entirely...

2. My video card is also a Gigabyte. I had it replaced once, because in the
first month my initial card (same model) had major issues, like not starting
up at boot after a few hours of gameplay.

3. When my chip failed, the affected pins were the ones controlling the
internal video RAM. If you have chip problems, they might affect other pins
first, like the PCIe ones. So a hardware problem is not ruled out.

4. The PCIe standard allows using fewer parallel lanes for data transfer. If
broken pins are suspected, moving the card to a 4x slot might alleviate the
issue. BTW, I see that the card is on PCI_ID #3.00.1, is it in the first slot?
Usually the first slot is 16x and has extra electric power.

5. If you suspect an issue with RAM filling up, you might try the environment
variable "GALLIUM_HUD"; it has some GTT displays.

6. Along the same lines, make sure that the kernel option for CMA is
disabled... it has caused me problems every time I enabled it. You might also
have IOMMU enabled; try disabling it, just for testing.
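
The checks from tips 1, 4, 5 and 6 can be sketched as a few shell helpers. This is only a sketch under assumptions: the function names are invented for illustration, the PCI address 0000:03:00.0 is my guess for the GPU function (adjust it to what lspci shows), the HUD counter names are the ones I believe radeonsi exposes (GALLIUM_HUD=help lists what your driver really has), and xonotic-glx stands in for whatever binary you launch.

```shell
# Tip 1: print an xorg.conf.d snippet that turns glamor 2D acceleration off.
# Assumes the amdgpu DDX; the Identifier string is arbitrary.
print_noaccel_conf() {
    cat <<'EOF'
Section "Device"
    Identifier "AMD GPU"
    Driver     "amdgpu"
    Option     "AccelMethod" "none"
EndSection
EOF
}

# Tip 4: report negotiated vs. maximum PCIe link width from sysfs.
# $1 is the device directory; the default address is an assumption.
pcie_link_width() {
    dev="${1:-/sys/bus/pci/devices/0000:03:00.0}"
    printf 'current=x%s max=x%s\n' \
        "$(cat "$dev/current_link_width")" "$(cat "$dev/max_link_width")"
}

# Tip 5: run a program with a HUD showing memory-related counters.
run_with_hud() {
    GALLIUM_HUD="fps,VRAM-usage,GTT-usage,num-bytes-moved" "$@"
}

# Tip 6: list IOMMU/CMA options on the kernel command line.
# Reads /proc/cmdline unless another file is given as $1.
show_iommu_cma_params() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" |
        grep -E '^(iommu|amd_iommu|intel_iommu|cma)='
}

# Example usage (as root where needed):
# print_noaccel_conf > /etc/X11/xorg.conf.d/20-noaccel.conf
# pcie_link_width
# run_with_hud xonotic-glx
# show_iommu_cma_params
```

A current link width below the maximum is not necessarily a fault, since the slot itself may be wired for fewer lanes. As far as I remember from the amdgpu man page, 3D should keep working with "AccelMethod" "none", but please verify that on your setup.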

Once again, keep digging and good luck.
