[Bug 110674] Crashes / Resets From AMDGPU / Radeon VII

bugzilla-daemon at freedesktop.org
Mon Aug 26 03:47:41 UTC 2019


https://bugs.freedesktop.org/show_bug.cgi?id=110674

--- Comment #121 from ReddestDream <reddestdream at gmail.com> ---
Some observations:

1. Nothing at all seems to be up with cur_speed and cur_width. They get set
several times in a row in both runs, but the values are all the same in both.

2. I can't really see anything wrong with msg/parameter either. When I compare
the two runs against each other, nothing looks particularly wacky. We also have
an instance in my AMD+iGPU run where msg/parameter shows up after "[drm]
Initialized amdgpu", so the theory that all messages have to be sent before
initialization is complete must be wrong.
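
For anyone who wants to repeat the msg/parameter comparison without eyeballing
two dmesg dumps, here is a rough sketch of how it could be scripted. The line
format "msg=<value> parameter=<value>" is an assumption (the debug prints in
this thread are local patches, not upstream output), and parse_messages and
diff_runs are hypothetical helper names. Adjust the regex to whatever your
printk actually emits.

```python
import difflib
import re

# Assumed layout of the locally added debug print; not an upstream format.
MSG_RE = re.compile(
    r"msg\s*=\s*(0x[0-9a-fA-F]+|\d+).*?parameter\s*=\s*(0x[0-9a-fA-F]+|\d+)"
)
INIT_MARKER = "[drm] Initialized amdgpu"


def _to_int(text):
    """Parse a hex ("0x1a") or decimal ("26") literal."""
    return int(text, 16) if text.lower().startswith("0x") else int(text, 10)


def parse_messages(log_text):
    """Return (msg, parameter, after_init) tuples in log order.

    after_init is True for messages logged after the init marker line.
    """
    entries = []
    after_init = False
    for line in log_text.splitlines():
        if INIT_MARKER in line:
            after_init = True
            continue
        match = MSG_RE.search(line)
        if match:
            entries.append(
                (_to_int(match.group(1)), _to_int(match.group(2)), after_init)
            )
    return entries


def diff_runs(run_a, run_b):
    """Return the non-matching stretches between two parsed runs.

    Each item is (tag, pairs_from_a, pairs_from_b), where tag is one of
    difflib's "replace", "delete" or "insert".
    """
    seq_a = [(m, p) for m, p, _ in run_a]
    seq_b = [(m, p) for m, p, _ in run_b]
    matcher = difflib.SequenceMatcher(a=seq_a, b=seq_b, autojunk=False)
    return [
        (tag, seq_a[i1:i2], seq_b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

Feed parse_messages the dmesg text from each run, pass both results to
diff_runs, and filter on the after_init flag to list any messages that arrive
after "[drm] Initialized amdgpu".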

Now the real question is whether we can decode what these msg/parameter values
mean. But it looks more likely to me that vega20_hwmgr.c and vega20_ppt.c are
just bugged somewhere (probably in the same way, since they seem to be alternate
versions of each other) and that the rest of the amdgpu code is (relatively)
fine.

I'm thinking we'll have to go through and knock out/debug pretty much
everything in those files until we figure out where the breakage is. That's
about 3000-4000 lines of code in each of those two files, though, so any
thoughts anyone has about where we should start would be helpful. My focus will
probably be on UCLK (since it seems to break first), SCLK (since it gets set to
0 MHz when there are multiple displays), DCEFCLK, and basically anything else
that smells like it might control the memory clock and/or be affected by
multiple monitors.

Thanks!
