[Bug 207383] [Regression] 5.7 amdgpu/polaris11 gpf: amdgpu_atomic_commit_tail
bugzilla-daemon at bugzilla.kernel.org
Sun Jun 28 10:48:15 UTC 2020
https://bugzilla.kernel.org/show_bug.cgi?id=207383
Duncan (1i5t5.duncan at cox.net) changed:
              What|Removed                     |Added
----------------------------------------------------------------------------
    Kernel Version|5.7-rc1, 5.7-rc2, 5.7-rc3   |5.7-rc1 - 5.7 - 5.8-rc1+
--- Comment #31 from Duncan (1i5t5.duncan at cox.net) ---
(In reply to mnrzk from comment #30)
> In some conditions, when amdgpu_dm_atomic_commit_tail calls
> dm_atomic_get_new_state, dm_atomic_get_new_state returns a struct
> dm_atomic_state* with a garbage context pointer.
Good! Someone with the bug who can actually read and work the code, now.
Portends well for a fix. =:^)
> I've also found that this bug exclusively occurs when commit_work is on the
> workqueue. After forcing drm_atomic_helper_commit to run all of the commits
> without adding to the workqueue and running the OS, the issue seems to have
> disappeared.
I see it always with the workqueue too, but not being a dev I simply assumed
that was how it was; I had no idea it could be taken off the workqueue.
> The system was stable for at least 1.5 hours before I manually
> shut it down (meanwhile it has usually crashed within 30-45 minutes).
You're seeing a crash much faster than I am. I believe my longest uptime
before a crash with the telltale trace was something like two and a half days.
That makes marking any bisect step "good" a gamble, since I may simply not
have tested long enough.
> Perhaps there's some sort of race condition occurring after commit_work is
> queued?
Agreed, FWIW, tho you've taken it farther than I could, not being able to work
with code much beyond bisect or modifying an existing patch here or there.
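
For anyone following along who, like me, doesn't read the driver code, here is
a toy sketch in plain C of the kind of window being suspected. It is NOT the
amdgpu or DRM code: every toy_* name is made up for illustration, the "queue"
is a single slot run by hand rather than a real kernel workqueue, and the
stale pointer is modeled as NULL rather than garbage so the program stays
well-defined. It only shows why running the tail handler immediately can hide
a problem that deferring it exposes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real DRM/amdgpu types. */
struct toy_context { int stream_count; };
struct toy_state   { struct toy_context *context; };

struct toy_work {
    struct toy_state *state;
    int pending;   /* 1 while the work is queued but not yet run */
    int result;    /* filled in when the work finally runs */
};

/* Plays the role of a commit tail handler: it dereferences state->context.
 * Returns -1 if the context pointer is no longer usable. */
static int toy_commit_tail(struct toy_state *state)
{
    assert(state != NULL);
    if (state->context == NULL)
        return -1;
    return state->context->stream_count;
}

/* Deferred path: remember the state and run the tail handler later. */
static void toy_queue_commit(struct toy_work *work, struct toy_state *state)
{
    work->state = state;
    work->pending = 1;
    work->result = 0;
}

/* Plays the role of the worker picking up the queued item. */
static void toy_run_pending(struct toy_work *work)
{
    if (work->pending) {
        work->result = toy_commit_tail(work->state);
        work->pending = 0;
    }
}

/* Some other code path that invalidates the context (assumed, for the demo). */
static void toy_release_context(struct toy_state *state)
{
    state->context = NULL;
}

/* deferred == 0: tail runs before the release, so it sees a valid context.
 * deferred != 0: the release lands between queueing and running. */
static int toy_demo(int deferred)
{
    struct toy_context ctx = { .stream_count = 2 };
    struct toy_state state = { .context = &ctx };
    struct toy_work work = { 0 };
    int result;

    if (!deferred) {
        result = toy_commit_tail(&state);
        toy_release_context(&state);
        return result;
    }

    toy_queue_commit(&work, &state);
    toy_release_context(&state);   /* lands inside the window */
    toy_run_pending(&work);
    return work.result;
}
```

Calling toy_demo(0) returns the stream count (2), while toy_demo(1) returns -1
because the context was released before the queued work ran. Whether anything
like that release actually happens in the real driver is exactly the open
question.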
--
You are receiving this mail because:
You are watching the assignee of the bug.