[Bug 204181] NULL pointer dereference regression in amdgpu

bugzilla-daemon at bugzilla.kernel.org
Mon Aug 19 15:11:14 UTC 2019


https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #34 from Sergey Kondakov (virtuousfox at gmail.com) ---
(In reply to Nicholas Kazlauskas from comment #33)
> (In reply to Sergey Kondakov from comment #26)
> > Created attachment 284083 [details]
> > dmesg_2019-08-02-amdgpu_fail_on_patched_5.2.5
> > 
> > (In reply to Nicholas Kazlauskas from comment #24)
> > > This should be fixed with the series linked below:
> > > 
> > > https://patchwork.freedesktop.org/series/64505/
> > > 
> > > But it still needs review and backporting to older kernels.
> > 
> > Celebration might have been premature. Hours later I've got another freeze
> > with different error in amdgpu. Only this time, mouse cursor was movable
> > over frozen frame right until I tried switching VT. Here's trace:
> > BUG: unable to handle page fault for address: 0000000800000184
> > #PF: supervisor read access in kernel mode
> > #PF: error_code(0x0000) - not-present page
> > PGD 0 P4D 0 
> > Oops: 0000 [#1] PREEMPT SMP NOPTI
> > CPU: 2 PID: 21044 Comm: kworker/u16:0 Tainted: G        W IO      5.2.5-1396.g79b6a9c-HSF #1 openSUSE Tumbleweed (unreleased)
> > Hardware name: Gigabyte Technology Co., Ltd. GA-990XA-UD3/GA-990XA-UD3, BIOS F14e 09/09/2014
> > Workqueue: events_unbound commit_work
> > RIP: 0010:amdgpu_dm_atomic_commit_tail+0x2e6/0xd60 [amdgpu]
> 
> Are you able to consistently reproduce this issue? Is it the same setup and
> same conditions as before? I haven't been able to see it in my testing at
> least.

Yes, just having PageFlip enabled in the amdgpu X driver guarantees it.
Changing any option other than PageFlip doesn't seem to affect it. I think
forcing TearFree on with PageFlip disabled may also trigger it. You can try my
previously linked kernel build in your testing, but I doubt it contains
anything specific to this issue.

It may not be reproducible with the modesetting X driver, because it fails to
engage page flipping on init and throws a bunch of errors about it in
Xorg.0.log. For some reason I'm unable to use the modesetting X driver at all:
even with page flipping disabled, it draws only the mouse cursor on a black
background instead of the sddm login screen. So I have to use amdgpu with
PageFlip and TearFree explicitly disabled. But even then another, rarer
dereference at 0010:amdgpu_vm_update_directories+0xe7/0x260 may happen (which,
unlike the first one, I suspect is connected with the vm_update_mode option).
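
For reference, a minimal sketch of what such a setup might look like as an
xorg.conf.d snippet. It assumes that "PageFlip" above refers to the
EnablePageFlip option documented in the amdgpu(4) man page; the Identifier
string is a placeholder and is not taken from the actual system:

```
# Hypothetical snippet, e.g. a file under /etc/X11/xorg.conf.d/
# (file location and Identifier are assumptions for illustration)
Section "Device"
    Identifier "AMD Graphics"
    Driver     "amdgpu"
    # Disable DRI page flipping in the amdgpu X driver
    Option     "EnablePageFlip" "off"
    # Explicitly disable TearFree instead of leaving it on "auto"
    Option     "TearFree"       "off"
EndSection
```

Setting either option back to "on" would correspond to the configurations
described above as triggering the first fault.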

By the way, is there any disadvantage to forcing TearFree to be always on when
it works? An additional frame of latency, or something like that?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

