[Bug 204181] NULL pointer dereference regression in amdgpu

bugzilla-daemon at bugzilla.kernel.org
Sat Aug 17 05:13:28 UTC 2019


https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #32 from Sergey Kondakov (virtuousfox at gmail.com) ---
I just hit exactly the same 0010:amdgpu_vm_update_directories+0xe7/0x260 NULL
pointer dereference immediately on login, even with PageFlip and TearFree
disabled and ShadowPrimary NOT enabled, and with all the same addresses as
before. So now I'm not sure what actually triggers it. However, my setup is
about as non-default as it gets:
amdgpu has these parameters: cik_support=1 si_support=1 msi=1 sched_policy=1
compute_multipipe=1 gartsize=1024 vm_fragment_size=9
max_num_of_queues_per_device=65536 sched_hw_submission=32 sched_jobs=1024
job_hang_limit=8000 halt_if_hws_hang=1 vm_fault_stop=0 vm_update_mode=3
vm_size=20 disp_priority=2 deep_color=1 gpu_recovery=1
irqbalance is enabled with interval=1 and rtirq has this:
RTIRQ_NAME_LIST="timer rtc snd drm amdgpu radeon i915 nvidia usb i8042 ahci"
RTIRQ_HIGH_LIST="watchdogd oom_reaper rcu_preempt rcu_sched rcu_bh rcub rcuc
gfx sdma ksoftirqd khugepaged"
RTIRQ_PRIO_HIGH=80
RTIRQ_PRIO_DECR=2
RTIRQ_PRIO_LOW=50
RTIRQ_RESET_ALL=0
to boost amdgpu's processes to the highest RT/FIFO priorities in the hope of
avoiding video stuttering and audio x-runs under full load. Transparent
hugepages are enabled in an attempt to spare the crappy AMD FX's TLB and MMU
(hence vm_fragment_size=9).
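
For anyone wondering where the 9 comes from, here is a minimal sketch of the
arithmetic. It assumes vm_fragment_size is a shift applied to a 4 KiB base page
(the amdgpu parameter description gives 4 = 64K and 9 = 2M) and that a
transparent hugepage is 2 MiB, as on x86-64. The function name is made up for
illustration.

```python
# Sketch only: assumes the amdgpu VM fragment is 4 KiB << vm_fragment_size
# and that a transparent hugepage is 2 MiB (x86-64).
BASE_PAGE_BYTES = 4 * 1024
THP_BYTES = 2 * 1024 * 1024


def fragment_bytes(vm_fragment_size: int) -> int:
    """Return the fragment size in bytes for a given vm_fragment_size value."""
    return BASE_PAGE_BYTES << vm_fragment_size


if __name__ == "__main__":
    for bits in (4, 9):
        size = fragment_bytes(bits)
        note = "matches a 2 MiB hugepage" if size == THP_BYTES else ""
        print(f"vm_fragment_size={bits}: {size // 1024} KiB {note}".rstrip())
```

Under those assumptions, a value of 9 lines the GPU fragment size up with the
hugepage size, which is the point of setting it.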

Maybe it's the non-default vm_update_mode that does it. Also, a few kernel
versions back, the default GART size of 256MB was triggering some kind of
fault, probably a stall and reset. Maybe it still does, but I'm not going to
check. Or maybe it's all irrelevant.
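
Before blaming vm_update_mode, it may be worth confirming what the loaded
module actually ended up with. A rough sketch, assuming the usual
/sys/module/<name>/parameters layout and that vm_update_mode is a bitmask
(bit 0 = graphics, bit 1 = compute, so 3 = both, with a negative value meaning
the driver picks). The helper names are hypothetical, not part of any tool.

```python
from pathlib import Path

# Assumed meaning of the vm_update_mode bits (page table updates done by CPU).
VM_UPDATE_BITS = {0x1: "graphics", 0x2: "compute"}


def decode_vm_update_mode(value: int) -> list:
    """List the engines whose VM updates go through the CPU for this value."""
    if value < 0:
        # Assumption: a negative value leaves the choice to the driver.
        return []
    return [name for bit, name in VM_UPDATE_BITS.items() if value & bit]


def read_module_params(module: str, root: str = "/sys/module") -> dict:
    """Read every readable parameter of a loaded module from sysfs."""
    params_dir = Path(root) / module / "parameters"
    params = {}
    if not params_dir.is_dir():
        return params  # module not loaded, or it exposes no parameters
    for entry in sorted(params_dir.iterdir()):
        try:
            params[entry.name] = entry.read_text().strip()
        except OSError:
            continue  # some parameters are not readable by ordinary users
    return params


if __name__ == "__main__":
    params = read_module_params("amdgpu")
    for name, value in params.items():
        print(f"{name}={value}")
    if "vm_update_mode" in params:
        engines = decode_vm_update_mode(int(params["vm_update_mode"]))
        print("CPU VM updates for:", ", ".join(engines) or "none/default")
```

Comparing that output between a crashing and a non-crashing boot, changing one
parameter at a time, would narrow down whether any of these settings matter.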



More information about the dri-devel mailing list