GM107GLM: kernel oops during link training when ior becomes NULL

Pavel Roskin plroskin at gmail.com
Sun Nov 5 05:47:01 UTC 2017


Hello!

I'm using a Dell Precision P7510 with up-to-date Fedora 27. Everything
was working fine when I had two identical monitors (Dell 24")
connected to the dock with DisplayPort cables. Various issues started
when I replaced one of the monitors with a larger Dell 34" monitor. I
could work around them by using DVI to HDMI cables instead.

I tried to use DisplayPort cables again, and I saw that one of the
external monitors would occasionally show a blank screen. Changing
display settings can restore the correct picture, but that doesn't
work every time. Sometimes all monitors go black. I hit Escape to
revert the settings, and I get an oops.

It was happening with the stock Fedora kernel
(kernel-4.13.10-300.fc27.x86_64), so I compiled the current
airlied/drm-next, and I can still reproduce it.

Typically the oops is reported in nvkm_dp_acquire in
drivers/gpu/drm/nouveau/nvkm/engine/disp/dp.c. I found that the actual
oops happens in nvkm_dp_train_cr() on this line:

for (i = 0; i < lt->dp->outp.ior->dp.nr; i++) {

lt->dp->outp.ior becomes NULL for some reason. I started adding checks
for ior to be non-NULL, but the oopses started happening in other code
in the same source file. Each time, ior had become NULL.
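
For illustration only, a guard of the kind described above might look
like the sketch below. The struct definitions are simplified stand-ins
that model just the fields touched by the quoted loop; they are not the
real nouveau types, and train_cr_lanes() is a hypothetical name, not a
function in dp.c.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the driver structures.
 * Only the access path lt->dp->outp.ior->dp.nr is modelled. */
struct ior  { struct { int nr; } dp; };
struct outp { struct ior *ior; };
struct dp   { struct outp outp; };
struct lt   { struct dp *dp; };

/* Hypothetical helper: walks the lanes as the quoted loop does.
 * Returns the number of lanes visited, or -1 if ior is NULL. */
static int train_cr_lanes(const struct lt *lt)
{
	const struct ior *ior = lt->dp->outp.ior; /* read the pointer once */
	int i, visited = 0;

	if (ior == NULL)
		return -1; /* the unguarded loop would dereference NULL here */

	for (i = 0; i < ior->dp.nr; i++)
		visited++;

	return visited;
}
```

A check like this only avoids the crash at one call site. It does not
explain why ior is NULL in the first place, which matches what I saw:
the oops simply moved to the next unguarded dereference.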

The issue is easy to reproduce. Change the display configuration using
GNOME, click Apply, press Escape quickly.

I checked the history of dp.c, and it looks like this commit may be responsible:

commit 75eefe95ee7565c695d1e736005876d18742537f
drm/nouveau/disp/dp: store current link configuration in nvkm_ior

Apparently, ior is not a safe place to keep that information, as it
can become NULL.

Not sure if it's related, but there are two messages from nouveau that
appear in the kernel log once in a while.

This happens regardless of the cable types:

nouveau 0000:01:00.0: disp: 0x00006878[0]: INIT_GENERIC_CONDITON: unknown 0x07

This only happens with DisplayPort cables. It's printed many times
when one of the monitors goes black.

nouveau 0000:01:00.0: disp: outp 0a:0006:0f42: training failed

I would be happy to provide more information and test patches.

-- 
Regards,
Pavel Roskin

