Regression: drm: Lobotomize set_busid nonsense for !pci drivers (a325725633c2)

Hans de Goede hdegoede at redhat.com
Fri Sep 30 16:38:01 UTC 2016


Hi,

On 30-09-16 17:33, Laszlo Ersek wrote:
> On 09/30/16 16:59, Hans de Goede wrote:
>> Hi,
>>
>> On 30-09-16 16:51, Laszlo Ersek wrote:
>>> On 09/30/16 12:35, Hans de Goede wrote:
>>>
>>>> Attached are 2 patches against the xserver which should fix this,
>>>> please give them a try.
>>>
>>> Sorry about the delay.
>>>
>>> The patches don't seem to fix the issue for me. Please see the Xorg log
>>> attached.
>>>
>>> I tested the patches as follows. Given that my bisection had been done
>>> in a Fedora 24 guest, using
>>>
>>>   xorg-x11-server-1.18.4-4.fc24
>>>   http://koji.fedoraproject.org/koji/buildinfo?buildID=794494
>>>
>>> I now rebuilt the guest kernel exactly at the failing commit (a325725
>>> "drm: Lobotomize set_busid nonsense for !pci drivers"), and first
>>> reproduced the issue with the above X server.
>>>
>>> Then, I ported your patches to "xorg-server-1.18.4" (using the upstream
>>> xserver tree), and rebuilt the Fedora package with the backport. For the
>>> backport, I had to cherry-pick the following two patches from master
>>> first:
>>>
>>> 1 ca8d88e50310 xfree86: recognize primary BUS_PCI device in
>>>                xf86IsPrimaryPlatform()
>>> 2 ea91db4b8331 config: fix GPUDevice fail when AutoAddGPU off + BusID
>>>
>>> This way your patches applied cleanly. (Cherry pick #1 above is actually
>>> necessary for semantics, while cherry pick #2 is needed for a clean
>>> context only, and has no impact for this test.)
>>>
>>> That is, in total, I added the following four patches to the Fedora 24
>>> package:
>>>
>>> 1 xfree86: recognize primary BUS_PCI device in xf86IsPrimaryPlatform()
>>> 2 config: fix GPUDevice fail when AutoAddGPU off + BusID
>>> 3 xfree86: Make adding unclaimed devices as GPU devices a separate step
>>> 4 xfree86: Try harder to find atleast 1 non GPU Screen
>>>
>>> You can find the scratch build that I used for testing here:
>>>
>>>   xorg-x11-server-1.18.4-4.hans_bz1366842_2.fc24
>>>   http://koji.fedoraproject.org/koji/taskinfo?taskID=15875087
>>>
>>> Another reason I used F24's X server as basis, rather than upstream
>>> HEAD, is that Fedora 24 is pretty young, and it's already on kernel
>>> 4.7.4, and I believe it will soon move to kernel 4.8, without
>>> (necessarily) rebasing its X server package to upstream. IOW the kernel
>>> upgrade to 4.8 will break X in Fedora 24 too, and then I expect the
>>> Fedora X maintainers would have to cherry pick those two patches as
>>> dependencies just the same.
>>>
>>> To summarize, the patches don't seem to help. I shall nonetheless thank
>>> you for spending your Friday on this!
>>
>> Hmm, do you have an xorg.conf file lying around somewhere? The message
>> about the xserver not being able to find an entry for screen 0 does
>> not make sense ...
>
> Good catch, I actually had two files under "/etc/X11/xorg.conf.d/":
>
> * "00-keyboard.conf", from package "systemd-229-13.fc24.x86_64", with
> contents
>
> ------------
> # Read and parsed by systemd-localed. It's probably wise not to edit this file
> # manually too freely.
> Section "InputClass"
>         Identifier "system-keyboard"
>         MatchIsKeyboard "on"
>         Option "XkbLayout" "us"
> EndSection
> ------------
>
> * "01-resolution.conf", which I had created, in order to set the
> preferred display resolution:
>
> ------------
> Section "Screen"
>   Identifier "Default Screen"
>   Device     "Default Device"
>   Monitor    "Default Monitor"
> EndSection
>
> Section "Device"
>   Identifier "Default Device"
>   Driver     "modesetting"
> EndSection
>
> Section "Monitor"
>   Identifier "Default Monitor"
>   Option     "PreferredMode"   "640x480"
> # Option     "PreferredMode"   "1440x900"
> EndSection
> ------------
>
> I removed these files now, and repeated the test. Again, the X server
> wouldn't start, but I think the log file looks a bit different now.
> Attached.

Ah, ok so it seems that my initial analysis was wrong: the problem
is not a recurrence of the device getting identified as a GPU screen.
Rather, libdrm more or less depends on bus-ids, and the lack of one is
causing the server to misbehave. I guess that even with an xorg.conf
things will fail with the troublesome kernel version (might be worth
trying).
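
For reference, such a test could be a single Device section that pins
the device by bus-id. This is only a sketch: the PCI address below is
made up (use whatever lspci reports in the guest, converted to decimal),
and the identifier is simply reused from Laszlo's 01-resolution.conf:

------------
Section "Device"
  Identifier "Default Device"
  Driver     "modesetting"
  # Hypothetical address; the format is PCI:bus:device:function, in decimal
  BusID      "PCI:0:2:0"
EndSection
------------

If the reasoning above holds, I would expect this to fail on a kernel
with a325725 as well, but that is a guess until someone actually tries it.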

Emil's analysis seems to be spot on. This does not seem easily
fixable in userspace and does seem like a real regression, as it
even breaks things when specifying the device through xorg.conf
(or so I believe), which is something that used to work ...

I made the mistake of thinking the kernel change was re-triggering
the old problem Laszlo fixed, but that does not seem to be the
case.

Regards,

Hans

