Regression: drm: Lobotomize set_busid nonsense for !pci drivers (a325725633c2)

Laszlo Ersek lersek at redhat.com
Fri Sep 30 10:03:59 UTC 2016


On 09/30/16 10:28, Hans de Goede wrote:
> Hi,
> 
> On 30-09-16 05:09, Laszlo Ersek wrote:
>> Hello Daniel,
>>
>> On 06/21/16 14:08, daniel.vetter at ffwll.ch (Daniel Vetter) wrote:
>>> We already have a fallback in place to fill out the unique from
>>> dev->unique, which is set to something reasonable in drm_dev_alloc.
>>>
>>> Which means we only need to have a special set_busid for pci devices,
>>> to be able to carry the backwards compat code for drm 1.1 around, which
>>> libdrm still needs.
>>>
>>> While developing and testing this patch things blew up in really
>>> interesting ways, and the code is rather confusing in naming things
>>> between the kernel code, ioctl #defines and libdrm. For the next brave
>>> dragon slayer, document all this madness properly in the userspace
>>> interface section of gpu.tmpl.
>>>
>>> v2: Make drm_dev_set_unique static and update kerneldoc.
>>>
>>> v3: Entire rewrite, plus document what's going on for posterity in the
>>> gpu docbook uapi section.
>>>
>>> v4: Drop accidental amdgpu hunk (Emil).
>>>
>>> v5: Drop accidental omapdrm vblank counter change (Emil).
>>>
>>> Cc: Gustavo Padovan <gustavo.padovan at collabora.co.uk>
>>> Cc: Emil Velikov <emil.l.velikov at gmail.com>
>>> Tested-by: Gustavo Padovan <gustavo.padovan at collabora.co.uk> (virt_gpu)
>>> Reviewed-by: Emil Velikov <emil.l.velikov at gmail.com>
>>> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
>>> ---
>>>  Documentation/DocBook/gpu.tmpl                  |  4 ++
>>>  drivers/gpu/drm/armada/armada_drv.c             |  1 -
>>>  drivers/gpu/drm/drm_ioctl.c                     | 58 +++++++++++++++++++++++++
>>>  drivers/gpu/drm/drm_platform.c                  | 18 --------
>>>  drivers/gpu/drm/etnaviv/etnaviv_drv.c           |  1 -
>>>  drivers/gpu/drm/exynos/exynos_drm_drv.c         |  1 -
>>>  drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c |  1 -
>>>  drivers/gpu/drm/imx/imx-drm-core.c              |  1 -
>>>  drivers/gpu/drm/msm/msm_drv.c                   |  1 -
>>>  drivers/gpu/drm/nouveau/nouveau_drm.c           |  1 -
>>>  drivers/gpu/drm/omapdrm/omap_drv.c              |  1 -
>>>  drivers/gpu/drm/shmobile/shmob_drm_drv.c        |  1 -
>>>  drivers/gpu/drm/tilcdc/tilcdc_drv.c             |  1 -
>>>  drivers/gpu/drm/virtio/virtgpu_drm_bus.c        | 10 -----
>>>  drivers/gpu/drm/virtio/virtgpu_drv.c            |  1 -
>>>  drivers/gpu/drm/virtio/virtgpu_drv.h            |  1 -
>>>  include/drm/drmP.h                              |  1 -
>>>  17 files changed, 62 insertions(+), 41 deletions(-)
>>
>> This patch (commit a325725633c2) regresses X.org on QEMU's virtio-vga
>> device. Please see
>>
>>   https://bugzilla.redhat.com/show_bug.cgi?id=1366842
>>
>> complete with a bisection log under
>>
>>   drivers/gpu/drm/virtio/
>>
>> (comment 20).
>>
>> Copying Thorsten so he can include this report in his next v4.8-rc8
>> regression report, if he so chooses. (Commit a325725633c2 is part of
>> v4.8-rc1, but we only managed to identify it now.) The last such report
>> I know of is archived e.g. at
>> <http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1239220.html>.
>>
>>
>> Reported-by: Joachim Frieben <jfrieben at hotmail.com>
> 
> First of all, Joachim, thanks for bisecting this.

(Small correction: while Joachim reported the BZ, the bisection was done
by yours truly. The bisection was painful enough that I'd want to take
"credit" for it -- using the stock Fedora kernel config for the
bisection, I think my laptop must have burned through enough electricity
to power a small town from Christmas to New Year's Eve. I *literally*
took naps between the test cycles. (And I mean literally literally.) I
know about "localmodconfig" but it has broken on me before, so I opted
for the Fedora config.)

> I was thinking about this
> bug / issue, while doing my laps in the swimming pool.

If you do that, it's easy to lose count of your laps ;)

> I wanted to add a comment to the bug to tell you that this is likely
> a Xorg xserver issue and not a kernel issue and that there is no need to
> bisect, but it is too late for that now.

Ouch. :/

> Xorg, when running without a Xorg.conf, searches for what it considers
> a "primary" gpu / video-card. Basically, it attempts to bring up the
> right card in setups where there are multiple cards, and if it does not
> find one, it exits with an error.
> 
> The xserver has a 2 step process for finding the primary card:
> 
> 1) It searches for a card which has a vga-bios mapped. As we've
> already determined in the mentioned Red Hat bug, that works for
> the classic qemu emulated video-cards, but not for qemu's virtio-vga.
> 
> 2) If that does not work, Xorg will fall back to any video class device
> on pci-bus 1.
> 
> This fallback has actually been broken in the Xorg xserver for quite a
> while now, and only 2 days ago a patch from Laszlo was merged to fix this.
> 
> Only for things to break again due to this kernel patch.
> 
> Since the whole step 2) thingie is very much tied to x86 machines
> where pci-bus 0 used to be the main bus and pci-bus 1 the agp,
> which is sorta an obsolete assumption nowadays, and since relying
> on bus numbers / enumeration order is a bad idea in general, I'm not
> entirely sure if this counts as a regression.
> 
> I've discussed the problem of the xserver exiting with an error when
> no primary device can be found with some people (ajax) at XDC last week
> since there are other use-cases where the pci-bus 1 fallback does not
> work.
> 
> As such, I've been working on an xserver patch-set to make the xserver
> try harder (pick the first available device) when both steps described
> above fail to find one, which should make things work even with the
> newest (broken / regressed) kernels.
> 
> Given this mail thread, I guess I'm working after all today (I had
> planned a day off)

Apologies...

> and I'll try to wrap up this patch-set and reply
> to this mail with the server patches attached for Joachim and/or
> Laszlo to test.

Thank you Hans, that's very kind of you. (And I also greatly appreciate
your description of the primary card selection logic.)
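
For readers following along, the selection logic Hans describes above can
be sketched roughly as below. This is only an illustration under stated
assumptions, not the xserver's actual code: PciDevice, its fields,
pick_primary() and the allow_first_available flag are all hypothetical
names made up for this sketch. The optional third step models the
"try harder (pick the first available device)" behavior that Hans says
his patch-set is meant to add.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PciDevice:
    """Hypothetical, simplified view of a PCI device (not an Xorg type)."""
    bus: int                   # PCI bus number
    is_video_class: bool       # device reports a video/display class
    has_vga_bios_mapped: bool  # a VGA BIOS is mapped for this device


def pick_primary(devices: Sequence[PciDevice],
                 allow_first_available: bool = False) -> Optional[PciDevice]:
    """Return the device treated as "primary", or None if none qualifies."""
    # Step 1: prefer a card which has a vga-bios mapped.
    for dev in devices:
        if dev.has_vga_bios_mapped:
            return dev

    # Step 2: fall back to any video class device on pci-bus 1.
    for dev in devices:
        if dev.is_video_class and dev.bus == 1:
            return dev

    # Assumed extra step: pick the first available video device
    # when both steps above fail.
    if allow_first_available:
        for dev in devices:
            if dev.is_video_class:
                return dev

    # No primary device found; per the description, Xorg exits with an error.
    return None
```

With a single video class device on bus 0 and no vga-bios mapped (roughly
the virtio-vga situation as described in this thread), both steps fail and
the sketch returns None unless allow_first_available is set.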

> p.s.
> 
> It would be interesting to do a lspci on both a working and a
> non-working kernel to see what exactly is going on here.

I'll upload the outputs to the RHBZ soon.

Thanks!
Laszlo

