[PATCH xserver 4/4] xfree86: promote one GPU screen if (NumScreens == 0 && NumGPUScreens > 0)

Laszlo Ersek lersek at redhat.com
Wed Sep 7 10:20:27 UTC 2016


On 09/07/16 11:54, Hans de Goede wrote:
> Hi,
> 
> On 06-09-16 22:32, Laszlo Ersek wrote:
>> On 09/06/16 14:46, Hans de Goede wrote:
>>> Hi,
>>>
>>> On 06-09-16 13:47, Laszlo Ersek wrote:
>>>> Adding Matt and Ard
>>>>
>>>> On 09/06/16 00:32, Dave Airlie wrote:
>>>
>>> <snip>
>>>
>>>> Picking different primaries on different boots sounds perfectly
>>>> acceptable to me as long as:
>>>> - no explicit DeviceId.BusID is specified by the user,
>>>> - no primary emerges otherwise,
>>>> - there are several secondaries.
>>>>
>>>> This behavior is clearly not ideal in the long term (like at the third,
>>>> fourth, fifth etc boots of the same machine), but the user can
>>>> trivially
>>>> fix it *after* the installation finishes (by setting Device.BusID).
>>>> Plus, the suggested documentation (man page) and the server log are
>>>> clear about it.
>>>>
>>>> Importantly, the suggested behavior (even if non-deterministic) is
>>>> clearly superior to the X Server not starting at all. "Output
>>>> somewhere"
>>>> is better than no output at all. (And, in 99% of the cases, there's
>>>> only
>>>> one secondary, which gives exactly the expected result.)
>>>>
>>>> ... If it made the series acceptable, I'd be happy to tweak patch #4 so
>>>> that the code get active only for
>>>>
>>>>   (NumScreens == 0 && NumGPUScreens == 1)
>>>
>>> I agree that this is a reasonable thing to do, if there is only
>>> one GPUScreen we don't get the problem of which GPUScreen to pick
>>> and then falling back to the single available GPUScreen seems
>>> like the sensible thing to do.
>>>
>>> That still leaves the problem though, that as I wrote on your original
>>> submission, doing this after the probing is done is too late, because
>>> the flags argumet passed into the probeSingleDevice call in
>>> hw/xfree86/common/xf86platformBus.c: xf86platformProbeDev()
>>> (and passed into the driver from there)  must match if the screen is
>>> going to be a normal screen or a GPU-screen.
>>
>> It is true that patch #3 does not take care of this universally (i.e.,
>> in a future-proof way).
>>
>> It does take care of all of the *current* effects of
>> PLATFORM_PROBE_GPU_SCREEN, which is that the modesetting driver sets
>> XF86_ALLOCATE_GPU_SCREEN for xf86AllocateScreen().
>>
>> That in turn makes a difference for the "scrnIndex" and "is_gpu" fields
>> of the newly allocated ScrnInfoRec object.
>>
>> And, patch #3 does handle those as part of the promotion.
>>
>> If the modesetting driver (or any other driver) used
>> PLATFORM_PROBE_GPU_SCREEN for anything more than passing
>> XF86_ALLOCATE_GPU_SCREEN to xf86AllocateScreen(), then this approach
>> wouldn't work.
> 
> And as I've already tried to tell you, the modesetting driver
> does actually use it for more then just XF86_ALLOCATE_GPU_SCREEN.
> 
> Specifically in its ScreenInit it looks at isGPU and sets
> some internal flags based on that, so clearing isGPU after probing
> is too late.

Ah! Okay.

I obviously didn't ignore your feedback, and I did grep the tree for XF86_ALLOCATE_GPU_SCREEN (and saw its only one *direct* use in the modesetting driver that I already mentioned). However, I did miss the fact that the modesetting driver looks at is_gpu after xf86AllocateScreen() returns.

So yea, patch #3 is entirely busted. Thanks for explaining that in depth.

> 
> Either way this is just really fragile and needlessly complicated.
> 
>>> Hmm, looking at this closer, the easy fix I had in mind is not
>>> possible, at least not in the place where I wanted to add it.
>>>
>>> Looking even further, we should already do a fallback to the first
>>> non-primary dev for seat0 servers and at the right time (before calling
>>> probe), take a look at: hw/xfree86/common/xf86Bus.c: xf86BusProbe()
>>>
>>> void
>>> xf86BusProbe(void)
>>> {
>>> #ifdef XSERVER_PLATFORM_BUS
>>>     xf86platformProbe();
>>>     if (ServerIsNotSeat0() && xf86_num_platform_devices > 0)
>>>         return;
>>> #endif
>>> #ifdef XSERVER_LIBPCIACCESS
>>>     xf86PciProbe();
>>> #endif
>>> #if (defined(__sparc__) || defined(__sparc)) && !defined(__OpenBSD__)
>>>     xf86SbusProbe();
>>> #endif
>>> #ifdef XSERVER_PLATFORM_BUS
>>>     xf86platformPrimary();
>>> #endif
>>> }
>>>
>>> A normal server is seat0, so ServerIsNotSeat0() returns False
>>> and we end up calling xf86platformPrimary() at the end of the
>>> function which falls back to the first non-primary dev.
>>>
>>> The only reason this would not work is if xf86PciProbe()
>>> has set primaryBus.type to a value other then BUS_NONE.
>>
>> Sure, that does happen (I've been aware of it, but I haven't realized it
>> as a problem); in this part of the function:
>>
>>     /* If we haven't found a primary device try a different heuristic */
>>     if (primaryBus.type == BUS_NONE && num) {
>>         for (i = 0; i < num; i++) {
>>             uint16_t command;
>>
>>             info = xf86PciVideoInfo[i];
>>             pci_device_cfg_read_u16(info, &command, 4);
>>
>>             if ((command & PCI_CMD_MEM_ENABLE)
>>                 && ((num == 1) || IS_VGA(info->device_class))) {
>>                 if (primaryBus.type == BUS_NONE) {
>>                     primaryBus.type = BUS_PCI;
>>                     primaryBus.id.pci = info;
>>
>> IS_VGA does not apply to virtio-gpu-pci, but num==1 does.
>>
>> Also, virtio-gpu-pci has MMIO decoding enabled in the command register,
>> because virtio-gpu-pci is virtio-1.0 only, and it uses an MMIO BAR for
>> virtio register access and signaling.
>>
>> The corresponding log snippet is
>>
>> [   190.155] (II) xfree86: Adding drm device (/dev/dri/card0)
>> [   389.067] (--) PCI:*(0:0:1:0) 1af4:1050:1af4:1100 rev 1, Mem @
>> 0x10040000/4096, 0x8000000000/8388608, BIOS @ 0x????????/65536
>>
>> Note the "*" in "PCI:*" -- that comes from
>>
>>     /* Print a summary of the video devices found */
>>     for (k = 0; k < num; k++) {
>>         const char *prim = " ";
>>         Bool memdone = FALSE, iodone = FALSE;
>>
>>         info = xf86PciVideoInfo[k];
>>
>>         if (!PCIALWAYSPRINTCLASSES(info->device_class))
>>             continue;
>>
>>         if (xf86IsPrimaryPci(info))
>>             prim = "*";
>>
>>         xf86Msg(X_PROBED, "PCI:%s(%u:%u:%u:%u) %04x:%04x:%04x:%04x ",
>> prim,
>>                 info->domain, info->bus, info->dev, info->func,
>>                 info->vendor_id, info->device_id,
>>                 info->subvendor_id, info->subdevice_id);
>>
>>> So I think we need to figure out why the existing fallback
>>> path is not working.
>>
>> It's because the num==1 part makes xf86PciProbe() assign BUS_PCI to
>> primaryBus.type.
>>
>> Here's a more complete call tree:
>>
>>   InitOutput()
>>
>>     xf86BusProbe()
>>       xf86platformProbe()
>>       xf86PciProbe()
>>         primaryBus.type := BUS_PCI /* due to num == 1 */
>>       xf86platformPrimary()
>>         /* does nothing */
>>
>>     xf86BusConfig()
>>       xf86CallDriverProbe()
>>         foundScreen := xf86platformProbeDev()
>>           /* xf86PlatformDeviceCheckBusID() is never called because
>>            * the user never specified Device.BusID */
>>
>>           /* xf86IsPrimaryPlatform() does not match
>>            * because primaryBus.type == BUS_PCI */
>>
>>           /* autoAddGPU is TRUE, so we call: */
>>           probeSingleDevice(... PLATFORM_PROBE_GPU_SCREEN ...)
>>           /* and because that succeeds, we set: */
>>           foundScreen := TRUE
>>
>>       /* the libpciaccess branch is not reached
>>        * because foundScreen is TRUE from xf86platformProbeDev() */
>>
>> I looked into why xf86IsPrimaryPlatform() wouldn't match this device in
>> xf86platformProbeDev(). Here's my notes from earlier, standing in
>> xf86IsPrimaryPlatform():
>>
>>       - primaryBus.type == BUS_PCI
>>
>>       - primaryBus.id.pci == 0x671600 (pointer to "struct pci_device")
>>
>>       - xf86_platform_devices[0].attribs->busid == "pci:0000:00:01.0"
>>         (good, it matches the location of the virtio-gpu-pci device)
>>
>>       - xf86_platform_devices[0].pdev == 0x670d80 (pointer to a
>>         different "struct pci_device" object)
>>
>>       - however, the *contents* of "*primaryBus.id.pci" and
>>         "*xf86_platform_devices[0].pdev" are identical! They both
>>         describe the "virtio-gpu-pci" device.
>>
>>       This means that xf86IsPrimaryPlatform() returns false despite
>>       "primaryBus" describing the exact same PCI device as
>>       "xf86_platform_devices[0]".
> 
> Ah, see now we're getting somewhere, so maybe we just need to
> teach xf86IsPrimaryPlatform to also accept devices marked
> as primary by the xf86PciProbe() iow, make it deal with
> primaryBus.type == BUS_PCI and in that case compare the
> pci busid and if they match return TRUE ?

Nota bene: this is *exactly* what xf86IsPrimaryPci() does, as far as I can see, just approaching it from the "other side". Compare:

static Bool
xf86IsPrimaryPlatform(struct xf86_platform_device *plat)
{
    return ((primaryBus.type == BUS_PLATFORM) && (plat == primaryBus.id.plat));
}

versus

Bool
xf86IsPrimaryPci(struct pci_device *pPci)
{
    if (primaryBus.type == BUS_PCI)
        return pPci == primaryBus.id.pci;
#ifdef XSERVER_PLATFORM_BUS
    if (primaryBus.type == BUS_PLATFORM)
        if (primaryBus.id.plat->pdev)
            if (MATCH_PCI_DEVICES(primaryBus.id.plat->pdev, pPci))
                return TRUE;
#endif
    return FALSE;
}

So xf86IsPrimaryPci() accepts a platform device as primary PCI device if the "deep description" matches, but xf86IsPrimaryPlatform() does not implement the "vice versa".

I didn't recommend extending xf86IsPrimaryPlatform() like this (that is, to practically unify the two "is primary?" functions) because I found it extremely confusing that the two separate enumeration paths (platform *and* PCI) would *both* have to "look the other way" for determining a primary. I mean, what sense does it make to accept a BUS_PLATFORM device as primary PCI device, but then *also* accept a BUS_PCI device as primary platform device? One of those should really be unnecessary, dependent on the order between the enumerations.

> 
> I think that should fix your problem and it matches the intend
> of how things are supposed to work.

Yes, unifying xf86IsPrimaryPlatform() and xf86IsPrimaryPci() would likely work (both including the same "deep comparison"), but as I said above, I can't fathom why "looking the other way" would be necessary in both places.

Thanks
Laszlo


More information about the xorg-devel mailing list