[Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time

suijingfeng suijingfeng at loongson.cn
Wed Sep 6 09:08:10 UTC 2023


Hi,


On 2023/9/6 14:45, Christian König wrote:
> Am 05.09.23 um 15:30 schrieb suijingfeng:
>> Hi,
>>
>>
>> On 2023/9/5 18:45, Thomas Zimmermann wrote:
>>> Hi
>>>
>>> Am 04.09.23 um 21:57 schrieb Sui Jingfeng:
>>>> From: Sui Jingfeng <suijingfeng at loongson.cn>
>>>>
>>>> On a machine with multiple GPUs, a Linux user has no control over which
>>>> one is primary at boot time. This series tries to solve above mentioned
>>>
>>> If anything, the primary graphics adapter is the one initialized by 
>>> the firmware. I think our boot-up graphics also make this assumption 
>>> implicitly.
>>>
>>
>> Yes, but by the time of DRM drivers get loaded successfully,the 
>> boot-up graphics already finished.
>
> This is an incorrect assumption.
>
> drm_aperture_remove_conflicting_pci_framebuffers() and co don't kill 
> the framebuffer, 

Well, my original description of this technical point was:

1) "Firmware framebuffer device already get killed by the drm_aperture_remove_conflicting_pci_framebuffers() function (or its siblings)"
2) "By the time of DRM drivers get loaded successfully, the boot-up graphics already finished."

The word "killed" here is a rough, coarse description of
how the DRM device driver takes over the firmware framebuffer.
Since something seems to be obscuring our communication,
let's make things clear. See below for a more elaborate description.


> they just remove the current framebuffer driver to avoid further updates.
>
This statement doesn't sound right. For a UEFI environment,
a correct description is that they remove the platform device, not the framebuffer driver.
On machines with UEFI firmware, the framebuffer driver here definitely refers to efifb.
The efifb driver still resides in the system (the Linux kernel).

Please see the aperture_detach_platform_device() function in video/aperture.c

> So what happens (at least for amdgpu) is that we take over the 
> framebuffer,

This statement is also not an accurate description.

Strictly speaking, drm/amdgpu takes over the device (the VRAM hardware),
not the framebuffer.

The phrase "take over" is also dubious here, because drm/amdgpu takes over nothing.

From the perspective of the device-driver model, the GPU hardware *belongs* to the amdgpu driver.
Why would you need to take over something that belonged to you in the first place?

If you build drm/amdgpu into the kernel and get it loaded
before efifb, then there is no need to use the firmware framebuffer
(the discussion here is limited to displaying boot graphics).
In such a case, the so-called "take over" does not happen.

The truth is that efifb creates a platform device, which *occupies*
part of the VRAM hardware resource. Thus, efifb and drm/amdgpu
conflict. They conflict because they share the same hardware
resource. It is the hardware resources (address ranges) used
by the two drivers that conflict, not the efifb driver itself
that conflicts with the drm/amdgpu driver.
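
As a rough illustration of what "conflict" means here, the following
userspace C sketch checks whether two address ranges share any address.
It is not the kernel's code; the struct and function names are
hypothetical, and it assumes ranges are half-open and do not wrap around
the end of the address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type: a half-open physical address range [base, base + size). */
struct addr_range {
	uint64_t base;
	uint64_t size;
};

/* Two ranges conflict when they share at least one address. */
static bool ranges_overlap(struct addr_range a, struct addr_range b)
{
	/* Assumption: neither range wraps around the 64-bit address space. */
	assert(a.base + a.size >= a.base);
	assert(b.base + b.size >= b.base);

	return a.base < b.base + b.size && b.base < a.base + a.size;
}
```

With this check, a firmware framebuffer placed at the start of a VRAM BAR
overlaps that BAR, while a range that begins exactly where the BAR ends
does not.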

Thus, the drm_aperture_remove_conflicting_xxxxxx() functions have to kill
one of the conflicting devices, not the driver. Therefore,
the correct word would be "reclaim":
drm/amdgpu *reclaims* the hardware resource (the VRAM address range) that originally belonged to it.
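
The device/driver distinction can be shown with a toy userspace model.
This is only a sketch, not the aperture helpers themselves: fw_device,
fw_driver_loaded and reclaim_range() are hypothetical names. Reclaiming
a range unregisters only the firmware devices that overlap it; the
generic driver stays loaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FW_DEVS 4

/* Hypothetical stand-in for a firmware-framebuffer platform device. */
struct fw_device {
	uint64_t base;
	uint64_t size;
	bool registered;
};

static struct fw_device fw_devs[MAX_FW_DEVS];

/* Stand-in for the generic driver (efifb in this thread); reclaiming never unloads it. */
static bool fw_driver_loaded = true;

/*
 * Unregister every firmware device whose range overlaps
 * [base, base + size). Returns the number of devices removed.
 */
static size_t reclaim_range(uint64_t base, uint64_t size)
{
	size_t removed = 0;

	/* Assumption: the range is non-empty and does not wrap around. */
	assert(size > 0 && base + size > base);

	for (size_t i = 0; i < MAX_FW_DEVS; i++) {
		struct fw_device *d = &fw_devs[i];

		if (d->registered &&
		    base < d->base + d->size && d->base < base + size) {
			d->registered = false; /* the device goes away */
			removed++;
		}
	}
	return removed; /* fw_driver_loaded is deliberately left untouched */
}
```

In this model, a native driver would call reclaim_range() with its own
VRAM range before using the hardware; devices outside that range are
left alone.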

The modeset state (including the framebuffer content) still resides in the amdgpu device.
You just get the dirty framebuffer image in the framebuffer object.
But the framebuffer object has already been dirty since the UEFI firmware stage.

In conclusion, "reclaim" is more accurate than "take over".
And as far as I understand, drm/amdgpu takes over nothing and gains nothing.

Well, you are welcome to correct me if I'm wrong.


