[PATCH 7/8] video/aperture: Only remove sysfb on the default vga pci device

Aaron Plattner aplattner at nvidia.com
Tue Apr 4 20:59:33 UTC 2023


On 4/4/23 1:18 PM, Daniel Vetter wrote:
> Instead of calling aperture_remove_conflicting_devices() to remove the
> conflicting devices, just call to aperture_detach_devices() to detach
> the device that matches the same PCI BAR / aperture range. Since the
> former is just a wrapper of the latter plus a sysfb_disable() call,
> and now that's done in this function but only for the primary devices.
> 
> This fixes a regression introduced by ee7a69aa38d8 ("fbdev: Disable
> sysfb device registration when removing conflicting FBs"), where we
> remove the sysfb when loading a driver for an unrelated pci device,
> resulting in the user loosing their efifb console or similar.
> 
> Note that in practice this only is a problem with the nvidia blob,
> because that's the only gpu driver people might install which does not
> come with an fbdev driver of it's own. For everyone else the real gpu
> driver will restore a working console.

It might be worth noting that this also affects devices that have no 
driver installed, or where the driver failed to initialize or was 
configured not to set a mode. E.g. I reproduced this problem on a laptop 
with i915.modeset=0 and an NVIDIA driver that calls 
drm_fbdev_generic_setup. It would also reproduce on a system that sets 
modeset=0 (or has a GPU that's too new for its corresponding kernel 
driver) and that passes an NVIDIA GPU through to a VM using vfio-pci 
since that also calls aperture_remove_conflicting_pci_devices.

I agree that in practice this will mostly affect people with our driver 
until I get my changes to add drm_fbdev_generic_setup checked in. But 
these other cases don't seem all that unlikely to me.

-- Aaron

> Also note that in the referenced bug there's confusion that this same
> bug also happens on amdgpu. But that was just another amdgpu specific
> regression, which just happened to happen at roughly the same time and
> with the same user-observable symptoms. That bug is fixed now, see
> https://bugzilla.kernel.org/show_bug.cgi?id=216331#c15
> 
> Note that we should not have any such issues on non-pci multi-gpu
> issues, because I could only find two such cases:
> - SoC with some external panel over spi or similar. These panel
>    drivers do not use drm_aperture_remove_conflicting_framebuffers(),
>    so no problem.
> - vga+mga, which is a direct console driver and entirely bypasses all
>    this.
> 
> For the above reasons the cc: stable is just notionally, this patch
> will need a backport and that's up to nvidia if they care enough.
> 
> v2:
> - Explain a bit better why other multi-gpu that aren't pci shouldn't
>    have any issues with making all this fully pci specific.
> 
> v3
> - polish commit message (Javier)
> 
> Fixes: ee7a69aa38d8 ("fbdev: Disable sysfb device registration when removing conflicting FBs")
> Tested-by: Aaron Plattner <aplattner at nvidia.com>
> Reviewed-by: Javier Martinez Canillas <javierm at redhat.com>
> References: https://bugzilla.kernel.org/show_bug.cgi?id=216303#c28
> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> Cc: Aaron Plattner <aplattner at nvidia.com>
> Cc: Javier Martinez Canillas <javierm at redhat.com>
> Cc: Thomas Zimmermann <tzimmermann at suse.de>
> Cc: Helge Deller <deller at gmx.de>
> Cc: Sam Ravnborg <sam at ravnborg.org>
> Cc: Alex Deucher <alexander.deucher at amd.com>
> Cc: <stable at vger.kernel.org> # v5.19+ (if someone else does the backport)
> ---
>   drivers/video/aperture.c | 7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c
> index 8f1437339e49..2394c2d310f8 100644
> --- a/drivers/video/aperture.c
> +++ b/drivers/video/aperture.c
> @@ -321,15 +321,16 @@ int aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const char *na
>   
>   	primary = pdev == vga_default_device();
>   
> +	if (primary)
> +		sysfb_disable();
> +
>   	for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) {
>   		if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM))
>   			continue;
>   
>   		base = pci_resource_start(pdev, bar);
>   		size = pci_resource_len(pdev, bar);
> -		ret = aperture_remove_conflicting_devices(base, size, name);
> -		if (ret)
> -			return ret;
> +		aperture_detach_devices(base, size);
>   	}
>   
>   	if (primary) {


More information about the dri-devel mailing list