amdgpu didn't start with pci=nocrs parameter, get error "Fatal error during GPU init"
Mikhail Gavrilov
mikhail.v.gavrilov at gmail.com
Fri Dec 15 11:45:47 UTC 2023
On Tue, Feb 28, 2023 at 5:43 PM Christian König
<ckoenig.leichtzumerken at gmail.com> wrote:
>
> The point is it doesn't need to talk to the amdgpu hardware. What it
> does is that it talks to the good old VGA/VESA emulation and that just
> happens to be still enabled by the BIOS/GRUB.
>
> And that VGA/VESA emulation doesn't need any BAR or whatever to keep the
> hw running in the state where it was initialized before the kernel
> started. The kernel just grabs the addresses where it needs to write the
> display data and keeps going with that.
>
> But when a hw specific driver wants to load this is the first thing
> which gets disabled because we need to load new firmware. And with the
> BARs disabled this can't be re-enabled without rebooting the system.
>
> > My suggestion is that if
> > amdgpu fails to talk to the hardware, then let another suitable driver
> > do it. I attached a system log when I apply "pci=nocrs" with
> > "modprobe.blacklist=amdgpu" for showing that graphics work right in
> > this case.
> > To do this, does the Linux module loading mechanism need to be refined?
>
> That's actually working as expected. The real problem is that the BIOS
> on that system is so broken that we can't access the hw correctly.
>
> What we could do is check the BARs very early on and refuse to
> load when they are disabled. The problem with this approach is that there
> are systems where it is normal that the BARs are disabled until the
> driver loads and only get enabled during the hardware initialization process.
>
> What you might want to look into is to find a quirk for the BIOS to
> properly enable the nvme controller.
>
That's interesting. I noticed that amdgpu now works even with the
[pci=nocrs] parameter on 6.7.0-0.rc4 and later kernels.
Does that mean the BARs became available?
I have attached the kernel log and lspci output. What changed?
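
For anyone who wants to compare a working and a non-working boot, here is a
minimal Python sketch that reads the PCI command register from a device's
config space in sysfs and reports whether I/O and memory decoding are enabled
(lspci marks regions as [disabled] when these bits are clear). The sysfs path
layout and the register offsets follow the standard PCI header; the function
names are my own, and this checks only the decode bits, not whether the BAR
addresses were actually assigned.

```python
import struct
from pathlib import Path

PCI_COMMAND = 0x04          # offset of the 16-bit command register
PCI_COMMAND_IO = 0x1        # I/O space decoding enabled
PCI_COMMAND_MEMORY = 0x2    # memory space decoding enabled
PCI_COMMAND_MASTER = 0x4    # bus mastering enabled


def decode_command(config: bytes) -> dict:
    """Decode the command register from a raw config space dump."""
    if len(config) < PCI_COMMAND + 2:
        raise ValueError("config space dump too short")
    # Config space is little-endian; the command register is 16 bits wide.
    (command,) = struct.unpack_from("<H", config, PCI_COMMAND)
    return {
        "io": bool(command & PCI_COMMAND_IO),
        "memory": bool(command & PCI_COMMAND_MEMORY),
        "bus_master": bool(command & PCI_COMMAND_MASTER),
    }


def read_command(bdf: str) -> dict:
    """Read and decode the command register of the device at address bdf."""
    config = Path("/sys/bus/pci/devices", bdf, "config").read_bytes()
    return decode_command(config)
```

Calling read_command() with the GPU's address as shown by lspci -D (for
example "0000:0b:00.0", which is only a placeholder here) on both kernels
should show whether memory decoding differs between them.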
--
Best Regards,
Mike Gavrilov.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmesg-nvme-down-2.zip
Type: application/zip
Size: 46571 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20231215/e3ad9513/attachment-0002.zip>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lspci.zip
Type: application/zip
Size: 2710 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20231215/e3ad9513/attachment-0003.zip>