Annoying AMDGPU boot-time warning due to simplefb / amdgpu resource clash
Javier Martinez Canillas
javierm at redhat.com
Mon Jun 27 08:02:13 UTC 2022
Hello Linus,
On 6/26/22 20:54, Linus Torvalds wrote:
> So this has been going on for a while, and it's quite annoying.
>
> At bootup, my main desktop (Threadripper 3970X with radeon graphics)
> now complains about
>
> resource sanity check: requesting [mem 0xd0000000-0xdfffffff], which
> spans more than BOOTFB [mem 0xd0000000-0xd02fffff]
>
> and then gives me a nasty callchain that is basically the amdgpu probe
> sequence ending in amdgpu_bo_init() doing the
> arch_io_reserve_memtype_wc() which is then what complains.
>
> That "BOOTFB" resource is from sysfb_simplefb.c, and I think what
> started triggering this is commit c96898342c38 ("drivers/firmware:
> Don't mark as busy the simple-framebuffer IO resource").
>
> Because it turns out that that removed the IORESOURCE_BUSY, which in
> turn is what makes the resource conflict code complain about it now,
> because
>
> /*
> * if a resource is "BUSY", it's not a hardware resource
> * but a driver mapping of such a resource; we don't want
> * to warn for those; some drivers legitimately map only
> * partial hardware resources. (example: vesafb)
> */
>
> so the issue is that now the resource code - correctly - says "hey,
> you have *two* conflicting driver mappings".
>
> And that commit claims it did it because "which can lead to drivers
> requesting the same memory resource to fail", but - once again - the
> link in the commit that might actually tell more is just one of those
> useless patch submission links again.
>
> So who knows why that commit was actually done, but it's causing annoyance.
>
The flag was dropped because it was causing drivers that request their
memory resource with pci_request_region() to fail with -EBUSY (e.g. the
vmwgfx driver):
https://www.spinics.net/lists/dri-devel/msg329672.html
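
To make the trade-off concrete, here is a minimal userspace sketch of the
two effects of the flag discussed above. It is a simplified model and not
the kernel's actual resource code: MODEL_BUSY, struct model_res,
model_request() and model_sanity_warns() are hypothetical names invented
for illustration, and the "request" rule (fail on any overlap with a busy
entry) is an assumption that only mirrors the -EBUSY behaviour described
in this thread. The address ranges are the ones from the warning.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_BUSY 0x1u /* hypothetical stand-in for IORESOURCE_BUSY */

struct model_res {
	uint64_t start, end; /* inclusive address range */
	unsigned int flags;
};

static bool model_overlaps(const struct model_res *r,
			   uint64_t start, uint64_t end)
{
	return r->start <= end && start <= r->end;
}

/* Assumed rule: a request fails if it overlaps an entry marked busy. */
static int model_request(const struct model_res *r,
			 uint64_t start, uint64_t end)
{
	if (model_overlaps(r, start, end) && (r->flags & MODEL_BUSY))
		return -EBUSY;
	return 0;
}

/*
 * Returns true when a "spans more than" warning would be emitted:
 * the request overlaps the entry, does not fit entirely inside it,
 * and the entry is not marked busy.
 */
static bool model_sanity_warns(const struct model_res *r,
			       uint64_t start, uint64_t end)
{
	if (!model_overlaps(r, start, end))
		return false;
	if (r->start <= start && r->end >= end)
		return false; /* request fits inside the entry */
	if (r->flags & MODEL_BUSY)
		return false; /* driver mapping: not warned about */
	return true;
}
```

With BOOTFB at [0xd0000000-0xd02fffff] and a request for
[0xd0000000-0xdfffffff], the model fails the request but stays quiet when
the entry is busy, and lets the request through but warns when it is not.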
> If simplefb is actually still using that frame buffer, it's a problem.
> If it isn't, then maybe that resource should have been released?
>
It's supposed to be released once amdgpu asks for conflicting framebuffers
to be removed by calling drm_aperture_remove_conflicting_pci_framebuffers().
I'm not familiar with the amdgpu driver, but maybe that call has to be made
earlier, before arch_io_reserve_memtype_wc()?
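
The ordering question can be illustrated with a small userspace sketch.
It is only a model under stated assumptions, not amdgpu or DRM code:
struct fw_fb, model_remove_conflicting() and model_reserve_warns() are
hypothetical names, and whether amdgpu's probe order is really the cause
is exactly what is unknown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the firmware-provided framebuffer resource. */
struct fw_fb {
	uint64_t start, end; /* inclusive address range */
	bool registered;     /* still present in the resource tree? */
};

/*
 * Models a native driver asking for conflicting firmware framebuffers
 * to be removed: the entry is released if it overlaps the aperture.
 */
static void model_remove_conflicting(struct fw_fb *fb,
				     uint64_t ap_start, uint64_t ap_end)
{
	if (fb->registered && fb->start <= ap_end && ap_start <= fb->end)
		fb->registered = false;
}

/*
 * Returns true if reserving [start, end] would trip the sanity check:
 * the (non-busy) firmware entry is still registered and the request
 * overlaps it without fitting entirely inside it.
 */
static bool model_reserve_warns(const struct fw_fb *fb,
				uint64_t start, uint64_t end)
{
	if (!fb->registered)
		return false;
	if (fb->end < start || fb->start > end)
		return false;
	return !(fb->start <= start && fb->end >= end);
}
```

In this model, reserving the whole aperture while the firmware entry is
still registered warns; calling the removal step first releases the entry
and the same reservation then passes silently.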
> I really think that commit c96898342c38 is buggy. It talks about "let
> drivers to request it as busy instead", but then it registers a
> resource that isn't actually a proper real resource. It's just a
> random incomplete chunk of the actual real thing, so it will still
> interfere with resource allocation, and in fact now interferes even
> with that "set memtype" because of this valid warning.
>
That registered resource is what the firmware provides so that drivers can
use the system framebuffer for scan-out. It's true that it's not the real
thing, since a native driver would kick it out (leading to a resource
release) and register the real aperture.
--
Best regards,
Javier Martinez Canillas
Linux Engineering
Red Hat