[PATCH] drm/amd: Fix random crashes due to bad kfree

Rafael J. Wysocki rafael at kernel.org
Tue Jan 7 18:06:33 UTC 2025


CC Hans who's been dealing with the acpi_video code for some time.

On Thu, Dec 26, 2024 at 2:27 AM Chris Bainbridge
<chris.bainbridge at gmail.com> wrote:
>
> On Thu, Dec 26, 2024 at 12:19:02AM +0100, Tobias Jakobi wrote:
> > Hi Chris!
> >
> > On 12/26/24 00:09, Chris Bainbridge wrote:
> >
> > > Commit c6a837088bed ("drm/amd/display: Fetch the EDID from _DDC if
> > > available for eDP") added function dm_helpers_probe_acpi_edid, which
> > > fetches the EDID from the BIOS by calling acpi_video_get_edid.
> > > acpi_video_get_edid returns a pointer to the EDID, but this pointer does
> > > not originate from kmalloc - it is actually the internal "pointer" field
> > > from an acpi_buffer struct (which did come from kmalloc).
> > > dm_helpers_probe_acpi_edid then attempts to kfree the EDID pointer,
> > > resulting in memory corruption which leads to random, intermittent
> > > crashes (e.g. 4% of boots will fail with some Oops).
> > >
> > > Fix this by allocating a new array (which can be safely freed) for the
> > > EDID data in acpi_video_get_edid, and correctly freeing the acpi_buffer.
> >
> > Hmm, maybe I'm missing something here. But shouldn't it suffice to just
> > remove the kfree call in dm_helpers_probe_acpi_edid()?
>
> Yes, that would work to fix the bad kfree, but there would be a small
> memory leak of the acpi_buffer struct. It's not a huge problem since
> this code is rarely run, and the Nouveau code has never tried to free
> the edid buffer and apparently nobody noticed, but it would be better to
> do the correct thing.
>
> One other curiosity is this comment in the code that allocates the
> memory:
>
> case ACPI_ALLOCATE_BUFFER:
>         /*
>          * Allocate a new buffer. We directectly call acpi_os_allocate here to
>          * purposefully bypass the (optionally enabled) internal allocation
>          * tracking mechanism since we only want to track internal
>          * allocations. Note: The caller should use acpi_os_free to free this
>          * buffer created via ACPI_ALLOCATE_BUFFER.
>          */
>
> Which makes me wonder if all the calls to kfree on acpi_buffer structs
> with ACPI_ALLOCATE_BUFFER in acpi_video.c should actually be calls to
> acpi_os_free instead? I used kfree just for consistency with the
> existing code.

In Linux, acpi_os_free() is a wrapper around kfree(), so it doesn't
matter in practice which one is used, although in principle you are
right.

So Hans points out that you should check the kmemdup() return value
against NULL because returning a nonzero length along with a NULL
buffer pointer is confusing at best.

But if you want to kmemdup() the buffer, it is better to additionally
check if the length of it really equals to the 'length' value.

And I would put the entire buffer management into
acpi_video_device_EDID() including the kmemdup() and make it return
the actual length of the buffer on success.


More information about the amd-gfx mailing list