[PATCH] drm/amd: Fix random crashes due to bad kfree
Tobias Jakobi
tjakobi at math.uni-bielefeld.de
Thu Dec 26 12:29:15 UTC 2024
On 12/26/24 02:27, Chris Bainbridge wrote:
> On Thu, Dec 26, 2024 at 12:19:02AM +0100, Tobias Jakobi wrote:
>> Hi Chris!
>>
>> On 12/26/24 00:09, Chris Bainbridge wrote:
>>
>>> Commit c6a837088bed ("drm/amd/display: Fetch the EDID from _DDC if
>>> available for eDP") added function dm_helpers_probe_acpi_edid, which
>>> fetches the EDID from the BIOS by calling acpi_video_get_edid.
>>> acpi_video_get_edid returns a pointer to the EDID, but this pointer does
>>> not originate from kmalloc - it is actually the internal "pointer" field
>>> from an acpi_buffer struct (which did come from kmalloc).
>>> dm_helpers_probe_acpi_edid then attempts to kfree the EDID pointer,
>>> resulting in memory corruption which leads to random, intermittent
>>> crashes (e.g. 4% of boots will fail with some Oops).
>>>
>>> Fix this by allocating a new array (which can be safely freed) for the
>>> EDID data in acpi_video_get_edid, and correctly freeing the acpi_buffer.
>> Hmm, maybe I'm missing something here. But shouldn't it suffice to just
>> remove the kfree call in dm_helpers_probe_acpi_edid()?
> Yes, that would work to fix the bad kfree, but there would be a small
> memory leak of the acpi_buffer struct. It's not a huge problem since
> this code is rarely run, and the Nouveau code has never tried to free
> the edid buffer and apparently nobody noticed, but it would be better to
> do the correct thing.
OK, thanks for explaining. I didn't immediately understand that
something was leaking memory. Only that we were freeing something that
we are not supposed to free.
> One other curiosity is this comment in the code that allocates the
> memory:
>
> case ACPI_ALLOCATE_BUFFER:
> /*
> * Allocate a new buffer. We directectly call acpi_os_allocate here to
> * purposefully bypass the (optionally enabled) internal allocation
> * tracking mechanism since we only want to track internal
> * allocations. Note: The caller should use acpi_os_free to free this
> * buffer created via ACPI_ALLOCATE_BUFFER.
> */
>
> Which makes me wonder if all the calls to kfree on acpi_buffer structs
> with ACPI_ALLOCATE_BUFFER in acpi_video.c should actually be calls to
> acpi_os_free instead? I used kfree just for consistency with the
> existing code.
Wouldn't it make more sense to do the memdup handling in
acpi_video_device_EDID()? This way you have both alloc and free in the
same function. But I'm no expert when it comes to the ACPI kernel code.
Just my two cents :-D
With best wishes,
Tobias
More information about the amd-gfx
mailing list