[PATCH] drm/amd/display: use GFP_ATOMIC for bounding box

Christian König ckoenig.leichtzumerken at gmail.com
Wed Jun 5 09:06:02 UTC 2024


On 04.06.24 at 16:57, Arnd Bergmann wrote:
> On Tue, Jun 4, 2024, at 16:22, Christian König wrote:
>> On 04.06.24 at 15:50, Alex Deucher wrote:
>>> This can be called in atomic context.  Should fix:
>>>
>>> BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306
>>> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 449, name: kworker/u64:8
>>> preempt_count: 2, expected: 0
>>> RCU nest depth: 0, expected: 0
>>> Preemption disabled at:
>>> [<ffffffffc0ce1580>] dc_fpu_begin+0x30/0xd0 [amdgpu]
>>> CPU: 5 PID: 449 Comm: kworker/u64:8 Tainted: G        W          6.8.0+ #35
>>> Hardware name: System manufacturer System Product Name/ROG STRIX X570-E GAMING WIFI II, BIOS 4204 02/24/2022
>>> Workqueue: events_unbound async_run_entry_fn
>> That most likely only papers over the real problem and is not a valid fix.
>>
>> The question is why is that an atomic context?  If the function is used
>> under a spinlock then this might indeed be the right fix.
>>
>> If it's because of floating point operations, then this won't work
>> either.
> It looks like there is only one caller, and this turns on
> floating point instructions just for the call:
>
>          if (dc->res_pool->funcs->update_bw_bounding_box) {
>                  DC_FP_START();
>                  dc->res_pool->funcs->update_bw_bounding_box(dc, dc->clk_mgr->bw_params);
>                  DC_FP_END();
>          }
>
> but then they are enabled again inside of the function.
>
> If we can drop the outer DC_FP_START(), that means
> the GFP_KERNEL allocation works. On the other hand if
> we actually have to enable it before calling into
> the function (e.g. because there is an architecture that
> has incompatible function calling conventions when FP
> is enabled), the inner one is redundant, but we can
> potentially move the kmemdup() into the caller and
> pass the copy by reference.

Yes, that's exactly the case.

The DC_FP_START() and DC_FP_END() calls need to be outside of the
function because the compiler has no idea that it can't move any
floating point instructions outside of the critical section between the
two calls.
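
To illustrate the layout, here is a minimal user-space sketch, not the actual amdgpu code. mock_fp_begin()/mock_fp_end() are hypothetical stand-ins for DC_FP_START()/DC_FP_END(), and the noinline attribute (GCC/Clang) only approximates keeping the FP code in a separately compiled unit. The wrapper contains no floating point at all, so there is nothing the compiler could move out of the section:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for DC_FP_START()/DC_FP_END(). The real ones
 * disable preemption and save the FPU state; here we only track a flag. */
static bool fp_enabled;

static void mock_fp_begin(void) { fp_enabled = true; }
static void mock_fp_end(void)   { fp_enabled = false; }

/* All floating point math lives in this function. noinline keeps the
 * compiler from merging it into the caller, where FP instructions could
 * otherwise be scheduled outside the begin/end pair. */
__attribute__((noinline))
static int scale_bandwidth_fp(int mbps)
{
	assert(fp_enabled);	/* must only run inside the FP section */
	return (int)(mbps * 1.5);
}

/* Integer-only wrapper: the only place that opens the FP section. */
int scale_bandwidth(int mbps)
{
	int ret;

	mock_fp_begin();
	ret = scale_bandwidth_fp(mbps);
	mock_fp_end();
	return ret;
}
```

Callers only ever see scale_bandwidth(); the FP helper is not reachable without going through the begin/end pair.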

So yes, calls to DC_FP_START() and DC_FP_END() from within
floating-point-enabled code should be forbidden somehow.
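
One possible way to enforce that, sketched in user space under my own assumptions: a begin helper that refuses to nest. The names mock_fp_begin_strict() and mock_fp_end_strict() are hypothetical, and in the kernel the rejection would more likely be a WARN_ON_ONCE() than a return value:

```c
#include <assert.h>
#include <stdbool.h>

/* True while an FP section is open. In this sketch it is a plain global;
 * real code would need per-CPU or per-task state. */
static bool in_fp_section;

/* Opens the FP section. Returns false instead of nesting when called
 * from code that is already inside a section. */
bool mock_fp_begin_strict(void)
{
	if (in_fp_section)
		return false;	/* nested call: reject it */
	in_fp_section = true;
	return true;
}

/* Closes the FP section; calling it without a matching begin is a bug. */
void mock_fp_end_strict(void)
{
	assert(in_fp_section);
	in_fp_section = false;
}
```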

And once that is done, the caller should allocate any parameters needed
and pass them by reference to avoid the GFP_ATOMIC.
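
Roughly what I have in mind, again as a user-space sketch rather than a patch: struct bbox_params, dup_may_sleep() and the section_begin()/section_end() helpers are all hypothetical. dup_may_sleep() stands in for kmemdup() with GFP_KERNEL and asserts that it is never called inside the section:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct bbox_params {		/* hypothetical parameter block */
	int num_states;
	int clock_khz[4];
};

static bool atomic_section;	/* models the preempt-disabled FP region */

static void section_begin(void) { atomic_section = true; }
static void section_end(void)   { atomic_section = false; }

/* Stand-in for kmemdup(..., GFP_KERNEL): it may sleep, so it must not
 * be called while the section is open. */
static void *dup_may_sleep(const void *src, size_t len)
{
	void *copy;

	assert(!atomic_section);
	copy = malloc(len);
	if (copy)
		memcpy(copy, src, len);
	return copy;
}

/* FP worker: receives its scratch copy by reference, never allocates. */
static void update_bbox_fp(struct bbox_params *scratch)
{
	assert(atomic_section);
	for (int i = 0; i < scratch->num_states; i++)
		scratch->clock_khz[i] = (int)(scratch->clock_khz[i] * 0.5);
}

/* Caller: allocate first, then enter the section, free afterwards. */
int update_bbox(const struct bbox_params *defaults, int *first_clock_khz)
{
	struct bbox_params *scratch = dup_may_sleep(defaults, sizeof(*defaults));

	if (!scratch)
		return -1;

	section_begin();
	update_bbox_fp(scratch);
	section_end();

	*first_clock_khz = scratch->clock_khz[0];
	free(scratch);
	return 0;
}
```

With this ordering the sleeping allocation always happens before preemption would be disabled, so no atomic allocation is needed.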

Regards,
Christian.

>
>        Arnd


