blocking ops in drm_sched_cleanup_jobs()

Koenig, Christian Christian.Koenig at amd.com
Mon Sep 16 08:11:03 UTC 2019


Hi Steven,

the problem seems to be that panfrost is trying to sleep while freeing a 
job, e.g. it tries to take a mutex in panfrost_gem_free_object().

That is not allowed any more since we need to free the jobs from atomic 
and even interrupt context.
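
A typical way to satisfy that constraint (just a sketch with made-up 
names, not the actual panfrost code) is to keep the free_job callback 
itself atomic-safe and push the mutex-protected teardown onto a 
workqueue:

#include <linux/mutex.h>
#include <linux/slab.h>
#include <linux/workqueue.h>
#include <drm/gpu_scheduler.h>

static DEFINE_MUTEX(my_teardown_lock); /* illustrative lock */

struct my_job {
	struct drm_sched_job base;
	struct work_struct free_work;
};

/* Runs in process context, so sleeping locks are fine here. */
static void my_job_free_work(struct work_struct *work)
{
	struct my_job *job = container_of(work, struct my_job, free_work);

	mutex_lock(&my_teardown_lock);
	/* drop BO references, unmap, etc. */
	mutex_unlock(&my_teardown_lock);

	kfree(job);
}

/* ->free_job() may be reached from atomic context, so it must not
 * sleep; it only kicks off the deferred work. */
static void my_free_job(struct drm_sched_job *sched_job)
{
	struct my_job *job = container_of(sched_job, struct my_job, base);

	INIT_WORK(&job->free_work, my_job_free_work);
	schedule_work(&job->free_work);
}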

Your suggestion wouldn't work because that way jobs are not freed when 
there isn't a new one to be scheduled: the thread wakes up when the 
last job completes, finds no runnable entity, and goes back to sleep 
inside wait_event_interruptible() without ever reaching the cleanup you 
moved to the top of the loop.

Regards,
Christian.

On 13.09.19 at 16:50, Steven Price wrote:
> Hi,
>
> I hit the below splat randomly with panfrost. From what I can tell this
> is a more general issue which would affect other drivers.
>
> ----8<-----
> [58604.913130] ------------[ cut here ]------------
> [58604.918590] WARNING: CPU: 1 PID: 1758 at kernel/sched/core.c:6556 __might_sleep+0x74/0x98
> [58604.927965] do not call blocking ops when !TASK_RUNNING; state=1 set at [<0c590494>] prepare_to_wait_event+0x104/0x164
> [58604.940047] Modules linked in: panfrost gpu_sched
> [58604.945370] CPU: 1 PID: 1758 Comm: pan_js Not tainted 5.3.0-rc1+ #13
> [58604.952500] Hardware name: Rockchip (Device Tree)
> [58604.957815] [<c0111150>] (unwind_backtrace) from [<c010c99c>] (show_stack+0x10/0x14)
> [58604.966521] [<c010c99c>] (show_stack) from [<c07adbb4>] (dump_stack+0x9c/0xd4)
> [58604.974639] [<c07adbb4>] (dump_stack) from [<c0121da8>] (__warn+0xe8/0x104)
> [58604.982462] [<c0121da8>] (__warn) from [<c0121e08>] (warn_slowpath_fmt+0x44/0x6c)
> [58604.990867] [<c0121e08>] (warn_slowpath_fmt) from [<c014eccc>] (__might_sleep+0x74/0x98)
> [58604.999973] [<c014eccc>] (__might_sleep) from [<c07c73d8>] (__mutex_lock+0x38/0x948)
> [58605.008690] [<c07c73d8>] (__mutex_lock) from [<c07c7d00>] (mutex_lock_nested+0x18/0x20)
> [58605.017841] [<c07c7d00>] (mutex_lock_nested) from [<bf00b54c>] (panfrost_gem_free_object+0x60/0x10c [panfrost])
> [58605.029430] [<bf00b54c>] (panfrost_gem_free_object [panfrost]) from [<bf00cecc>] (panfrost_job_put+0x138/0x150 [panfrost])
> [58605.042076] [<bf00cecc>] (panfrost_job_put [panfrost]) from [<bf00121c>] (drm_sched_cleanup_jobs+0xc8/0xe0 [gpu_sched])
> [58605.054417] [<bf00121c>] (drm_sched_cleanup_jobs [gpu_sched]) from [<bf001300>] (drm_sched_main+0xcc/0x26c [gpu_sched])
> [58605.066620] [<bf001300>] (drm_sched_main [gpu_sched]) from [<c0146cfc>] (kthread+0x13c/0x154)
> [58605.076226] [<c0146cfc>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> [58605.084346] Exception stack(0xe959bfb0 to 0xe959bff8)
> [58605.090046] bfa0:                                     00000000 00000000 00000000 00000000
> [58605.099250] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [58605.108480] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [58605.116210] irq event stamp: 179
> [58605.119955] hardirqs last  enabled at (187): [<c017f7e4>] console_unlock+0x564/0x5c4
> [58605.128935] hardirqs last disabled at (202): [<c017f308>] console_unlock+0x88/0x5c4
> [58605.137788] softirqs last  enabled at (216): [<c0102334>] __do_softirq+0x18c/0x548
> [58605.146543] softirqs last disabled at (227): [<c0129528>] irq_exit+0xc4/0x10c
> [58605.154618] ---[ end trace f65bdbd9ea9adfc0 ]---
> ----8<-----
>
> The problem is that drm_sched_main() calls drm_sched_cleanup_jobs() as
> part of the condition of wait_event_interruptible:
>
>> 		wait_event_interruptible(sched->wake_up_worker,
>> 					 (drm_sched_cleanup_jobs(sched),
>> 					 (!drm_sched_blocked(sched) &&
>> 					  (entity = drm_sched_select_entity(sched))) ||
>> 					 kthread_should_stop()));
> When drm_sched_cleanup_jobs() is called *after* a wait (i.e. after
> prepare_to_wait_event() has been called), then any might_sleep() will
> moan loudly about it. This doesn't seem to happen often (I've only
> triggered it once) because usually drm_sched_cleanup_jobs() either
> doesn't sleep or does the sleeping during the first call that
> wait_event_interruptible() makes (which is before the task state is set).
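>
> For reference, wait_event_interruptible() boils down to roughly the
> following (heavily simplified from ___wait_event() in
> include/linux/wait.h; signal handling omitted), which shows why only
> the first check of the condition runs with the task still runnable:
>
> 	/* first check: task is still TASK_RUNNING, sleeping is OK */
> 	if (!(condition)) {
> 		for (;;) {
> 			/* sets current->state = TASK_INTERRUPTIBLE */
> 			prepare_to_wait_event(&wq_head, &wq_entry,
> 					      TASK_INTERRUPTIBLE);
> 			/* re-check: sleeping in here triggers the splat */
> 			if (condition)
> 				break;
> 			schedule();
> 		}
> 		finish_wait(&wq_head, &wq_entry);
> 	}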
>
> I don't really understand why drm_sched_cleanup_jobs() needs to be
> called here, a simple change like below 'fixes' it. But I presume
> there's some reason for the call being part of the
> wait_event_interruptible condition. Can anyone shed light on this?
>
> The code was introduced in commit 5918045c4ed4 ("drm/scheduler: rework job destruction")
>
> Steve
>
> ----8<-----
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 9a0ee74d82dc..528f295e3a31 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -699,11 +699,12 @@ static int drm_sched_main(void *param)
>   		struct drm_sched_job *sched_job;
>   		struct dma_fence *fence;
>   
> +		drm_sched_cleanup_jobs(sched);
> +
>   		wait_event_interruptible(sched->wake_up_worker,
> -					 (drm_sched_cleanup_jobs(sched),
>   					 (!drm_sched_blocked(sched) &&
>   					  (entity = drm_sched_select_entity(sched))) ||
> -					 kthread_should_stop()));
> +					 kthread_should_stop());
>   
>   		if (!entity)
>   			continue;


