[External] Re: [PATCH 2/2] sched: mark PRINTK_DEFERRED_CONTEXT_MASK in __schedule()

Chengming Zhou zhouchengming at bytedance.com
Mon Sep 28 10:04:23 UTC 2020


On 2020/9/28 5:01 PM, Peter Zijlstra wrote:
> On Mon, Sep 28, 2020 at 04:54:53PM +0800, Chengming Zhou wrote:
>> On 2020/9/28 3:32 PM, Peter Zijlstra wrote:
>>> On Mon, Sep 28, 2020 at 12:11:30AM +0800, Chengming Zhou wrote:
>>>> The WARN_ON/WARN_ON_ONCE with rq lock held in __schedule() should be
>>>> deferred by marking the PRINTK_DEFERRED_CONTEXT_MASK, or will cause
>>>> deadlock on rq lock in the printk path.
>>> It also shouldn't happen in the first place, so who bloody cares.
>> Yes, but if our box deadlock just because a WARN_ON_ONCE, we have to
>> reboot : (
> You have to reboot anyway to get into the fixed kernel.

Mostly, the WARN_ON_ONCE fired in the perf code on our machines, and we
actually don't care much whether perf works : )   Looks like we have to
find and fix that perf bug before going on...

>> So these WARN_ON_ONCE have BUG_ON effect ?  Or we should change to use
>> BUG_ON ?
> Or use a sane printk implementation, I never suffer this.

Well, you are lucky. So it's a problem in our printk implementation.

The deadlock path is:

printk
  vprintk_emit
    console_unlock
      vt_console_print
        hide_cursor
          bit_cursor
            soft_cursor
              queue_work_on
                __queue_work
                  try_to_wake_up
                    _raw_spin_lock
                      native_queued_spin_lock_slowpath
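
To make the path above concrete, here is a minimal user-space sketch of the
same pattern, not kernel code. Everything in it is hypothetical: rq_lock is a
pthread mutex standing in for the runqueue lock, console_flush() stands in for
the console_unlock ... try_to_wake_up chain, and deferred_ctx stands in for a
"deferred printk" context flag of the kind this patch proposes. The console
path uses trylock so that the sketch reports the would-be deadlock instead of
hanging.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the runqueue lock (hypothetical, not the kernel's rq lock). */
static pthread_mutex_t rq_lock = PTHREAD_MUTEX_INITIALIZER;

static int deferred_ctx;        /* hypothetical "defer console output" flag */
static char deferred_buf[256];  /* messages parked while the flag is set */
static int would_deadlock;      /* set when the console path finds rq_lock busy */

/* Models the console path, which ends in a wakeup that needs rq_lock. */
static void console_flush(const char *msg)
{
    /* trylock, so a lock already held by this thread is reported, not waited on */
    if (pthread_mutex_trylock(&rq_lock) != 0) {
        would_deadlock = 1;
        return;
    }
    fputs(msg, stdout);
    pthread_mutex_unlock(&rq_lock);
}

/* Models printk: either park the message or go straight to the console. */
static void sketch_printk(const char *msg)
{
    if (deferred_ctx) {
        strncat(deferred_buf, msg,
                sizeof(deferred_buf) - strlen(deferred_buf) - 1);
        return;
    }
    console_flush(msg);
}

/* Flushes parked messages once the lock has been dropped. */
static void flush_deferred(void)
{
    if (deferred_buf[0] != '\0') {
        console_flush(deferred_buf);
        deferred_buf[0] = '\0';
    }
}

/*
 * Models a warning raised with rq_lock held.
 * Returns 1 if the console path would have deadlocked, 0 otherwise.
 */
static int sketch_schedule(int defer)
{
    int rc;

    would_deadlock = 0;
    rc = pthread_mutex_lock(&rq_lock);
    assert(rc == 0);

    deferred_ctx = defer;
    sketch_printk("WARNING: in __schedule()\n");
    deferred_ctx = 0;

    pthread_mutex_unlock(&rq_lock);
    flush_deferred();           /* safe now: rq_lock is no longer held */
    return would_deadlock;
}
```

Calling sketch_schedule(0) takes the direct path and reports the re-entry on
rq_lock, while sketch_schedule(1) parks the message and only prints it after
the lock is released.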

It looks like this was introduced by commit
eaa434defaca1781fb2932c685289b610aeb8b4b
("drm/fb-helper: Add fb_deferred_io support"):

    This adds deferred io support to drm_fb_helper.
    The fbdev framebuffer changes are flushed using the callback
    (struct drm_framebuffer *)->funcs->dirty() by a dedicated worker
    ensuring that it always runs in process context.
   
    For those wondering why we need to be able to handle atomic calling
    contexts: Both panic paths and cursor code and fbcon blanking can run
    from atomic. See
   
    commit bcb39af4486be07e896fc374a2336bad3104ae0a
    Author: Dave Airlie <airlied at redhat.com>
    Date:   Thu Feb 7 11:19:15 2013 +1000
   
        drm/udl: make usage as a console safer
   
    for where this was originally discovered.
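
The deferral described in that commit message can be sketched roughly as
follows. This is a user-space illustration only, under stated assumptions:
struct clip, mark_dirty(), worker_flush() and the work_pending flag are
hypothetical names, not the drm_fb_helper API. The atomic-side function only
does bookkeeping and marks the work as pending; in the kernel, that "mark
pending" step is where queue_work_on() ends up in try_to_wake_up(), which is
the call that needs the rq lock in the trace above.

```c
#include <limits.h>

/* Hypothetical dirty rectangle: [x1, x2) by [y1, y2). */
struct clip { int x1, y1, x2, y2; };

static struct clip dirty = { INT_MAX, INT_MAX, 0, 0 };
static int work_pending;   /* stands in for a queued work item */
static int flush_count;    /* how many times the worker has flushed */

static int min_i(int a, int b) { return a < b ? a : b; }
static int max_i(int a, int b) { return a > b ? a : b; }

/* Callable from "atomic" context: merges the region, never blocks. */
static void mark_dirty(int x, int y, int w, int h)
{
    dirty.x1 = min_i(dirty.x1, x);
    dirty.y1 = min_i(dirty.y1, y);
    dirty.x2 = max_i(dirty.x2, x + w);
    dirty.y2 = max_i(dirty.y2, y + h);
    work_pending = 1;      /* in the kernel this would queue the work */
}

/* Runs later in "process" context: takes the merged region and resets it. */
static struct clip worker_flush(void)
{
    struct clip c = dirty;

    dirty = (struct clip){ INT_MAX, INT_MAX, 0, 0 };
    work_pending = 0;
    flush_count++;
    return c;
}
```

Two cursor updates followed by one worker run produce a single flush covering
both regions, which is the point of deferring the work.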



More information about the dri-devel mailing list