[PATCH 2/5] kernel.h: Add non_block_start/end()

Michal Hocko mhocko at kernel.org
Fri Aug 16 08:24:28 UTC 2019

On Thu 15-08-19 15:15:09, Andrew Morton wrote:
> On Thu, 15 Aug 2019 10:44:29 +0200 Michal Hocko <mhocko at kernel.org> wrote:
> > > I continue to struggle with this.  It introduces a new kernel state
> > > "running preemptibly but must not call schedule()".  How does this make
> > > any sense?
> > > 
> > > Perhaps a much, much more detailed description of the oom_reaper
> > > situation would help out.
> >  
> > The primary point here is that there is a demand of non blockable mmu
> > notifiers to be called when the oom reaper tears down the address space.
> > As the oom reaper is the primary guarantee of the oom handling forward
> > progress it cannot be blocked on anything that might depend on blockable
> > memory allocations. These are not really easy to track because they
> > might be indirect - e.g. notifier blocks on a lock which other context
> > holds while allocating memory or waiting for a flusher that needs memory
> > to perform its work. If such a blocking state happens that we can end up
> > in a silent hang with an unusable machine.
> > Now we hope for reasonable implementations of mmu notifiers (strong
> > words I know ;) and this should be relatively simple and effective catch
> > all tool to detect something suspicious is going on.
> > 
> > Does that make the situation more clear?
> Yes, thanks, much.  Maybe a code comment along the lines of
>   This is on behalf of the oom reaper, specifically when it is
>   calling the mmu notifiers.  The problem is that if the notifier were
>   to block on, for example, mutex_lock() and if the process which holds
>   that mutex were to perform a sleeping memory allocation, the oom
>   reaper is now blocked on completion of that memory allocation.

    reaper is now blocked on completion of that memory allocation
    and cannot provide the guarantee of the OOM forward progress.

> btw, do we need task_struct.non_block_count?  Perhaps the oom reaper
> thread could set a new PF_NONBLOCK (which would be more general than
> PF_OOM_REAPER).  If we run out of PF_ flags, use (current == oom_reaper_th).

Well, I do not have a strong opinion here. A simple check for the value
seems to be trivial. There are quite some holes in task_struct to hide
this counter without increasing the size.
Michal Hocko

More information about the dri-devel mailing list