[Intel-gfx] [PATCH] drm/i915: Allocate atomically in execbuf path

Wed Nov 27 07:48:09 CET 2013

On Wed, Nov 27, 2013 at 5:23 AM, Ben Widawsky <ben at bwidawsk.net> wrote:
> On Tue, Nov 26, 2013 at 04:55:50PM -0800, Ben Widawsky wrote:
>> If we end up calling the shrinker, which in turn requires the OOM
>> killer, we may end up waiting indefinitely for a process to die if the
>> OOM killer chooses one that is itself blocked. The case that this
>> prevents occurs in execbuf. The forked variants of gem_evict_everything
>> are a good way to hit it. This is exacerbated by Daniel's recent patch
>> to give OOM precedence to the GEM tests.
>>
>> It's a twisted form of a deadlock.
>>
>> What occurs is the following (assume just 2 procs):
>> 1. proc A gets to execbuf while out of memory, gets struct_mutex.
>> 2. OOM killer comes in and chooses proc B.
>> 3. proc B closes its fds, which requires struct_mutex, and blocks.
>> 4. OOM killer waits for B to die before killing another process (this
>> part is speculative).
>>
>> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
>
> I'd still like to know if I am crazy, but I'm now trying to defer the
> stuff we do on file close without using any allocs. Just an update...

Sounds intriguing, but tbh I don't really have a clue about things.
What about adding the relevant stuck task backtraces to the patch and
submitting this to a wider audience (lkml, mm-devel) as an akpm-probe?
The more botched the patch, the better the probe usually.
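
For anyone following along, the four-step sequence quoted above can be
modeled in userspace. This is only a rough sketch under stated
assumptions, not i915 code: a pthread mutex stands in for struct_mutex,
and the names proc_b_exit() and victim_can_exit() are made up for
illustration. It uses trylock instead of a blocking lock so the model
reports the stall rather than hanging, and it does not model the
speculative step 4 at all.

```c
#include <assert.h>
#include <pthread.h>

/* Userspace stand-in for the driver's struct_mutex (assumption: a plain
 * pthread mutex is close enough to show the ordering problem). */
static pthread_mutex_t struct_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Step 3: proc B, picked by the OOM killer, has to take struct_mutex to
 * close its fds. trylock is used so the model never actually hangs; a
 * real task would sleep on the lock at this point. */
static void *proc_b_exit(void *arg)
{
	int *exited = arg;

	if (pthread_mutex_trylock(&struct_mutex) == 0) {
		/* fds would be closed here */
		pthread_mutex_unlock(&struct_mutex);
		*exited = 1;
	} else {
		*exited = 0;	/* lock is held elsewhere: B cannot die */
	}
	return NULL;
}

/* Steps 1-3: optionally let proc A (the caller) hold struct_mutex, as it
 * would in execbuf, then check whether B is able to finish exiting. */
int victim_can_exit(int a_holds_lock)
{
	pthread_t b;
	int exited = 0;
	int rc;

	if (a_holds_lock)
		pthread_mutex_lock(&struct_mutex);	/* step 1 */

	rc = pthread_create(&b, NULL, proc_b_exit, &exited);	/* step 2 */
	assert(rc == 0);
	(void)rc;
	pthread_join(b, NULL);

	if (a_holds_lock)
		pthread_mutex_unlock(&struct_mutex);

	return exited;
}
```

In this model victim_can_exit(1) returns 0 (B is stuck while A holds the
lock) and victim_can_exit(0) returns 1. Whether the OOM killer then
really waits on B is the part that would need the backtraces.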
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch