[Intel-gfx] [PATCH] drm/i915: Make vm eviction uninterruptible

Mon Apr 7 23:58:34 CEST 2014

On Mon, Apr 07, 2014 at 11:50:06PM +0200, Daniel Vetter wrote:
> On Mon, Apr 07, 2014 at 11:58:28AM -0700, Ben Widawsky wrote:
> > On Mon, Apr 07, 2014 at 01:30:04PM +0100, Chris Wilson wrote:
> > > On Mon, Apr 07, 2014 at 02:15:00PM +0200, Daniel Vetter wrote:
> > > > On Mon, Apr 07, 2014 at 10:42:56AM +0100, Chris Wilson wrote:
> > > > > On Sun, Apr 06, 2014 at 11:35:03AM -0700, Ben Widawsky wrote:
> > > > > > On Sat, Apr 05, 2014 at 07:45:28PM -0700, Ben Widawsky wrote:
> > > > > > > The issue I was seeing appeared to seeing from sigkill. In such a case,
> > > > > > > the process may want to die before the context/work/address space is
> > > > > > > freeable. For example:
> > > > > > > 1. evict_vm called for whatever reason
> > > > > > > 2. wait_seqno because the VMA is still active
> > > > > > 
> > > > > > hmm something isn't right here. Why did I get to wait_seqno if pin_count
> > > > > > was 0? Just FYI, this wasn't hypothetical. I did trace it all the way to
> > > > > > exactly ERESTARTSYS from wait_seqno.
> > > > > > 
> > > > > > By the way, another option in evict would be:
> > > > > > while(ret = (i915_vma_unbind(vma) == -ERESTARTSYS));
> > > > > > WARN_ON(ret);
> > > > > > 
> > > > > > > 3. receive signal break out of wait_seqno
> > > > > > > 4. return to evict_vm and the above WARN
> > > > > > > 
> > > > > > > Our error handling from there just spirals.
> > > > > > > 
> > > > > > > One issue I have with our current code is I'd really like eviction to
> > > > > > > not be able to fail (obviously extreme cases are unavoidable).
> > > > > 
> > > > > This is unrealistic since we must support X which uses sigtimer.
> > > > > 
> > > > > > > Perhaps
> > > > > > > one other solution would be to make sure the context is idled before
> > > > > > > evicting its VM.
> > > > > 
> > > > > Indeed.
> > > > > 
> > > > > Anyway, I do concur that wrapping i915_driver_preclose() with
> > > > > 
> > > > > dev_priv->mm.interruptible = false;
> > > > > 
> > > > > would make us both happy.
> > > > 
> > > > Isn't the backtrace just fallout from the lifetime rules being a bit
> > > > funny? We didn't uninterruptibly stall for any still active bo when the
> > > > drm fd gets closed, why do we suddenly need to do that with ppgtts? Iirc
> > > > requests hold a ref on the context, contexts hold a ref on the ppgtt and
> > > > so the entire thing should only dissipate once it's really idle.
> > > > 
> > > > Imo just doing uninterruptible sleeps tastes way too much like duct-tape.
> > > > I can be convinced of duct-tape if the tradeoffs really strongly suggests
> > > > it's the right thing (e.g. the shrinker lock stealing, even though we've
> > > > paid a hefty price in accidental complexity with that one), but that needs
> > > > some good justification.
> > > 
> > > Yes, it is duct-tape. But it should be duct-tape against future unknown
> > > bugs (and the currently known bugs) in that the i915_driver_preclose()
> > > cannot report failure and so should not allow its callees to fail (which
> > > is more or less the contract given by .interruptible=false).
> > > 
> > > The alternative is to allow preclose() to support an error-code, which
> > > has the issue that very few programs check for errors during close() and
> > > that EINTR from close() is frowned upon by most.
> > > -Chris
> > > 
> > 
> > Do we have consensus? I am good with Chris' idea. I can write and test
> > the patch.
> 
> If we just disable interrupts we'll never see the bug reports again, which
> is imo bad. And the while loop above will loop a bit too long.
> 
> Adding a new special sleep mode which doesn't interrupt but WARNs badly if
> any sleeping actually happens is also a bit tricky, and pretty much
> guaranteed to blow up. I'm still for a "write testcase, wait for fix to
> materialize" approach. Especially for upstream.
> 
> The testcase is especially important so that we can track it as a full
> ppgtt validation criterion. I'll add this tomorrow as a task to the
> relevant JIRA.
> -Daniel

Blocking important fixes for a test case is harmful to customers of our
software. I won't argue past that. If you won't take it as is, add it to the
JIRA task like you said. I'll carry this one around with my dynamic page table
allocations since you essentially can't do any real workloads with full PPGTT
without this (assuming you have signals). I'd venture to even say existing
tests can hit it with full PPGTT turned on.

The following will hit it on the very first iteration:
#!/bin/bash

while [ 1 ] ;  do
        (glxgears) & pid[0]=$!
        (glxgears) & pid[1]=$!
        (glxgears) & pid[2]=$!
        sleep 3
        kill ${pid[*]}
done

-- 
Ben Widawsky, Intel Open Source Technology Center