[Intel-gfx] [PATCH 7/8] drm/i915: Grab execlist spinlock to avoid post-reset concurrency issues.
Daniel Vetter
daniel at ffwll.ch
Tue Oct 13 06:46:00 PDT 2015
On Tue, Oct 13, 2015 at 12:45:58PM +0100, Chris Wilson wrote:
> On Tue, Oct 13, 2015 at 01:46:38PM +0200, Daniel Vetter wrote:
> > On Fri, Oct 09, 2015 at 09:45:16AM +0100, Chris Wilson wrote:
> > > On Fri, Oct 09, 2015 at 10:38:18AM +0200, Daniel Vetter wrote:
> > > > On Thu, Oct 08, 2015 at 07:31:39PM +0100, Tomas Elf wrote:
> > > > > Grab execlist lock when cleaning up execlist queues after GPU reset to avoid
> > > > > concurrency problems between the context event interrupt handler and the reset
> > > > > path immediately following a GPU reset.
> > > > >
> > > > > Signed-off-by: Tomas Elf <tomas.elf at intel.com>
> > > >
> > > > Should we instead just stop any irqs from the GT completely while we do
> > > > the reset (plus a synchronize_irq)? It's a bit heavy-weight, but probably
> > > > safer. Well not the entire GT, but all the rings under reset (as prep for
> > > > per-ring reset).
> > >
> > > Bah, stopping IRQs is not enough for error state capture though since
> > > requests complete asynchronously just by polling a memory address. (If
> > > that is the goal here, this patch just makes execlist_queue access
> > > consistent and should only be run once the GPU has been reset and so is
> > > categorically idle.)
> >
> > This is the execlist ELSP tracking, which is exclusively driven by the
> > CTX_SWITCH interrupt signal from each engine.
> >
> > At least that's been my assumption, and under that assumption I think
> > stalling interrupts should be good enough.
>
> No, because the requests and vma are not coupled to the interrupt in
> terms of when they can disappear.

At least today, execlists keep their own reference on each request until
the CTX_SWITCH processing is done, precisely so that requests cannot
disappear underneath it. And even once we have that fixed up, I think we
still need to exclude this processing somehow, and the irqsave spinlock
seems OK for that. Disabling the interrupt itself plus synchronize_irq()
was really just an idea.
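
To make the exclusion concrete, here is a rough userspace sketch, not the
actual i915 code. Every name in it (sketch_ring, sketch_ring_submit and so
on) is made up for illustration, and a pthread mutex stands in for the
execlist spinlock. In the kernel the lock would be taken with
spin_lock_irqsave(), since the CTX_SWITCH handler runs in interrupt
context. The point is only that the handler and the post-reset cleanup
take the same lock, so neither sees a half-updated queue.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define SKETCH_QUEUE_CAP 8

/* Hypothetical stand-in for one ring's execlist queue; not the i915 struct. */
struct sketch_ring {
	pthread_mutex_t lock;          /* plays the role of the execlist lock */
	int queue[SKETCH_QUEUE_CAP];   /* pending request ids, assumed >= 0 */
	size_t count;
};

void sketch_ring_init(struct sketch_ring *ring)
{
	assert(ring != NULL);
	pthread_mutex_init(&ring->lock, NULL);
	ring->count = 0;
}

/* Queue a request id. Returns 0 on success, -1 if the queue is full. */
int sketch_ring_submit(struct sketch_ring *ring, int id)
{
	int ret = -1;

	pthread_mutex_lock(&ring->lock);
	if (ring->count < SKETCH_QUEUE_CAP) {
		ring->queue[ring->count++] = id;
		ret = 0;
	}
	pthread_mutex_unlock(&ring->lock);
	return ret;
}

/*
 * Models the context event handler: retire the head of the queue under
 * the lock. Returns the retired id, or -1 if the queue was empty.
 */
int sketch_ring_handle_ctx_switch(struct sketch_ring *ring)
{
	int id = -1;
	size_t i;

	pthread_mutex_lock(&ring->lock);
	if (ring->count > 0) {
		id = ring->queue[0];
		for (i = 1; i < ring->count; i++)
			ring->queue[i - 1] = ring->queue[i];
		ring->count--;
	}
	pthread_mutex_unlock(&ring->lock);
	return id;
}

/*
 * Models the post-reset cleanup: drop everything still queued, under the
 * same lock the handler takes. Returns the number of entries dropped.
 */
size_t sketch_ring_reset_cleanup(struct sketch_ring *ring)
{
	size_t dropped;

	pthread_mutex_lock(&ring->lock);
	dropped = ring->count;
	ring->count = 0;
	pthread_mutex_unlock(&ring->lock);
	return dropped;
}
```

A handler call that races with the cleanup either retires its entry before
the queue is emptied or finds the queue empty afterwards; it never walks a
queue that is being torn down.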
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch