[Intel-gfx] [PATCH] drm/i915: Rework GPU reset sequence to match driver load & thaw
Daniel Vetter
daniel at ffwll.ch
Tue Jul 29 12:32:42 CEST 2014
On Tue, Jul 29, 2014 at 08:36:33AM +0100, Chris Wilson wrote:
> On Mon, Jul 28, 2014 at 11:26:38AM +0200, Daniel Vetter wrote:
> > Oh, I guess that's the tricky bit why the old approach never worked -
> > because reset_in_progress is set we failed the context/ppgtt loading
> > through the rings and screwed up.
> >
> > Problem with your approach is that we want to bail out here if a reset is
> > in progress, so we can't just eat the EAGAIN. If we do that we potentially
> > deadlock or overflow the ring.
> >
> > I think we need a different hack here, and a few layers down (i.e. at the
> > place where we actually generate that offending -EAGAIN).
> >
> > - Around the re-init sequence in the reset function we set
> > dev_priv->mm.reload_in_reset or similar. Since we hold dev->struct_mutex
> > no one will see that, as long as we never leak it out of the critical
> > section.
> >
> > - In the ring_begin code that checks for gpu hangs we ignore
> > reset_in_progress if this bit is set.
> >
> > - Both places need fairly big comments to explain what exactly is going
> > on.
>
> This is going from bad to worse. I think you can do better if you looked
> at the problem afresh.
Well, we can't really clear reset_in_progress at that point, since the
reset isn't fully done yet, especially the modeset stuff. So I don't think
that reordering the reset sequence would get us out of this ugly spot, and
I don't really see any other solution. Do you?
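
A rough sketch of the reload_in_reset idea quoted above, written as
standalone C rather than real i915 code. The struct, the fake_* function
names and the struct_mutex_held flag are made up for illustration; only
reset_in_progress, reload_in_reset and the -EAGAIN bail-out come from the
discussion.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant bits of the driver private data. */
struct fake_i915 {
	bool struct_mutex_held;  /* models holding dev->struct_mutex */
	bool reset_in_progress;  /* set while a GPU reset is being handled */
	bool reload_in_reset;    /* must never be visible outside the lock */
};

/* Models the gpu hang check done in the ring_begin code. */
static int fake_ring_begin(const struct fake_i915 *i915)
{
	/*
	 * While a reset is pending, normal callers must bail out with
	 * -EAGAIN so they back off instead of deadlocking or overflowing
	 * the ring. The one exception is the re-init done by the reset
	 * function itself, which is flagged by reload_in_reset.
	 */
	if (i915->reset_in_progress && !i915->reload_in_reset)
		return -EAGAIN;

	return 0;
}

/* Models the re-init sequence inside the reset function. */
static int fake_reset_reinit(struct fake_i915 *i915)
{
	int ret;

	/* The flag is only safe because the caller holds the mutex. */
	assert(i915->struct_mutex_held);

	i915->reload_in_reset = true;
	/* Stands in for the context/ppgtt loading through the rings. */
	ret = fake_ring_begin(i915);
	/* Clear it before leaving the critical section so it never leaks. */
	i915->reload_in_reset = false;

	return ret;
}
```

With reset_in_progress set, fake_ring_begin() alone still returns -EAGAIN,
while the same check reached through fake_reset_reinit() goes through.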
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch