[Intel-gfx] [PATCH 3/5] drm/i915: Make wait_for_flips interruptible.

Tue Oct 20 06:07:51 PDT 2015

On Tue, 2015-10-20 at 10:10 +0200, Daniel Vetter wrote:
> On Tue, Oct 20, 2015 at 10:38:41AM +0300, Ander Conselvan De Oliveira wrote:
> > On Mon, 2015-10-19 at 15:30 +0200, Daniel Vetter wrote:
> > > On Mon, Oct 19, 2015 at 04:16:53PM +0300, Ander Conselvan De Oliveira
> > > wrote:
> > > > On Wed, 2015-09-23 at 13:27 +0200, Maarten Lankhorst wrote:
> > > > > @@ -13306,6 +13299,29 @@ intel_prepare_plane_fb(struct drm_plane
> > > > > *plane,
> > > > >  	if (ret)
> > > > >  		return ret;
> > > > >  
> > > > > +	if (old_obj) {
> > > > > +		struct drm_crtc_state *crtc_state =
> > > > > +			drm_atomic_get_existing_crtc_state(new_state
> > > > > ->state,
> > > > > plane->state->crtc);
> > > > > +
> > > > > +		/* Big Hammer, we also need to ensure that any
> > > > > pending
> > > > > +		 * MI_WAIT_FOR_EVENT inside a user batch buffer on
> > > > > the
> > > > > +		 * current scanout is retired before unpinning the
> > > > > old
> > > > > +		 * framebuffer. Note that we rely on userspace
> > > > > rendering
> > > > > +		 * into the buffer attached to the pipe they are
> > > > > waiting
> > > > > +		 * on. If not, userspace generates a GPU hang with
> > > > > IPEHR
> > > > > +		 * point to the MI_WAIT_FOR_EVENT.
> > > > > +		 *
> > > > > +		 * This should only fail upon a hung GPU, in which
> > > > > case
> > > > > we
> > > > > +		 * can safely continue.
> > > > > +		 */
> > > > > +		if (needs_modeset(crtc_state))
> > > > > +			ret = i915_gem_object_wait_rendering(old_obj,
> > > > > true);
> > > > > +
> > > > > +		/* Swallow -EIO errors to allow updates during hw
> > > > > lockup.
> > > > > */
> > > > > +		if (ret && ret != -EIO)
> > > > > +			goto out;
> > > > 
> > > > Doesn't this change the behavior of a modeset after a GPU hang? Since
> > > > mm.interruptible is true, i915_gem_check_wedge() might return -EAGAIN
> > > > instead of
> > > > -EIO. Previously the modeset would continue in that scenario, but now,
> > > > somewhat
> > > > contrary to the comment above, we don't continue and instead pass the 
> > > > -EAGAIN to
> > > > user space.
> > > 
> > > It's "while the gpu hang is pending" not "after", but this change is the
> > > hole point of making pinning interruptible. With current modeset code the
> > > only thing we could hope for is that the reset would go through, and
> > > otherwise we'd have to fail the modeset. Now we can correctly retry the
> > > operation if it has run into a concurrent gpu hang/reset.
> > 
> > So in that case should user space retry the modeset? I don't think it does
> > that
> > at moment, at least weston and xf86-video-intel don't. I'm not sure how big
> > of a
> > deal that is, though, since it is an unlikely corner case.
> 
> Every drm ioctl goes through drmIoctl in libdrm, which does this
> restarting. If not then that's a userspace bug, since this is a core bit
> of the drm ABI design.

Ah, true, completely forgot about that.

With the new 2.9 patch, I don't see any other issues with this. I'm happy to
give my r-b, but I think someone else that knows this stuff a bit better should
at least ack the patch.

Ander