[Intel-gfx] [PATCH 1/4] drm/i915: relative_constants_mode race fix

Ben Widawsky ben at bwidawsk.net
Thu Oct 13 19:25:51 CEST 2011


On Thu, Oct 13, 2011 at 10:35:23AM +0200, Daniel Vetter wrote:
> On Wed, Oct 12, 2011 at 03:56:19PM -0700, Ben Widawsky wrote:
> > After my refactoring, Chris noticed that we had a bug.
> > 
> > dev_priv keeps track of the current addressing mode that gets set at
> > execbuffer time. Unfortunately the existing code was doing this before
> > acquiring struct_mutex which leaves a race with another thread also
> > doing an execbuffer. If that wasn't bad enough, relocate_slow drops
> > struct_mutex which opens a much more likely error where another thread
> > comes in and modifies the state while relocate_slow is being slow.
> > 
> > The solution here is to just defer setting this state until we
> > absolutely need it, and we know we'll have struct_mutex for the
> > remainder of our code path.
> > 
> > Cc: Chris Wilson <chris at chris-wilson.co.uk>
> > Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c |   67 ++++++++++++++--------------
> >  1 files changed, 34 insertions(+), 33 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 3693e83..0b343af 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -1003,39 +1003,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  		return -EINVAL;
> >  	}
> >  
> > -	mode = args->flags & I915_EXEC_CONSTANTS_MASK;
> > -	switch (mode) {
> > -	case I915_EXEC_CONSTANTS_REL_GENERAL:
> > -	case I915_EXEC_CONSTANTS_ABSOLUTE:
> > -	case I915_EXEC_CONSTANTS_REL_SURFACE:
> > -		if (ring == &dev_priv->ring[RCS] &&
> > -		    mode != dev_priv->relative_constants_mode) {
> > -			if (INTEL_INFO(dev)->gen < 4)
> > -				return -EINVAL;
> > -
> > -			if (INTEL_INFO(dev)->gen > 5 &&
> > -			    mode == I915_EXEC_CONSTANTS_REL_SURFACE)
> > -				return -EINVAL;
> > -
> > -			ret = intel_ring_begin(ring, 4);
> > -			if (ret)
> > -				return ret;
> > -
> > -			intel_ring_emit(ring, MI_NOOP);
> > -			intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> > -			intel_ring_emit(ring, INSTPM);
> > -			intel_ring_emit(ring,
> > -					I915_EXEC_CONSTANTS_MASK << 16 | mode);
> > -			intel_ring_advance(ring);
> > -
> > -			dev_priv->relative_constants_mode = mode;
> > -		}
> > -		break;
> > -	default:
> > -		DRM_ERROR("execbuf with unknown constants: %d\n", mode);
> > -		return -EINVAL;
> > -	}
> > -
> >  	if (args->buffer_count < 1) {
> >  		DRM_ERROR("execbuf with %d buffers\n", args->buffer_count);
> >  		return -EINVAL;
> > @@ -1132,6 +1099,40 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  			goto err;
> >  	}
> >  
> > +	mode = args->flags & I915_EXEC_CONSTANTS_MASK;
> > +	switch (mode) {
> > +	case I915_EXEC_CONSTANTS_REL_GENERAL:
> > +	case I915_EXEC_CONSTANTS_ABSOLUTE:
> > +	case I915_EXEC_CONSTANTS_REL_SURFACE:
> > +		if (ring == &dev_priv->ring[RCS] &&
> > +		    mode != dev_priv->relative_constants_mode) {
> > +			if (INTEL_INFO(dev)->gen < 4)
> > +				return -EINVAL;
> > +
> > +			if (INTEL_INFO(dev)->gen > 5 &&
> > +			    mode == I915_EXEC_CONSTANTS_REL_SURFACE)
> > +				return -EINVAL;
> > +
> > +			ret = intel_ring_begin(ring, 4);
> > +			if (ret)
> > +				goto err;
> > +
> > +			intel_ring_emit(ring, MI_NOOP);
> > +			intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> > +			intel_ring_emit(ring, INSTPM);
> > +			intel_ring_emit(ring,
> > +					I915_EXEC_CONSTANTS_MASK << 16 | mode);
> > +			intel_ring_advance(ring);
> > +
> > +			dev_priv->relative_constants_mode = mode;
> > +		}
> > +		break;
> > +	default:
> > +		DRM_ERROR("execbuf with unknown constants: %d\n", mode);
> > +		ret -EINVAL;
> 
> This probably wont compile ;-) I'd also move this block even further down
> after gem_execbuffer_move_to_gpu. It doesn't matter for correctness, but
> now it's sitting in the middle of the relocation and cache domain
> handling, which doesn't make sense.
> 

Crap, I must have missed the compile test. I did have this further down
originally and forgot why I moved it up here - I think it was something
like, I wanted to set the proper state before actually relocated any
objects in case we decide in the future that some addressing mode
requires relocation restrictions. (So may as well fail early if we don't
set the addressing mode).

I'll move it unless anyone else chimes in.

Ben



More information about the Intel-gfx mailing list