[Intel-gfx] [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
Daniel Vetter
daniel at ffwll.ch
Mon Jun 22 13:12:33 PDT 2015
On Fri, Jun 19, 2015 at 05:34:12PM +0100, John.C.Harrison at Intel.com wrote:
> From: John Harrison <John.C.Harrison at Intel.com>
>
> It is a bad idea for i915_add_request() to fail. The work will already have been
> sent to the ring and will be processed, but there will not be any tracking or
> management of that work.
>
> The only way the add request call can fail is if it can't write its epilogue
> commands to the ring (cache flushing, seqno updates, interrupt signalling). The
> reasons for that are mostly down to running out of ring buffer space and the
> problems associated with trying to get some more. This patch prevents that
> situation from happening in the first place.
>
> When a request is created, it marks sufficient space as reserved for the
> epilogue commands, thus guaranteeing that by the time the epilogue is written,
> there will be plenty of space for it. Note that a ring_begin() call is required
> to actually reserve the space (and do any potential waiting). However, that is
> not currently done at request creation time. This is because the ring_begin()
> code can allocate a request. Hence calling begin() from the request allocation
> code would lead to infinite recursion! Later patches in this series remove the
> need for begin() to do the allocate. At that point, it becomes safe for the
> allocate to call begin() and really reserve the space.
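[To illustrate the reserve/use/end lifecycle described above, here is a minimal
userspace sketch, not the kernel code; the struct fields and helper names
merely mirror the patch's intel_ring_reserved_space_*() helpers:]

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the reservation lifecycle: a request reserves epilogue
 * space at creation, add_request "uses" it, and the end-check verifies
 * the reserve was large enough for what was actually emitted. */
struct ringbuf {
	int size, tail, space;
	int reserved_size, reserved_tail;
	bool reserved_in_use;
};

static void reserved_space_reserve(struct ringbuf *rb, int size)
{
	assert(!rb->reserved_in_use);
	rb->reserved_size = size;
}

static void reserved_space_use(struct ringbuf *rb)
{
	assert(!rb->reserved_in_use);
	rb->reserved_in_use = true;
	rb->reserved_tail = rb->tail;	/* remember where the epilogue starts */
}

/* Returns true if the reserve covered everything emitted since use(). */
static bool reserved_space_end(struct ringbuf *rb)
{
	bool ok = true;

	assert(rb->reserved_in_use);
	if (rb->tail > rb->reserved_tail)
		ok = (rb->tail - rb->reserved_tail) <= rb->reserved_size;
	/* else: the ring wrapped while the reserve was in use, so an
	 * unknown amount of tail was no-op filled; skip the check. */
	rb->reserved_size = 0;
	rb->reserved_in_use = false;
	return ok;
}

static void emit(struct ringbuf *rb, int words)
{
	rb->tail = (rb->tail + words) % rb->size;
	rb->space -= words;
}
```

[Usage follows the patch's flow: reserve at request-alloc time, emit the
request's own commands, then use()/end() bracket the add-request epilogue.]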
>
> Until then, there is a potential for insufficient space to be available at the
> point of calling i915_add_request(). However, that would only be in the case
> where the request was created and immediately submitted without ever calling
> ring_begin() and adding any work to that request, which should never happen.
> And even if it does, and that request happens to fall into the tiny window of
> opportunity for failing due to being out of ring space, does it really
> matter, given that the request wasn't doing anything in the first place?
>
> v2: Updated the 'reserved space too small' warning to include the offending
> sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
> re-initialisation of tracking state after a buffer wrap to keep the sanity
> checks accurate.
>
> v3: Incremented the reserved size to accommodate Ironlake (after finally
> managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
>
> v4: Added extra comment and removed duplicate WARN (feedback from Tomas).
>
> v5: Re-write of wrap handling to prevent unnecessary early wraps (feedback from
> Daniel Vetter).
This didn't actually implement what I suggested (wrapping is the worst
case, hence skipping the check for that is breaking the sanity check) and
so changed the patch from "correct, but a bit fragile" to broken. I've
merged the previous version instead.
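[For reference, the wait-size calculation at issue can be modelled as below; a
standalone sketch mirroring the patch's __intel_ring_prepare() logic under an
assumed I915_RING_FREE_SPACE value, not the merged code:]

```c
#include <assert.h>

#define I915_RING_FREE_SPACE 64	/* assumed constant for illustration */

/* How many bytes of free space to wait for before emitting a request of
 * `bytes`, with `reserved_size` held back for the add-request epilogue.
 * If the request would cross effective_size, this revision waits for the
 * whole remaining tail plus the reserve up front, so that the subsequent
 * wrap does not itself need to wait. */
static int wait_bytes(int tail, int bytes, int reserved_size,
		      int size, int effective_size)
{
	int max_bytes = bytes + reserved_size;

	if (tail + max_bytes > effective_size)
		max_bytes = reserved_size + I915_RING_FREE_SPACE +
			    size - tail;

	return max_bytes;
}
```

[The wrap branch is exactly the case the review above flags as the worst one,
since waiting there is what the reservation was meant to rule out.]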
-Daniel
>
> For: VIZ-5115
> CC: Tomas Elf <tomas.elf at intel.com>
> CC: Daniel Vetter <daniel at ffwll.ch>
> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 1 +
> drivers/gpu/drm/i915/i915_gem.c | 37 ++++++++++++
> drivers/gpu/drm/i915/intel_lrc.c | 35 +++++++++--
> drivers/gpu/drm/i915/intel_ringbuffer.c | 98 +++++++++++++++++++++++++++++--
> drivers/gpu/drm/i915/intel_ringbuffer.h | 25 ++++++++
> 5 files changed, 186 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 0347eb9..eba1857 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2187,6 +2187,7 @@ struct drm_i915_gem_request {
>
> int i915_gem_request_alloc(struct intel_engine_cs *ring,
> struct intel_context *ctx);
> +void i915_gem_request_cancel(struct drm_i915_gem_request *req);
> void i915_gem_request_free(struct kref *req_ref);
>
> static inline uint32_t
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 81f3512..85fa27b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2485,6 +2485,13 @@ int __i915_add_request(struct intel_engine_cs *ring,
> } else
> ringbuf = ring->buffer;
>
> + /*
> + * To ensure that this call will not fail, space for its emissions
> + * should already have been reserved in the ring buffer. Let the ring
> + * know that it is time to use that space up.
> + */
> + intel_ring_reserved_space_use(ringbuf);
> +
> request_start = intel_ring_get_tail(ringbuf);
> /*
> * Emit any outstanding flushes - execbuf can fail to emit the flush
> @@ -2567,6 +2574,9 @@ int __i915_add_request(struct intel_engine_cs *ring,
> round_jiffies_up_relative(HZ));
> intel_mark_busy(dev_priv->dev);
>
> + /* Sanity check that the reserved size was large enough. */
> + intel_ring_reserved_space_end(ringbuf);
> +
> return 0;
> }
>
> @@ -2666,6 +2676,26 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
> if (ret)
> goto err;
>
> + /*
> + * Reserve space in the ring buffer for all the commands required to
> + * eventually emit this request. This is to guarantee that the
> + * i915_add_request() call can't fail. Note that the reserve may need
> + * to be redone if the request is not actually submitted straight
> + * away, e.g. because a GPU scheduler has deferred it.
> + *
> + * Note further that this call merely notes the reserve request. A
> + * subsequent call to *_ring_begin() is required to actually ensure
> + * that the reservation is available. Without the begin, if the
> + * request creator immediately submitted the request without adding
> + * any commands to it then there might not actually be sufficient
> + * room for the submission commands. Unfortunately, the current
> + * *_ring_begin() implementations potentially call back here to
> + * i915_gem_request_alloc(). Thus calling _begin() here would lead to
> + * infinite recursion! Until that back call path is removed, it is
> + * necessary to do a manual _begin() outside.
> + */
> + intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
> +
> ring->outstanding_lazy_request = req;
> return 0;
>
> @@ -2674,6 +2704,13 @@ err:
> return ret;
> }
>
> +void i915_gem_request_cancel(struct drm_i915_gem_request *req)
> +{
> + intel_ring_reserved_space_cancel(req->ringbuf);
> +
> + i915_gem_request_unreference(req);
> +}
> +
> struct drm_i915_gem_request *
> i915_gem_find_active_request(struct intel_engine_cs *ring)
> {
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 6a5ed07..bd62bd6 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -690,6 +690,9 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
> if (intel_ring_space(ringbuf) >= bytes)
> return 0;
>
> + /* The whole point of reserving space is to not wait! */
> + WARN_ON(ringbuf->reserved_in_use);
> +
> list_for_each_entry(request, &ring->request_list, list) {
> /*
> * The request queue is per-engine, so can contain requests
> @@ -748,8 +751,12 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
> int rem = ringbuf->size - ringbuf->tail;
>
> if (ringbuf->space < rem) {
> - int ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
> + int ret;
> +
> + /* Can't wait if space has already been reserved! */
> + WARN_ON(ringbuf->reserved_in_use);
>
> + ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
> if (ret)
> return ret;
> }
> @@ -768,7 +775,7 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
> static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
> struct intel_context *ctx, int bytes)
> {
> - int ret;
> + int ret, max_bytes;
>
> if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
> ret = logical_ring_wrap_buffer(ringbuf, ctx);
> @@ -776,8 +783,28 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
> return ret;
> }
>
> - if (unlikely(ringbuf->space < bytes)) {
> - ret = logical_ring_wait_for_space(ringbuf, ctx, bytes);
> + /*
> + * Add on the reserved size to the request to make sure that after
> + * the intended commands have been emitted, there is guaranteed to
> + * still be enough free space to send them to the hardware.
> + */
> + max_bytes = bytes + ringbuf->reserved_size;
> +
> + if (unlikely(ringbuf->space < max_bytes)) {
> + /*
> + * Bytes is guaranteed to fit within the tail of the buffer,
> + * but the reserved space may push it off the end. If so then
> + * need to wait for the whole of the tail plus the reserved
> + * size. That should guarantee that the actual request
> + * (bytes) will fit between here and the end and the reserved
> + * usage will fit either in the same or at the start. Either
> + * way, if a wrap occurs it will not involve a wait and thus
> + * cannot fail.
> + */
> + if (unlikely(ringbuf->tail + max_bytes + I915_RING_FREE_SPACE > ringbuf->effective_size))
> + max_bytes = ringbuf->reserved_size + I915_RING_FREE_SPACE + ringbuf->size - ringbuf->tail;
> +
> + ret = logical_ring_wait_for_space(ringbuf, ctx, max_bytes);
> if (unlikely(ret))
> return ret;
> }
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index d934f85..1c125e9 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2106,6 +2106,9 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
> if (intel_ring_space(ringbuf) >= n)
> return 0;
>
> + /* The whole point of reserving space is to not wait! */
> + WARN_ON(ringbuf->reserved_in_use);
> +
> list_for_each_entry(request, &ring->request_list, list) {
> space = __intel_ring_space(request->postfix, ringbuf->tail,
> ringbuf->size);
> @@ -2131,7 +2134,12 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
> int rem = ringbuf->size - ringbuf->tail;
>
> if (ringbuf->space < rem) {
> - int ret = ring_wait_for_space(ring, rem);
> + int ret;
> +
> + /* Can't wait if space has already been reserved! */
> + WARN_ON(ringbuf->reserved_in_use);
> +
> + ret = ring_wait_for_space(ring, rem);
> if (ret)
> return ret;
> }
> @@ -2180,11 +2188,69 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
> return 0;
> }
>
> -static int __intel_ring_prepare(struct intel_engine_cs *ring,
> - int bytes)
> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
> +{
> + /* NB: Until request management is fully tidied up and the OLR is
> + * removed, there are too many ways to get false hits on this
> + * anti-recursion check! */
> + /*WARN_ON(ringbuf->reserved_size);*/
> + WARN_ON(ringbuf->reserved_in_use);
> +
> + ringbuf->reserved_size = size;
> +
> + /*
> + * Really need to call _begin() here but that currently leads to
> + * recursion problems! This will be fixed later but for now just
> + * return and hope for the best. Note that there is only a real
> + * problem if the creator of the request never actually calls _begin()
> + * but if they are not submitting any work then why did they create
> + * the request in the first place?
> + */
> +}
> +
> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
> +{
> + WARN_ON(ringbuf->reserved_in_use);
> +
> + ringbuf->reserved_size = 0;
> + ringbuf->reserved_in_use = false;
> +}
> +
> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
> +{
> + WARN_ON(ringbuf->reserved_in_use);
> +
> + ringbuf->reserved_in_use = true;
> + ringbuf->reserved_tail = ringbuf->tail;
> +}
> +
> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
> +{
> + WARN_ON(!ringbuf->reserved_in_use);
> + if (ringbuf->tail > ringbuf->reserved_tail) {
> + WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
> + "request reserved size too small: %d vs %d!\n",
> + ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
> + } else {
> + /*
> + * The ring was wrapped while the reserved space was in use.
> + * That means that some unknown amount of the ring tail was
> + * no-op filled and skipped. Thus simply adding the ring size
> + * to the tail and doing the above space check will not work.
> + * Rather than attempt to track how much tail was skipped,
> + * it is much simpler to say that also skipping the sanity
> + * check every once in a while is not a big issue.
> + */
> + }
> +
> + ringbuf->reserved_size = 0;
> + ringbuf->reserved_in_use = false;
> +}
> +
> +static int __intel_ring_prepare(struct intel_engine_cs *ring, int bytes)
> {
> struct intel_ringbuffer *ringbuf = ring->buffer;
> - int ret;
> + int ret, max_bytes;
>
> if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
> ret = intel_wrap_ring_buffer(ring);
> @@ -2192,8 +2258,28 @@ static int __intel_ring_prepare(struct intel_engine_cs *ring,
> return ret;
> }
>
> - if (unlikely(ringbuf->space < bytes)) {
> - ret = ring_wait_for_space(ring, bytes);
> + /*
> + * Add on the reserved size to the request to make sure that after
> + * the intended commands have been emitted, there is guaranteed to
> + * still be enough free space to send them to the hardware.
> + */
> + max_bytes = bytes + ringbuf->reserved_size;
> +
> + if (unlikely(ringbuf->space < max_bytes)) {
> + /*
> + * Bytes is guaranteed to fit within the tail of the buffer,
> + * but the reserved space may push it off the end. If so then
> + * need to wait for the whole of the tail plus the reserved
> + * size. That should guarantee that the actual request
> + * (bytes) will fit between here and the end and the reserved
> + * usage will fit either in the same or at the start. Either
> + * way, if a wrap occurs it will not involve a wait and thus
> + * cannot fail.
> + */
> + if (unlikely(ringbuf->tail + max_bytes > ringbuf->effective_size))
> + max_bytes = ringbuf->reserved_size + I915_RING_FREE_SPACE + ringbuf->size - ringbuf->tail;
> +
> + ret = ring_wait_for_space(ring, max_bytes);
> if (unlikely(ret))
> return ret;
> }
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 39f6dfc..bf2ac28 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -105,6 +105,9 @@ struct intel_ringbuffer {
> int space;
> int size;
> int effective_size;
> + int reserved_size;
> + int reserved_tail;
> + bool reserved_in_use;
>
> /** We track the position of the requests in the ring buffer, and
> * when each is retired we increment last_retired_head as the GPU
> @@ -450,4 +453,26 @@ intel_ring_get_request(struct intel_engine_cs *ring)
> return ring->outstanding_lazy_request;
> }
>
> +/*
> + * Arbitrary size for largest possible 'add request' sequence. The code paths
> + * are complex and variable. Empirical measurement shows that the worst case
> + * is ILK at 136 words. Reserving too much is better than reserving too little
> + * as that allows for corner cases that might have been missed. So the figure
> + * has been rounded up to 160 words.
> + */
> +#define MIN_SPACE_FOR_ADD_REQUEST 160
> +
> +/*
> + * Reserve space in the ring to guarantee that the i915_add_request() call
> + * will always have sufficient room to do its stuff. The request creation
> + * code calls this automatically.
> + */
> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
> +/* Cancel the reservation, e.g. because the request is being discarded. */
> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
> +/* Use the reserved space - for use by i915_add_request() only. */
> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
> +/* Finish with the reserved space - for use by i915_add_request() only. */
> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
> +
> #endif /* _INTEL_RINGBUFFER_H_ */
> --
> 1.7.9.5
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch