[Intel-gfx] [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
John Harrison
John.C.Harrison at Intel.com
Thu Jun 18 03:43:22 PDT 2015
On 17/06/2015 15:04, Daniel Vetter wrote:
> On Thu, Jun 04, 2015 at 01:06:34PM +0100, John.C.Harrison at Intel.com wrote:
>> From: John Harrison <John.C.Harrison at Intel.com>
>>
>> It is a bad idea for i915_add_request() to fail. The work will already have been
>> sent to the ring and will be processed, but there will not be any tracking or
>> management of that work.
>>
>> The only way the add request call can fail is if it can't write its epilogue
>> commands to the ring (cache flushing, seqno updates, interrupt signalling). The
>> reasons for that are mostly down to running out of ring buffer space and the
>> problems associated with trying to get some more. This patch prevents that
>> situation from happening in the first place.
>>
>> When a request is created, it marks sufficient space as reserved for the
>> epilogue commands, thus guaranteeing that by the time the epilogue is written,
>> there will be plenty of space for it. Note that a ring_begin() call is required
>> to actually reserve the space (and do any potential waiting). However, that is
>> not currently done at request creation time. This is because the ring_begin()
>> code can allocate a request. Hence calling begin() from the request allocation
>> code would lead to infinite recursion! Later patches in this series remove the
>> need for begin() to do the allocate. At that point, it becomes safe for the
>> allocate to call begin() and really reserve the space.
>>
>> Until then, there is a potential for insufficient space to be available at the
>> point of calling i915_add_request(). However, that would only be in the case
>> where the request was created and immediately submitted without ever calling
>> ring_begin() and adding any work to that request, which should never happen.
>> Even if it does, and that request happens to hit the tiny window of opportunity
>> for failing due to being out of ring space, it hardly matters because the
>> request was not doing anything in the first place.
>>
>> v2: Updated the 'reserved space too small' warning to include the offending
>> sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
>> re-initialisation of tracking state after a buffer wrap to keep the sanity
>> checks accurate.
>>
>> v3: Incremented the reserved size to accommodate Ironlake (after finally
>> managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
>>
>> For: VIZ-5115
>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
> From the last review round there's still my question wrt the correctness
> of the reservation overflow vs. wrapping outstanding:
>
> http://article.gmane.org/gmane.comp.freedesktop.xorg.drivers.intel/56575
v2 - 'Added re-initialisation of tracking state after a buffer wrap to
keep the sanity checks accurate'. Does that not address your issue with
wrapping?
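
To make the intended flow easier to follow, below is a simplified,
self-contained sketch of the reservation lifecycle. The struct and function
names mirror the patch, but this is only a model for discussion, not the
driver code:

/*
 * Simplified model of the reservation lifecycle (not the driver code).
 * The names mirror the patch; everything else is a stand-in so the whole
 * flow can be read in one place.
 */
#include <assert.h>
#include <stdbool.h>

struct ringbuf {
	int tail;		/* next write position, in bytes */
	int reserved_size;	/* bytes set aside for add_request */
	int reserved_tail;	/* tail when the reserved space was claimed */
	bool reserved_in_use;
};

/* Request creation: note the reservation. The real code still relies on a
 * later ring_begin() to actually wait for the space to become available. */
void reserved_space_reserve(struct ringbuf *rb, int size)
{
	assert(!rb->reserved_in_use);
	rb->reserved_size = size;
}

/* Request abandoned before submission: drop the reservation. */
void reserved_space_cancel(struct ringbuf *rb)
{
	assert(!rb->reserved_in_use);
	rb->reserved_size = 0;
	rb->reserved_in_use = false;
}

/* Start of __i915_add_request(): switch to consuming the reserved space. */
void reserved_space_use(struct ringbuf *rb)
{
	assert(!rb->reserved_in_use);
	rb->reserved_in_use = true;
	rb->reserved_tail = rb->tail;
}

/* End of __i915_add_request(): check the epilogue fitted in the reserve. */
void reserved_space_end(struct ringbuf *rb)
{
	assert(rb->reserved_in_use);
	assert(rb->tail <= rb->reserved_tail + rb->reserved_size);
	rb->reserved_size = 0;
	rb->reserved_in_use = false;
}

/* ring_begin()/prepare path: while the reservation is not yet in use, every
 * space check must leave room for it on top of the caller's own request. */
int bytes_needed(const struct ringbuf *rb, int bytes)
{
	if (!rb->reserved_in_use)
		bytes += rb->reserved_size;
	return bytes;
}

The wrap paths can never run while the reserved space is actually being
consumed (hence the WARN_ON(reserved_in_use) there). When a wrap does happen
with a reservation pending, the prepare code cancels and re-issues it so the
tracking state is reset after the wrap - that is the v2 re-initialisation
referred to above.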
> Also when resending patches, especially after such a long delay please
> leave some indication of what you've decided to do wrt review comments.
> Either as a reply in the review discussion (preferred) or at least as an
> update in the cover letter or per-patch changelog. Otherwise reviewers
> need to reverse-engineer what you have or haven't done by diffing patches,
> which is just not that efficient.
> -Daniel
>
>> ---
>> drivers/gpu/drm/i915/i915_drv.h | 1 +
>> drivers/gpu/drm/i915/i915_gem.c | 37 +++++++++++++++++
>> drivers/gpu/drm/i915/intel_lrc.c | 18 ++++++++
>> drivers/gpu/drm/i915/intel_ringbuffer.c | 68 ++++++++++++++++++++++++++++++-
>> drivers/gpu/drm/i915/intel_ringbuffer.h | 25 ++++++++++++
>> 5 files changed, 147 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index e9d76f3..44dee31 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2187,6 +2187,7 @@ struct drm_i915_gem_request {
>>
>> int i915_gem_request_alloc(struct intel_engine_cs *ring,
>> struct intel_context *ctx);
>> +void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>> void i915_gem_request_free(struct kref *req_ref);
>>
>> static inline uint32_t
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 78f6a89..516e9b7 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2495,6 +2495,13 @@ int __i915_add_request(struct intel_engine_cs *ring,
>> } else
>> ringbuf = ring->buffer;
>>
>> + /*
>> + * To ensure that this call will not fail, space for its emissions
>> + * should already have been reserved in the ring buffer. Let the ring
>> + * know that it is time to use that space up.
>> + */
>> + intel_ring_reserved_space_use(ringbuf);
>> +
>> request_start = intel_ring_get_tail(ringbuf);
>> /*
>> * Emit any outstanding flushes - execbuf can fail to emit the flush
>> @@ -2577,6 +2584,9 @@ int __i915_add_request(struct intel_engine_cs *ring,
>> round_jiffies_up_relative(HZ));
>> intel_mark_busy(dev_priv->dev);
>>
>> + /* Sanity check that the reserved size was large enough. */
>> + intel_ring_reserved_space_end(ringbuf);
>> +
>> return 0;
>> }
>>
>> @@ -2676,6 +2686,26 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>> if (ret)
>> goto err;
>>
>> + /*
>> + * Reserve space in the ring buffer for all the commands required to
>> + * eventually emit this request. This is to guarantee that the
>> + * i915_add_request() call can't fail. Note that the reserve may need
>> + * to be redone if the request is not actually submitted straight
>> + * away, e.g. because a GPU scheduler has deferred it.
>> + *
>> + * Note further that this call merely notes the reserve request. A
>> + * subsequent call to *_ring_begin() is required to actually ensure
>> + * that the reservation is available. Without the begin, if the
>> + * request creator immediately submitted the request without adding
>> + * any commands to it then there might not actually be sufficient
>> + * room for the submission commands. Unfortunately, the current
>> + * *_ring_begin() implementations potentially call back here to
>> + * i915_gem_request_alloc(). Thus calling _begin() here would lead to
>> + * infinite recursion! Until that back call path is removed, it is
>> + * necessary to do a manual _begin() outside.
>> + */
>> + intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
>> +
>> ring->outstanding_lazy_request = req;
>> return 0;
>>
>> @@ -2684,6 +2714,13 @@ err:
>> return ret;
>> }
>>
>> +void i915_gem_request_cancel(struct drm_i915_gem_request *req)
>> +{
>> + intel_ring_reserved_space_cancel(req->ringbuf);
>> +
>> + i915_gem_request_unreference(req);
>> +}
>> +
>> struct drm_i915_gem_request *
>> i915_gem_find_active_request(struct intel_engine_cs *ring)
>> {
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index 6a5ed07..42a756d 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -687,6 +687,9 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
>> unsigned space;
>> int ret;
>>
>> + /* The whole point of reserving space is to not wait! */
>> + WARN_ON(ringbuf->reserved_in_use);
>> +
>> if (intel_ring_space(ringbuf) >= bytes)
>> return 0;
>>
>> @@ -747,6 +750,9 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
>> uint32_t __iomem *virt;
>> int rem = ringbuf->size - ringbuf->tail;
>>
>> + /* Can't wrap if space has already been reserved! */
>> + WARN_ON(ringbuf->reserved_in_use);
>> +
>> if (ringbuf->space < rem) {
>> int ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
>>
>> @@ -770,10 +776,22 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
>> {
>> int ret;
>>
>> + if (!ringbuf->reserved_in_use)
>> + bytes += ringbuf->reserved_size;
>> +
>> if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
>> + WARN_ON(ringbuf->reserved_in_use);
>> +
>> ret = logical_ring_wrap_buffer(ringbuf, ctx);
>> if (unlikely(ret))
>> return ret;
>> +
>> + if (ringbuf->reserved_size) {
>> + uint32_t size = ringbuf->reserved_size;
>> +
>> + intel_ring_reserved_space_cancel(ringbuf);
>> + intel_ring_reserved_space_reserve(ringbuf, size);
>> + }
>> }
>>
>> if (unlikely(ringbuf->space < bytes)) {
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index d934f85..74c2222 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -2103,6 +2103,9 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
>> unsigned space;
>> int ret;
>>
>> + /* The whole point of reserving space is to not wait! */
>> + WARN_ON(ringbuf->reserved_in_use);
>> +
>> if (intel_ring_space(ringbuf) >= n)
>> return 0;
>>
>> @@ -2130,6 +2133,9 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
>> struct intel_ringbuffer *ringbuf = ring->buffer;
>> int rem = ringbuf->size - ringbuf->tail;
>>
>> + /* Can't wrap if space has already been reserved! */
>> + WARN_ON(ringbuf->reserved_in_use);
>> +
>> if (ringbuf->space < rem) {
>> int ret = ring_wait_for_space(ring, rem);
>> if (ret)
>> @@ -2180,16 +2186,74 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
>> return 0;
>> }
>>
>> -static int __intel_ring_prepare(struct intel_engine_cs *ring,
>> - int bytes)
>> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
>> +{
>> + /* NB: Until request management is fully tidied up and the OLR is
>> + * removed, there are too many ways to get false hits on this
>> + * anti-recursion check! */
>> + /*WARN_ON(ringbuf->reserved_size);*/
>> + WARN_ON(ringbuf->reserved_in_use);
>> +
>> + ringbuf->reserved_size = size;
>> +
>> + /*
>> + * Really need to call _begin() here but that currently leads to
>> + * recursion problems! This will be fixed later but for now just
>> + * return and hope for the best. Note that there is only a real
>> + * problem if the creator of the request never actually calls _begin()
>> + * but if they are not submitting any work then why did they create
>> + * the request in the first place?
>> + */
>> +}
>> +
>> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
>> +{
>> + WARN_ON(ringbuf->reserved_in_use);
>> +
>> + ringbuf->reserved_size = 0;
>> + ringbuf->reserved_in_use = false;
>> +}
>> +
>> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
>> +{
>> + WARN_ON(ringbuf->reserved_in_use);
>> +
>> + ringbuf->reserved_in_use = true;
>> + ringbuf->reserved_tail = ringbuf->tail;
>> +}
>> +
>> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
>> +{
>> + WARN_ON(!ringbuf->reserved_in_use);
>> + WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
>> + "request reserved size too small: %d vs %d!\n",
>> + ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
>> +
>> + ringbuf->reserved_size = 0;
>> + ringbuf->reserved_in_use = false;
>> +}
>> +
>> +static int __intel_ring_prepare(struct intel_engine_cs *ring, int bytes)
>> {
>> struct intel_ringbuffer *ringbuf = ring->buffer;
>> int ret;
>>
>> + if (!ringbuf->reserved_in_use)
>> + bytes += ringbuf->reserved_size;
>> +
>> if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
>> + WARN_ON(ringbuf->reserved_in_use);
>> +
>> ret = intel_wrap_ring_buffer(ring);
>> if (unlikely(ret))
>> return ret;
>> +
>> + if (ringbuf->reserved_size) {
>> + uint32_t size = ringbuf->reserved_size;
>> +
>> + intel_ring_reserved_space_cancel(ringbuf);
>> + intel_ring_reserved_space_reserve(ringbuf, size);
>> + }
>> }
>>
>> if (unlikely(ringbuf->space < bytes)) {
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> index 39f6dfc..bf2ac28 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> @@ -105,6 +105,9 @@ struct intel_ringbuffer {
>> int space;
>> int size;
>> int effective_size;
>> + int reserved_size;
>> + int reserved_tail;
>> + bool reserved_in_use;
>>
>> /** We track the position of the requests in the ring buffer, and
>> * when each is retired we increment last_retired_head as the GPU
>> @@ -450,4 +453,26 @@ intel_ring_get_request(struct intel_engine_cs *ring)
>> return ring->outstanding_lazy_request;
>> }
>>
>> +/*
>> + * Arbitrary size for largest possible 'add request' sequence. The code paths
>> + * are complex and variable. Empirical measurement shows that the worst case
>> + * is ILK at 136 words. Reserving too much is better than reserving too little
>> + * as that allows for corner cases that might have been missed. So the figure
>> + * has been rounded up to 160 words.
>> + */
>> +#define MIN_SPACE_FOR_ADD_REQUEST 160
>> +
>> +/*
>> + * Reserve space in the ring to guarantee that the i915_add_request() call
>> + * will always have sufficient room to do its stuff. The request creation
>> + * code calls this automatically.
>> + */
>> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
>> +/* Cancel the reservation, e.g. because the request is being discarded. */
>> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
>> +/* Use the reserved space - for use by i915_add_request() only. */
>> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
>> +/* Finish with the reserved space - for use by i915_add_request() only. */
>> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
>> +
>> #endif /* _INTEL_RINGBUFFER_H_ */
>> --
>> 1.7.9.5
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx