[Intel-gfx] [PATCH 3/4] drm/i915: Compute the ring->space slightly less pessimistically

Chris Wilson chris at chris-wilson.co.uk
Mon Apr 24 12:52:39 UTC 2017


On Mon, Apr 24, 2017 at 03:37:44PM +0300, Mika Kuoppala wrote:
> Chris Wilson <chris at chris-wilson.co.uk> writes:
> 
> > We only have to prevent the RING_TAIL from catching the RING_HEAD
> > cacheline and do not need to enforce a whole cacheline separation, and in
> > the process we can remove one branch from the computation.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 20 ++++++++++----------
> >  drivers/gpu/drm/i915/intel_ringbuffer.h | 24 +++++++++++++-----------
> >  2 files changed, 23 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 1ec98851a621..48d4b5b24b21 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -39,12 +39,15 @@
> >   */
> >  #define LEGACY_REQUEST_SIZE 200
> >  
> > -static int __intel_ring_space(int head, int tail, int size)
> > +static unsigned int __intel_ring_space(unsigned int head,
> > +				       unsigned int tail,
> > +				       unsigned int size)
> >  {
> > -	int space = head - tail;
> > -	if (space <= 0)
> > -		space += size;
> > -	return space - I915_RING_FREE_SPACE;
> > +	/* "If the Ring Buffer Head Pointer and the Tail Pointer are on the
> > +	 * same cacheline, the Head Pointer must not be greater than the Tail
> > +	 * Pointer."
> > +	 */
> > +	return (round_down(head, CACHELINE_BYTES) - tail - 8) & (size - 1);
> >  }
> >  
> >  void intel_ring_update_space(struct intel_ring *ring)
> > @@ -1618,12 +1621,9 @@ static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> >  	GEM_BUG_ON(!req->reserved_space);
> >  
> >  	list_for_each_entry(target, &ring->request_list, ring_link) {
> > -		unsigned space;
> > -
> >  		/* Would completion of this request free enough space? */
> > -		space = __intel_ring_space(target->postfix, ring->emit,
> > -					   ring->size);
> > -		if (space >= bytes)
> > +		if (bytes <= __intel_ring_space(target->postfix,
> > +						ring->emit, ring->size))
> >  			break;
> >  	}
> >  
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> > index 31998ea46fb8..e28ffc98c520 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> > @@ -17,17 +17,6 @@
> >  #define CACHELINE_BYTES 64
> >  #define CACHELINE_DWORDS (CACHELINE_BYTES / sizeof(uint32_t))
> >  
> > -/*
> > - * Gen2 BSpec "1. Programming Environment" / 1.4.4.6 "Ring Buffer Use"
> > - * Gen3 BSpec "vol1c Memory Interface Functions" / 2.3.4.5 "Ring Buffer Use"
> > - * Gen4+ BSpec "vol1c Memory Interface and Command Stream" / 5.3.4.5 "Ring Buffer Use"
> > - *
> > - * "If the Ring Buffer Head Pointer and the Tail Pointer are on the same
> > - * cacheline, the Head Pointer must not be greater than the Tail
> > - * Pointer."
> > - */
> > -#define I915_RING_FREE_SPACE 64
> > -
> >  struct intel_hw_status_page {
> >  	struct i915_vma *vma;
> >  	u32 *page_addr;
> > @@ -545,6 +534,19 @@ assert_ring_tail_valid(const struct intel_ring *ring, unsigned int tail)
> >  	 */
> >  	GEM_BUG_ON(!IS_ALIGNED(tail, 8));
> >  	GEM_BUG_ON(tail >= ring->size);
> > +
> > +	/* "Ring Buffer Use"
> > +	 *	Gen2 BSpec "1. Programming Environment" / 1.4.4.6
> > +	 *	Gen3 BSpec "1c Memory Interface Functions" / 2.3.4.5
> > +	 *	Gen4+ BSpec "1c Memory Interface and Command Stream" / 5.3.4.5
> > +	 * "If the Ring Buffer Head Pointer and the Tail Pointer are on the
> > +	 * same cacheline, the Head Pointer must not be greater than the Tail
> > +	 * Pointer."
> > +	 */
> > +#define cacheline(a) round_down(a, CACHELINE_BYTES)
> > +	GEM_BUG_ON(cacheline(tail) == cacheline(ring->head) &&
> 
> We are about to set the hw tail. But we are using our
> outdated bookkeepping value of ring head to do the check.
> 
> I am confused. Is this something that can happen only when
> it wraps. And why we dont need to check the HW head?

We know that the real HEAD is at least ring->head. We know that when we
computed the space for the request, we used at most ring->head.
Therefore from our logic, we have treated ring->head == RING_HEAD and
should never violate the cacheline for our own bookkeeping. That it may
not result in the HW tripping up is not as important as confirming our
own logic to prevent the trip.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list