[Intel-gfx] [PATCH v3] drm/i915: Emit to ringbuffer directly

Dave Gordon david.s.gordon at intel.com
Mon Sep 12 15:04:43 UTC 2016


On 12/09/16 10:44, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>
> This removes the usage of intel_ring_emit in favour of
> directly writing to the ring buffer.
>
> intel_ring_emit was preventing the compiler for optimising
> fetch and increment of the current ring buffer pointer and
> therefore generating very verbose code for every write.
>
> It had no useful purpose since all ringbuffer operations
> are started and ended with intel_ring_begin and
> intel_ring_advance respectively, with no bail out in the
> middle possible, so it is fine to increment the tail in
> intel_ring_begin and let the code manage the pointer
> itself.
>
> Useless instruction removal amounts to approximately
> two and half kilobytes of saved text on my build.
>
> Not sure if this has any measurable performance
> implications but executing a ton of useless instructions
> on fast paths cannot be good.
>
> Patch is not fully polished, but it compiles and runs
> on Gen9 at least.
>
> v2:
>  * Change return from intel_ring_begin to error pointer by
>    popular demand.
>  * Move tail increment to intel_ring_advance to enable some
>    error checking.
>
> v3:
>  * Move tail advance back into intel_ring_begin.

Downside of this is that you now have no check that _advance() is ever 
called, let alone called once per begin. If we want to strictly enforce 
the begin-advance pairing, _advance() has to do something, either 
something that is necessary for the next _begin(), or at least reset 
some state flag that tracks whether we're between a begin and an 
advance, or an advance and a begin (when writing to the ring is not 
allowed!).

.Dave.

>  * Rebase and tidy.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Dave Gordon <david.s.gordon at intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_context.c    |  75 ++--
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  37 +-
>  drivers/gpu/drm/i915/i915_gem_gtt.c        |  70 ++--
>  drivers/gpu/drm/i915/intel_display.c       | 132 +++---
>  drivers/gpu/drm/i915/intel_lrc.c           | 245 ++++++-----
>  drivers/gpu/drm/i915/intel_mocs.c          |  53 ++-
>  drivers/gpu/drm/i915/intel_overlay.c       |  88 ++--
>  drivers/gpu/drm/i915/intel_ringbuffer.c    | 638 ++++++++++++++---------------
>  drivers/gpu/drm/i915/intel_ringbuffer.h    |  24 +-
>  9 files changed, 655 insertions(+), 707 deletions(-)
>
[snip]

> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 7f64d611159b..53dcd7b9a72d 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -456,30 +456,12 @@ void intel_legacy_submission_resume(struct drm_i915_private *dev_priv);
>
>  int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request);
>
> -int __must_check intel_ring_begin(struct drm_i915_gem_request *req, int n);
> +u32 __must_check *intel_ring_begin(struct drm_i915_gem_request *req, int n);
>  int __must_check intel_ring_cacheline_align(struct drm_i915_gem_request *req);
>
> -static inline void intel_ring_emit(struct intel_ring *ring, u32 data)
> +static inline void intel_ring_advance(struct intel_ring *ring, u32 *rbuf)
>  {
> -	*(uint32_t *)(ring->vaddr + ring->tail) = data;
> -	ring->tail += 4;
> -}
> -
> -static inline void intel_ring_emit_reg(struct intel_ring *ring, i915_reg_t reg)
> -{
> -	intel_ring_emit(ring, i915_mmio_reg_offset(reg));
> -}
> -
> -static inline void intel_ring_advance(struct intel_ring *ring)
> -{
> -	/* Dummy function.
> -	 *
> -	 * This serves as a placeholder in the code so that the reader
> -	 * can compare against the preceding intel_ring_begin() and
> -	 * check that the number of dwords emitted matches the space
> -	 * reserved for the command packet (i.e. the value passed to
> -	 * intel_ring_begin()).
> -	 */
> +	GEM_BUG_ON((ring->vaddr + ring->tail) != rbuf);
>  }
>
>  static inline u32 intel_ring_offset(struct intel_ring *ring, u32 value)
>



More information about the Intel-gfx mailing list