[Intel-gfx] [RFC 22/21] drm/i915: Cache request completion status

Mon Nov 3 11:57:32 CET 2014

On Tue, Oct 28, 2014 at 03:36:29PM +0000, John Harrison wrote:
> On 19/10/2014 15:14, Daniel Vetter wrote:
> >On Tue, Oct 07, 2014 at 05:47:29PM +0100, John.C.Harrison at Intel.com wrote:
> >>From: John Harrison <John.C.Harrison at Intel.com>
> >>
> >>For: VIZ-4377
> >>Signed-off-by: John.C.Harrison at Intel.com
> >Why? If it's just for performance I think we should do this as part of the
> >switch to struct fence, which already has this.
> For performance and also as part of getting rid of all the
> i915_seqno_passed() calls.

The why was more a request for a proper commit message. If you claim
perfomance, then the commit message also needs the relevant data.

> >>---
> >>  drivers/gpu/drm/i915/i915_drv.h         |   34 ++++++++++++++++---------------
> >>  drivers/gpu/drm/i915/i915_gem.c         |   21 +++++++++++++++++++
> >>  drivers/gpu/drm/i915/intel_lrc.c        |    1 +
> >>  drivers/gpu/drm/i915/intel_ringbuffer.c |    2 ++
> >>  drivers/gpu/drm/i915/intel_ringbuffer.h |    3 +++
> >>  5 files changed, 45 insertions(+), 16 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>index cdbbdeb..4ab3b23 100644
> >>--- a/drivers/gpu/drm/i915/i915_drv.h
> >>+++ b/drivers/gpu/drm/i915/i915_drv.h
> >>@@ -1913,6 +1913,9 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
> >>  struct drm_i915_gem_request {
> >>  	struct kref ref;
> >>+	/** Is this request known to be complete? */
> >>+	bool complete;
> >>+
> >>  	/** On Which ring this request was generated */
> >>  	struct intel_engine_cs *ring;
> >>@@ -1943,6 +1946,8 @@ struct drm_i915_gem_request {
> >>  };
> >>  void i915_gem_request_free(struct kref *req_ref);
> >>+void i915_gem_complete_requests_ring(struct intel_engine_cs *ring,
> >>+				     bool lazy_coherency);
> >>  static inline uint32_t
> >>  i915_gem_request_get_seqno(struct drm_i915_gem_request *req)
> >>@@ -1968,7 +1973,19 @@ i915_gem_request_unreference(struct drm_i915_gem_request *req)
> >>  	kref_put(&req->ref, i915_gem_request_free);
> >>  }
> >>-/* ??? i915_gem_request_completed should be here ??? */
> >>+static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
> >>+					      bool lazy_coherency)
> >>+{
> >>+	if (req->complete)
> >>+		return true;
> >>+
> >>+	if (req->ring == NULL)
> >>+		return false;
> >>+
> >>+	i915_gem_complete_requests_ring(req->ring, lazy_coherency);
> >>+
> >>+	return req->complete;
> >>+}
> >Also, this is looking way too big now I think ;-) If you have a full
> >non-inline function call in your inline it's always a net loss.
> >-Daniel
> That depends how you define gain/loss. In terms of performance, it can still
> be a gain because the function call is not always taken. Whereas the
> alternative is at least one function calls and possibly two. Either way, as
> noted already, the final intention is for this to become simply 'return
> req->complete' and not have any function calls at all.

Thrashing instructions caches is usually what this does, so even when you
microbenchmark a gain it might be a net loss. Rule-of-thumb is static
inline is ok if it results in a net reduction of the binary, not for any
other case.

Unless you supply solid performance data justifying it.

Also, static inlines in random headers are a pain for documentation.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch