[Intel-gfx] [RFC 22/21] drm/i915: Cache request completion status
Daniel Vetter
daniel at ffwll.ch
Mon Nov 3 11:57:32 CET 2014
On Tue, Oct 28, 2014 at 03:36:29PM +0000, John Harrison wrote:
> On 19/10/2014 15:14, Daniel Vetter wrote:
> >On Tue, Oct 07, 2014 at 05:47:29PM +0100, John.C.Harrison at Intel.com wrote:
> >>From: John Harrison <John.C.Harrison at Intel.com>
> >>
> >>For: VIZ-4377
> >>Signed-off-by: John.C.Harrison at Intel.com
> >Why? If it's just for performance I think we should do this as part of the
> >switch to struct fence, which already has this.
> For performance and also as part of getting rid of all the
> i915_seqno_passed() calls.
The why was more a request for a proper commit message. If you claim
perfomance, then the commit message also needs the relevant data.
> >>---
> >> drivers/gpu/drm/i915/i915_drv.h | 34 ++++++++++++++++---------------
> >> drivers/gpu/drm/i915/i915_gem.c | 21 +++++++++++++++++++
> >> drivers/gpu/drm/i915/intel_lrc.c | 1 +
> >> drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++
> >> drivers/gpu/drm/i915/intel_ringbuffer.h | 3 +++
> >> 5 files changed, 45 insertions(+), 16 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>index cdbbdeb..4ab3b23 100644
> >>--- a/drivers/gpu/drm/i915/i915_drv.h
> >>+++ b/drivers/gpu/drm/i915/i915_drv.h
> >>@@ -1913,6 +1913,9 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
> >> struct drm_i915_gem_request {
> >> struct kref ref;
> >>+ /** Is this request known to be complete? */
> >>+ bool complete;
> >>+
> >> /** On Which ring this request was generated */
> >> struct intel_engine_cs *ring;
> >>@@ -1943,6 +1946,8 @@ struct drm_i915_gem_request {
> >> };
> >> void i915_gem_request_free(struct kref *req_ref);
> >>+void i915_gem_complete_requests_ring(struct intel_engine_cs *ring,
> >>+ bool lazy_coherency);
> >> static inline uint32_t
> >> i915_gem_request_get_seqno(struct drm_i915_gem_request *req)
> >>@@ -1968,7 +1973,19 @@ i915_gem_request_unreference(struct drm_i915_gem_request *req)
> >> kref_put(&req->ref, i915_gem_request_free);
> >> }
> >>-/* ??? i915_gem_request_completed should be here ??? */
> >>+static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
> >>+ bool lazy_coherency)
> >>+{
> >>+ if (req->complete)
> >>+ return true;
> >>+
> >>+ if (req->ring == NULL)
> >>+ return false;
> >>+
> >>+ i915_gem_complete_requests_ring(req->ring, lazy_coherency);
> >>+
> >>+ return req->complete;
> >>+}
> >Also, this is looking way too big now I think ;-) If you have a full
> >non-inline function call in your inline it's always a net loss.
> >-Daniel
> That depends how you define gain/loss. In terms of performance, it can still
> be a gain because the function call is not always taken. Whereas the
> alternative is at least one function calls and possibly two. Either way, as
> noted already, the final intention is for this to become simply 'return
> req->complete' and not have any function calls at all.
Thrashing instructions caches is usually what this does, so even when you
microbenchmark a gain it might be a net loss. Rule-of-thumb is static
inline is ok if it results in a net reduction of the binary, not for any
other case.
Unless you supply solid performance data justifying it.
Also, static inlines in random headers are a pain for documentation.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
More information about the Intel-gfx
mailing list