[Intel-gfx] [PATCH 06/16] drm/i915: Implement inter-engine read-read optimisations

Wed Apr 29 06:51:23 PDT 2015

On 04/27/2015 01:41 PM, Chris Wilson wrote:
> Currently, we only track the last request globally across all engines.
> This prevents us from issuing concurrent read requests on e.g. the RCS
> and BCS engines (or more likely the render and media engines). Without
> semaphores, we incur costly stalls as we synchronise between rings -
> greatly impacting the current performance of Broadwell versus Haswell in
> certain workloads (like video decode). With the introduction of
> reference counted requests, it is much easier to track the last request
> per ring, as well as the last global write request so that we can
> optimise inter-engine read read requests (as well as better optimise
> certain CPU waits).
>
> v2: Fix inverted readonly condition for nonblocking waits.
> v3: Handle non-continguous engine array after waits
> v4: Rebase, tidy, rewrite ring list debugging
> v5: Use obj->active as a bitfield, it looks cool
> v6: Micro-optimise, mostly involving moving code around
> v7: Fix retire-requests-upto for execlists (and multiple rq->ringbuf)
> v8: Rebase
> v9: Refactor i915_gem_object_sync() to allow the compiler to better
> optimise it.

Looks OK, you can upgrade my r-b to v9.

Regards,

Tvrtko