[Intel-gfx] [PATCH 04/22] drm/i915: Remove request retirement before each batch
Chris Wilson
chris at chris-wilson.co.uk
Thu Jul 28 09:32:23 UTC 2016
On Thu, Jul 28, 2016 at 11:32:47AM +0300, Joonas Lahtinen wrote:
> On ke, 2016-07-27 at 12:14 +0100, Chris Wilson wrote:
> > This reimplements the denial-of-service protection against igt from
> > commit 227f782e4667 ("drm/i915: Retire requests before creating a new
> > one") and transfers the stall from before each batch into get_pages().
> > The issue is that the stall is increasing latency between batches which
> > is detrimental in some cases (especially coupled with execlists) to
> > keeping the GPU well fed. Also we have made the observation that retiring
> > requests can of itself free objects (and requests) and therefore makes
> > a good first step when shrinking.
> >
> > v2: Recycle objects prior to i915_gem_object_get_pages()
> > v3: Remove the reference to the ring from i915_gem_requests_ring() as it
> > operates on an intel_engine_cs.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>
> Was this tested for performance regressions?

Yes. It fixed the latency issue from 227f782e4667, but introduced an
issue with page allocation for context/object creation, which was
papered over in v2. Since then, requests (from this series onwards) have
become both lazier and more economical, changing the latency
characteristics for execbuf, which somewhat mitigates the issue found in
v1. That is fortunate, because once we implement a separate mm lock we
lose the freedom to do a full retirement before get_pages().
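
To make the trade-off concrete, here is a minimal sketch of the idea in
plain C. It is a toy model, not the i915 code: struct toy_pool and every
toy_* function are hypothetical names invented for illustration, and a
batch is assumed to complete instantly. Submission no longer retires
anything; retirement is only paid for when a page allocation would
otherwise fail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in i915. */
struct toy_pool {
	size_t free_pages;         /* pages available right now */
	size_t retirable_pages;    /* pages held only by completed, unretired requests */
	unsigned int retire_calls; /* how often the retirement cost was paid */
};

/* Retire completed requests, releasing the pages they kept alive. */
static void toy_retire_requests(struct toy_pool *pool)
{
	pool->free_pages += pool->retirable_pages;
	pool->retirable_pages = 0;
	pool->retire_calls++;
}

/* Allocate backing pages; retire only when the fast path fails. */
static bool toy_get_pages(struct toy_pool *pool, size_t count)
{
	if (pool->free_pages < count)
		toy_retire_requests(pool);

	if (pool->free_pages < count)
		return false; /* still short, even after retiring */

	pool->free_pages -= count;
	return true;
}

/* Submit a batch without retiring first, so there is no stall here. */
static bool toy_submit_batch(struct toy_pool *pool, size_t pages)
{
	if (!toy_get_pages(pool, pages))
		return false;

	/* Assumption: the batch completes at once, so its pages become
	 * reclaimable by the next retirement pass. */
	pool->retirable_pages += pages;
	return true;
}
```

With a pool of 4 free pages, two 2-page submissions proceed without any
retirement; only the third, which would otherwise fail to allocate,
triggers a single retirement pass.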

This series (including the execbuf reworking) is intended to fix the
2x-10x performance regression (platform dependent) we see in
microbenchmarks, which corresponds to about 20% at the GL level in
driver stress tests. However, execlists remains ~8% slower than legacy
submission (at the GL level).
-Chris
--
Chris Wilson, Intel Open Source Technology Centre