[Intel-gfx] [PATCH v3] drm/i915: Split execlist priority queue into rbtree + linked list

Michał Winiarski michal.winiarski at intel.com
Mon May 15 14:51:59 UTC 2017


On Mon, May 15, 2017 at 02:29:50PM +0100, Chris Wilson wrote:
> All the requests at the same priority are executed in FIFO order. They
> do not need to be stored in the rbtree themselves, as they are a simple
> list within a level. If we move the requests at one priority into a list,
> we can then reduce the rbtree to the set of priorities. This should keep
> the height of the rbtree small, as the number of active priorities cannot
> exceed the number of active requests and should typically be only a few.
> 
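For reference, a minimal sketch of the per-priority node this describes
(the type and field names follow the patch context; treat the exact
layout as illustrative):

    #include <linux/rbtree.h>
    #include <linux/list.h>

    struct i915_priolist {
            struct rb_node node;        /* keyed by ->priority in the rbtree */
            struct list_head requests;  /* FIFO of requests at this level */
            int priority;
    };
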
> Currently, we have ~2k possible different priority levels, which may
> increase to allow even more fine-grained selection. Allocating those in
> advance seems wasteful (and may be impossible), so we opt for allocating
> upon first use and freeing after its requests are depleted. To avoid
> the possibility of an allocation failure causing us to lose a request,
> we preallocate the default priority (0) and bump any request to that
> priority if we fail to allocate the appropriate plist for it. Having a
> request (that is ready to run, so not leading to corruption) execute
> out-of-order is better than leaking the request (and its dependency
> tree) entirely.
> 
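Sketching the lookup-or-allocate path described above, using the struct
from the previous sketch (function, field, and cache names are
assumptions based on the i915 code; locking and error handling are
condensed):

    static void insert_request(struct intel_engine_cs *engine,
                               struct list_head *request_link, int prio)
    {
            struct i915_priolist *p;
            struct rb_node **parent, *rb;

    find_priolist:
            /* Highest priority sorts leftmost; equal priorities are FIFO */
            rb = NULL;
            parent = &engine->execlist_queue.rb_node;
            while (*parent) {
                    rb = *parent;
                    p = rb_entry(rb, struct i915_priolist, node);
                    if (prio > p->priority) {
                            parent = &rb->rb_left;
                    } else if (prio < p->priority) {
                            parent = &rb->rb_right;
                    } else {
                            list_add_tail(request_link, &p->requests);
                            return;
                    }
            }

            if (prio == I915_PRIORITY_NORMAL) {
                    p = &engine->default_priolist; /* preallocated */
            } else {
                    p = kmem_cache_alloc(engine->i915->priorities, GFP_ATOMIC);
                    if (unlikely(!p)) {
                            /* Convert the failure into a priority bump;
                             * the retry hits the preallocated branch. */
                            prio = I915_PRIORITY_NORMAL;
                            goto find_priolist;
                    }
            }

            p->priority = prio;
            INIT_LIST_HEAD(&p->requests);
            list_add_tail(request_link, &p->requests);
            rb_link_node(&p->node, rb, parent);
            rb_insert_color(&p->node, &engine->execlist_queue);
    }
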
> There should be a benefit to reducing execlists_dequeue() to principally
> using a simple list (and to reducing the frequency of both rbtree
> iteration and rebalancing on erase), but for typical workloads request
> coalescing should be small enough that we don't notice any change. The
> main gain is from making priority-inheritance (PI) calls into the
> scheduler cheaper, and the explicit list within a level should make
> request unwinding simpler (we just need to insert at the head of the
> list rather than the tail, without making the rbtree search more
> complicated).
> 
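The dequeue side then reduces to taking from the head of the leftmost
level's list, erasing the level only when it is depleted. A sketch under
the same assumed names (pop_request is a hypothetical helper; the real
execlists_dequeue() does considerably more):

    static struct list_head *pop_request(struct intel_engine_cs *engine)
    {
            struct rb_node *rb = rb_first(&engine->execlist_queue);
            struct i915_priolist *p;
            struct list_head *out;

            if (!rb)
                    return NULL;

            p = rb_entry(rb, struct i915_priolist, node);
            out = p->requests.next;     /* FIFO within the level */
            list_del(out);

            if (list_empty(&p->requests)) {
                    /* Depleted level: one rb_erase(), then free it */
                    rb_erase(rb, &engine->execlist_queue);
                    if (p != &engine->default_priolist)
                            kmem_cache_free(engine->i915->priorities, p);
            }
            return out;
    }

Unwinding a preempted request is then just a list_add() at the head of
the level's list instead of a list_add_tail().
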
> v2: Avoid use-after-free when deleting a depleted priolist
> 
> v3: Michał found the solution to handling the allocation failure
> gracefully. If we disable all priority scheduling following the
> allocation failure, those requests will be executed in FIFO order, and
> we ensure that this request and its dependencies stay in strict FIFO
> (even though the code doesn't realise it is now operating on a single
> list). Normal scheduling is restored once we know the device is idle,
> until the next failure!
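
In code terms, the v3 fallback might look like this (no_priolist as a
per-engine flag is an assumption based on the description above):

    /* In insert_request(), before the rbtree walk: */
    if (unlikely(engine->no_priolist))
            prio = I915_PRIORITY_NORMAL;

    /* In the allocation-failure path, alongside the priority bump: */
    engine->no_priolist = true;

    /* And once the engine is known to be idle again: */
    engine->no_priolist = false;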

Actually, this is a result of a conversation with the other Michał W, so:
Suggested-by: Michał Wajdeczko <michal.wajdeczko at intel.com>

also
Reviewed-by: Michał Winiarski <michal.winiarski at intel.com>

Thanks!

-Michał
 
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Michał Winiarski <michal.winiarski at intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        |  11 +-
>  drivers/gpu/drm/i915/i915_gem.c            |   7 +-
>  drivers/gpu/drm/i915/i915_gem_request.c    |   4 +-
>  drivers/gpu/drm/i915/i915_gem_request.h    |   2 +-
>  drivers/gpu/drm/i915/i915_guc_submission.c |  50 ++++----
>  drivers/gpu/drm/i915/i915_utils.h          |   9 ++
>  drivers/gpu/drm/i915/intel_engine_cs.c     |  12 ++
>  drivers/gpu/drm/i915/intel_lrc.c           | 184 +++++++++++++++++++----------
>  drivers/gpu/drm/i915/intel_ringbuffer.h    |   9 ++
>  9 files changed, 193 insertions(+), 95 deletions(-)

