[Intel-gfx] [PATCH 05/12] drm/i915/scheduler: Record all dependencies upon request construction

Chris Wilson chris at chris-wilson.co.uk
Thu Nov 3 11:55:38 UTC 2016


On Thu, Nov 03, 2016 at 11:03:47AM +0000, Tvrtko Ursulin wrote:
> 
> On 02/11/2016 17:50, Chris Wilson wrote:
> >The scheduler needs to know the dependencies of each request for the
> >lifetime of the request, as it may choose to reschedule the requests at
> >any time and must ensure the dependency tree is not broken. This is in
> >addition to using the fence to only allow execution after all
> >dependencies have been completed.
> >
> >One option was to extend the fence to support the bidirectional
> >dependency tracking required by the scheduler. However, the mismatch in
> >lifetimes between the submit fence and the request essentially meant
> >that we had to build a completely separate struct (and we could not
> >simply reuse the existing waitqueue in the fence for one half of the
> >dependency tracking). The extra dependency tracking simply did not mesh
> >well with the fence, and keeping it separate both keeps the fence
> >implementation simpler and allows us to extend the dependency tracking
> >into a priority tree (whilst maintaining support for reordering the
> >tree).
> >
> >To avoid the additional allocations and list manipulations, the use of
> >the priotree is disabled when there are no schedulers to use it.
> >
> >Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >---
> > drivers/gpu/drm/i915/i915_gem_request.c | 72 ++++++++++++++++++++++++++++++++-
> > drivers/gpu/drm/i915/i915_gem_request.h | 23 +++++++++++
> > 2 files changed, 94 insertions(+), 1 deletion(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> >index 9c8605c834f9..13090f226203 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_request.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_request.c
> >@@ -113,6 +113,59 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
> > 	spin_unlock(&file_priv->mm.lock);
> > }
> >
> >+static int
> >+i915_priotree_add_dependency(struct i915_priotree *pt,
> >+			     struct i915_priotree *signal,
> >+			     struct i915_dependency *dep)
> >+{
> >+	unsigned long flags = 0;
> >+
> >+	if (!dep) {
> >+		dep = kmalloc(sizeof(*dep), GFP_KERNEL);
> 
> I will mention a dedicated cache again since this could possibly be
> our hottest allocation path. With a dedicated slab I've seen it grow
> to 5-7k objects in some benchmarks, with the request slab around 1k
> at the same time.

I'm open to one. We allocate more of these than we do even for fences. I
was thinking it could be added later, but if we can change the API to
always pass in the i915_dependency, it will probably work better.
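
For reference, a dedicated cache would be something along these lines
(just a sketch to make the idea concrete; the cache name and the
init/fini hooks are made up for illustration, not part of this patch):

static struct kmem_cache *slab_dependencies;

static struct i915_dependency *i915_dependency_alloc(void)
{
        return kmem_cache_alloc(slab_dependencies, GFP_KERNEL);
}

static void i915_dependency_free(struct i915_dependency *dep)
{
        kmem_cache_free(slab_dependencies, dep);
}

int i915_dependencies_init(void)
{
        /* A slab sized exactly for i915_dependency, cacheline aligned,
         * so thousands of live objects don't churn the generic kmalloc
         * buckets.
         */
        slab_dependencies = KMEM_CACHE(i915_dependency, SLAB_HWCACHE_ALIGN);
        return slab_dependencies ? 0 : -ENOMEM;
}

void i915_dependencies_fini(void)
{
        kmem_cache_destroy(slab_dependencies);
}
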
> 
> >+		if (!dep)
> >+			return -ENOMEM;
> >+
> >+		flags |= I915_DEPENDENCY_ALLOC;
> >+	}
> 
> Not sure if it would be any nicer to just set flags to
> I915_DEPENDENCY_ALLOC after allocating, and add an else path setting
> it to zero here.

I just tend to avoid if {} else {} if I can help it; just a personal
preference.
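
i.e. the choice is between (behaviour is identical either way):

        /* as posted */
        unsigned long flags = 0;

        if (!dep) {
                dep = kmalloc(sizeof(*dep), GFP_KERNEL);
                if (!dep)
                        return -ENOMEM;

                flags |= I915_DEPENDENCY_ALLOC;
        }

and

        /* with the explicit else */
        unsigned long flags;

        if (!dep) {
                dep = kmalloc(sizeof(*dep), GFP_KERNEL);
                if (!dep)
                        return -ENOMEM;

                flags = I915_DEPENDENCY_ALLOC;
        } else {
                flags = 0;
        }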

> >+struct i915_dependency {
> >+	struct i915_priotree *signal;
> >+	struct list_head pre_link, post_link;
> >+	unsigned long flags;
> >+#define I915_DEPENDENCY_ALLOC BIT(0)
> >+};
> >+
> >+struct i915_priotree {
> >+	struct list_head pre_list; /* who is before us, we depend upon */
> >+	struct list_head post_list; /* who is after us, they depend upon us */
> >+};
> 
> I need a picture to imagine this data structure. :(

The names suck.

\|||/
 \|/
  .
 /|\
/|||\

Googling bidirectional acyclic graph doesn't help that much.

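Spelling it out instead: each i915_dependency is one edge in that
graph, threaded onto both of its endpoints. The body of
i915_priotree_add_dependency() (snipped from the quote above) does
roughly the following (my reading of the patch, not a verbatim
excerpt):

        dep->signal = signal;
        dep->flags = flags;

        /* pt keeps a list of everyone it waits upon ... */
        list_add(&dep->pre_link, &pt->pre_list);

        /* ... and signal keeps a list of everyone waiting upon it */
        list_add(&dep->post_link, &signal->post_list);

So walking a request's pre_list gives every request it must wait upon,
and walking its post_list gives every request the scheduler has to
revisit if this one is reprioritised.
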
> >+
> > /**
> >  * Request queue structure.
> >  *
> >@@ -89,6 +101,17 @@ struct drm_i915_gem_request {
> > 	wait_queue_t submitq;
> > 	wait_queue_t execq;
> >
> >+	/* A list of everyone we wait upon, and everyone who waits upon us.
> >+	 * Even though we will not be submitted to the hardware before the
> >+	 * submit fence is signaled (it waits for all external events as well
> >+	 * as our own requests), the scheduler still needs to know the
> >+	 * dependency tree for the lifetime of the request (from execbuf
> >+	 * to retirement), i.e. bidirectional dependency information for the
> >+	 * request not tied to individual fences.
> >+	 */
> >+	struct i915_priotree priotree;
> >+	struct i915_dependency depq;
> 
> Why depq and not dep_node or something? Makes me wonder what I am
> missing about the "q". It is not a queue?!?

Just a pattern was forming. It's just as much of a queue as our
wait_queue_t above :) Point taken.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

