[Intel-gfx] [PATCH v6 06/34] drm/i915: Start of GPU scheduler
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Fri Jun 10 16:24:08 UTC 2016
Hi,
Just a few random comments/questions. (not a full review!)
On 20/04/16 18:13, John.C.Harrison at Intel.com wrote:
> From: John Harrison <John.C.Harrison at Intel.com>
>
> Initial creation of scheduler source files. Note that this patch
> implements most of the scheduler functionality but does not hook it in
> to the driver yet. It also leaves the scheduler code in 'pass through'
> mode so that even when it is hooked in, it will not actually do very
> much. This allows the hooks to be added one at a time in bite size
> chunks and only when the scheduler is finally enabled at the end does
> anything start happening.
>
> The general theory of operation is that when batch buffers are
> submitted to the driver, the execbuffer() code packages up all the
> information required to execute the batch buffer at a later time. This
> package is given over to the scheduler which adds it to an internal
> node list. The scheduler also scans the list of objects associated
> with the batch buffer and compares them against the objects already in
> use by other buffers in the node list. If matches are found then the
> new batch buffer node is marked as being dependent upon the matching
> node. The same is done for the context object. The scheduler also
> bumps up the priority of such matching nodes on the grounds that the
> more dependencies a given batch buffer has the more important it is
> likely to be.
>
> The scheduler aims to have a given (tuneable) number of batch buffers
> in flight on the hardware at any given time. If fewer than this are
> currently executing when a new node is queued, then the node is passed
> straight through to the submit function. Otherwise it is simply added
> to the queue and the driver returns back to user land.
>
> The scheduler is notified when each batch buffer completes and updates
> its internal tracking accordingly. At the end of the completion
> interrupt processing, if any scheduler tracked batches were processed,
> the scheduler's deferred worker thread is woken up. This can do more
> involved processing such as actually removing completed nodes from the
> queue and freeing up the resources associated with them (internal
> memory allocations, DRM object references, context reference, etc.).
> The work handler also checks the in flight count and calls the
> submission code if a new slot has appeared.
>
> When the scheduler's submit code is called, it scans the queued node
> list for the highest priority node that has no unmet dependencies.
> Note that the dependency calculation is complex as it must take
> inter-ring dependencies and potential preemptions into account. Note
> also that in the future this will be extended to include external
> dependencies such as the Android Native Sync file descriptors and/or
> the Linux dma-buf synchronisation scheme.
>
> If a suitable node is found then it is sent to execbuf_final() for
> submission to the hardware. The in flight count is then re-checked and
> a new node popped from the list if appropriate. All nodes that are not
> submitted have their priority bumped. This ensures that low priority
> tasks do not get starved out by busy higher priority ones - everything
> will eventually get its turn to run.
>
> Note that this patch does not implement pre-emptive scheduling. Only
> basic scheduling by re-ordering batch buffer submission is currently
> implemented. Pre-emption of actively executing batch buffers comes in
> the next patch series.
>
> v2: Changed priority levels to +/-1023 due to feedback from Chris
> Wilson.
>
> Removed redundant index from scheduler node.
>
> Changed time stamps to use jiffies instead of raw monotonic. This
> provides lower resolution but improved compatibility with other i915
> code.
>
> Major re-write of completion tracking code due to struct fence
> conversion. The scheduler no longer has its own private IRQ handler
> but just lets the existing request code handle completion events.
> Instead, the scheduler now hooks into the request notify code to be
> told when a request has completed.
>
> Reduced driver mutex locking scope. Removal of scheduler nodes no
> longer grabs the mutex lock.
>
> v3: Refactor of dependency generation to make the code more readable.
> Also added in read-read optimisation support - i.e., don't treat a
> shared read-only buffer as being a dependency.
>
> Allowed the killing of queued nodes rather than only flying ones.
>
> v4: Updated the commit message to better reflect the current state of
> the code. Downgraded some BUG_ONs to WARN_ONs. Used the correct array
> memory allocator function (kmalloc_array instead of kmalloc).
> Corrected the format of some comments. Wrapped some lines differently
> to keep the style checker happy.
>
> Fixed a WARN_ON when killing nodes. The dependency removal code checks
> that nodes being destroyed do not have any outstanding dependencies
> (which would imply they should not have been executed yet). In the
> case of nodes being destroyed, e.g. due to context banning, then this
> might well be the case - they have not been executed and do indeed
> have outstanding dependencies.
>
> Re-instated the code to disable interrupts when not in use. The
> underlying problem causing broken IRQ reference counts seems to have
> been fixed now.
>
> v5: Shuffled various functions around to remove forward declarations
> as apparently these are frowned upon. Removed lots of white space as
> apparently having easy to read code is also frowned upon. Split the
> direct submission scheduler bypass code out into a separate function.
> Squashed down the i915_scheduler.c sections of various patches into
> this patch. Thus the later patches simply hook in existing code into
> various parts of the driver rather than adding the code as well. Added
> documentation to various functions. Re-worked the submit function in
> terms of mutex locking, error handling and exit paths. Split the
> delayed work handler function in half. Made use of the kernel 'clamp'
> macro. [Joonas Lahtinen]
>
> Added runtime PM calls as these must be done at the top level before
> acquiring the driver mutex lock. [Chris Wilson]
>
> Removed some obsolete debug code that had been forgotten about.
>
> Moved more clean up code into the 'i915_gem_scheduler_clean_node()'
> function rather than replicating it in multiple places.
>
> Used lighter weight spinlocks.
>
> v6: Updated to newer nightly (lots of ring -> engine renaming).
>
> Added 'for_each_scheduler_node()' and 'assert_scheduler_lock_held()'
> helper macros. Renamed 'i915_gem_execbuff_release_batch_obj' to
> 'i915_gem_execbuf_release_batch_obj'. Updated to use 'to_i915()'
> instead of dev_private. Converted all enum labels to uppercase.
> Removed various unnecessary WARNs. Renamed 'saved_objects' to just
> 'objs'. Split code for counting incomplete nodes out into a separate
> function. Removed even more white space. Added a destroy() function.
> [review feedback from Joonas Lahtinen]
>
> Added running totals of 'flying' and 'queued' nodes rather than
> re-calculating each time as a minor CPU performance optimisation.
>
> Removed support for out of order seqno completion. All the prep work
> patch series (seqno to request conversion, late seqno assignment,
> etc.) that has now been done means that the scheduler no longer
> generates out of order seqno completions. Thus all the complex code
> for coping with such is no longer required and can be removed.
>
> Fixed a bug in scheduler bypass mode introduced in the clean up code
> refactoring of v5. The clean up function was seeing the node in the
> wrong state and thus refusing to process it.
>
> For: VIZ-1587
> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> ---
> drivers/gpu/drm/i915/Makefile | 1 +
> drivers/gpu/drm/i915/i915_dma.c | 3 +
> drivers/gpu/drm/i915/i915_drv.h | 6 +
> drivers/gpu/drm/i915/i915_gem.c | 5 +
> drivers/gpu/drm/i915/i915_scheduler.c | 867 ++++++++++++++++++++++++++++++++++
> drivers/gpu/drm/i915/i915_scheduler.h | 113 +++++
> 6 files changed, 995 insertions(+)
> create mode 100644 drivers/gpu/drm/i915/i915_scheduler.c
> create mode 100644 drivers/gpu/drm/i915/i915_scheduler.h
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index e9cdeb5..289fa73 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -10,6 +10,7 @@ ccflags-y := -Werror
> i915-y := i915_drv.o \
> i915_irq.o \
> i915_params.o \
> + i915_scheduler.o \
> i915_suspend.o \
> i915_sysfs.o \
> intel_csr.o \
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index b377753..2ad4071 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -37,6 +37,7 @@
> #include "i915_drv.h"
> #include "i915_vgpu.h"
> #include "i915_trace.h"
> +#include "i915_scheduler.h"
> #include <linux/pci.h>
> #include <linux/console.h>
> #include <linux/vt.h>
> @@ -1448,6 +1449,8 @@ int i915_driver_unload(struct drm_device *dev)
>
> intel_csr_ucode_fini(dev_priv);
>
> + i915_scheduler_destroy(dev_priv);
> +
> /* Free error state after interrupts are fully disabled. */
> cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work);
> i915_destroy_error_state(dev);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 7492ce7..7b62e2c 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1717,6 +1717,8 @@ struct i915_execbuffer_params {
> struct drm_i915_gem_request *request;
> };
>
> +struct i915_scheduler;
> +
> /* used in computing the new watermarks state */
> struct intel_wm_config {
> unsigned int num_pipes_active;
> @@ -1994,6 +1996,8 @@ struct drm_i915_private {
>
> struct i915_runtime_pm pm;
>
> + struct i915_scheduler *scheduler;
> +
> /* Abstract the submission mechanism (legacy ringbuffer or execlists) away */
> struct {
> int (*execbuf_submit)(struct i915_execbuffer_params *params,
> @@ -2335,6 +2339,8 @@ struct drm_i915_gem_request {
> /** process identifier submitting this request */
> struct pid *pid;
>
> + struct i915_scheduler_queue_entry *scheduler_qe;
> +
> /**
> * The ELSP only accepts two elements at a time, so we queue
> * context/tail pairs on a given queue (ring->execlist_queue) until the
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a632276..b7466cb 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -32,6 +32,7 @@
> #include "i915_vgpu.h"
> #include "i915_trace.h"
> #include "intel_drv.h"
> +#include "i915_scheduler.h"
> #include <linux/shmem_fs.h>
> #include <linux/slab.h>
> #include <linux/swap.h>
> @@ -5405,6 +5406,10 @@ int i915_gem_init(struct drm_device *dev)
> */
> intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
>
> + ret = i915_scheduler_init(dev);
> + if (ret)
> + goto out_unlock;
> +
> ret = i915_gem_init_userptr(dev);
> if (ret)
> goto out_unlock;
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> new file mode 100644
> index 0000000..9d628b9
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -0,0 +1,867 @@
> +/*
> + * Copyright (c) 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +
> +#include "i915_drv.h"
> +#include "intel_drv.h"
> +#include "i915_scheduler.h"
> +
> +#define for_each_scheduler_node(node, id) \
> + list_for_each_entry((node), &scheduler->node_queue[(id)], link)
> +
> +#define assert_scheduler_lock_held(scheduler) \
> + do { \
> + WARN_ONCE(!spin_is_locked(&(scheduler)->lock), "Spinlock not locked!"); \
> + } while(0)
> +
> +/**
> + * i915_scheduler_is_enabled - Returns true if the scheduler is enabled.
> + * @dev: DRM device
> + */
> +bool i915_scheduler_is_enabled(struct drm_device *dev)
> +{
> + struct drm_i915_private *dev_priv = to_i915(dev);
> +
> + return dev_priv->scheduler != NULL;
> +}
> +
> +/**
> + * i915_scheduler_init - Initialise the scheduler.
> + * @dev: DRM device
> + * Returns zero on success or -ENOMEM if memory allocations fail.
> + */
> +int i915_scheduler_init(struct drm_device *dev)
> +{
> + struct drm_i915_private *dev_priv = to_i915(dev);
> + struct i915_scheduler *scheduler = dev_priv->scheduler;
> + int e;
> +
> + if (scheduler)
> + return 0;
Probably a GEM_BUG_ON or something.
> +
> + scheduler = kzalloc(sizeof(*scheduler), GFP_KERNEL);
> + if (!scheduler)
> + return -ENOMEM;
> +
> + spin_lock_init(&scheduler->lock);
> +
> + for (e = 0; e < I915_NUM_ENGINES; e++) {
> + INIT_LIST_HEAD(&scheduler->node_queue[e]);
> + scheduler->counts[e].flying = 0;
> + scheduler->counts[e].queued = 0;
> + }
> +
> + /* Default tuning values: */
> + scheduler->priority_level_min = -1023;
> + scheduler->priority_level_max = 1023;
> + scheduler->priority_level_bump = 50;
> + scheduler->priority_level_preempt = 900;
> + scheduler->min_flying = 2;
> +
> + dev_priv->scheduler = scheduler;
> +
> + return 0;
> +}
> +
> +/**
> + * i915_scheduler_destroy - Get rid of the scheduler.
> + * @dev_priv: i915 device private
> + */
> +void i915_scheduler_destroy(struct drm_i915_private *dev_priv)
> +{
> + struct i915_scheduler *scheduler = dev_priv->scheduler;
> + int e;
> +
> + if (!scheduler)
> + return;
> +
> + for (e = 0; e < I915_NUM_ENGINES; e++)
> + WARN(!list_empty(&scheduler->node_queue[e]), "Destroying with list entries on engine %d!", e);
> +
> + kfree(scheduler);
> + dev_priv->scheduler = NULL;
> +}
> +
> +/*
> + * Add a popped node back in to the queue. For example, because the engine
> + * was hung when execfinal() was called and thus the engine submission needs
> + * to be retried later.
> + */
> +static void i915_scheduler_node_requeue(struct i915_scheduler *scheduler,
> + struct i915_scheduler_queue_entry *node)
> +{
> + assert_scheduler_lock_held(scheduler);
> +
> + WARN_ON(!I915_SQS_IS_FLYING(node));
> +
> + /* Seqno will be reassigned on relaunch */
> + node->params.request->seqno = 0;
> + node->status = I915_SQS_QUEUED;
> + scheduler->counts[node->params.engine->id].flying--;
> + scheduler->counts[node->params.engine->id].queued++;
> +}
> +
> +/*
> + * Give up on a node completely. For example, because it is causing the
> + * engine to hang or is using some resource that no longer exists.
> + */
> +static void i915_scheduler_node_kill(struct i915_scheduler *scheduler,
> + struct i915_scheduler_queue_entry *node)
> +{
> + assert_scheduler_lock_held(scheduler);
> +
> + WARN_ON(I915_SQS_IS_COMPLETE(node));
> +
> + if (I915_SQS_IS_FLYING(node))
> + scheduler->counts[node->params.engine->id].flying--;
> + else
> + scheduler->counts[node->params.engine->id].queued--;
> +
> + node->status = I915_SQS_DEAD;
> +}
> +
> +/* Mark a node as in flight on the hardware. */
> +static void i915_scheduler_node_fly(struct i915_scheduler_queue_entry *node)
> +{
> + struct drm_i915_private *dev_priv = to_i915(node->params.dev);
> + struct i915_scheduler *scheduler = dev_priv->scheduler;
> + struct intel_engine_cs *engine = node->params.engine;
> +
> + assert_scheduler_lock_held(scheduler);
> +
> + WARN_ON(node->status != I915_SQS_POPPED);
> +
> + /*
> + * Add the node (which should currently be in state popped) to the
> + * front of the queue. This ensures that flying nodes are always held
> + * in hardware submission order.
> + */
> + list_add(&node->link, &scheduler->node_queue[engine->id]);
> +
> + node->status = I915_SQS_FLYING;
> +
> + scheduler->counts[engine->id].flying++;
> +
> + if (!(scheduler->flags[engine->id] & I915_SF_INTERRUPTS_ENABLED)) {
> + bool success = true;
> +
> + success = engine->irq_get(engine);
> + if (success)
> + scheduler->flags[engine->id] |= I915_SF_INTERRUPTS_ENABLED;
> + }
> +}
> +
> +static inline uint32_t i915_scheduler_count_flying(struct i915_scheduler *scheduler,
> + struct intel_engine_cs *engine)
> +{
> + return scheduler->counts[engine->id].flying;
> +}
Maybe num_flying would be a more obvious name, or at least less
worry-inducing.
> +
> +static void i915_scheduler_priority_bump_clear(struct i915_scheduler *scheduler)
> +{
> + struct i915_scheduler_queue_entry *node;
> + int i;
> +
> + assert_scheduler_lock_held(scheduler);
> +
> + /*
> + * Ensure circular dependencies don't cause problems and that a bump
> + * by object usage only bumps each using buffer once:
> + */
> + for (i = 0; i < I915_NUM_ENGINES; i++) {
> + for_each_scheduler_node(node, i)
> + node->bumped = false;
> + }
> +}
> +
> +static int i915_scheduler_priority_bump(struct i915_scheduler *scheduler,
> + struct i915_scheduler_queue_entry *target,
> + uint32_t bump)
> +{
> + uint32_t new_priority;
> + int i, count;
> +
> + if (target->priority >= scheduler->priority_level_max)
> + return 1;
> +
> + if (target->bumped)
> + return 0;
> +
> + new_priority = target->priority + bump;
> + if ((new_priority <= target->priority) ||
> + (new_priority > scheduler->priority_level_max))
> + target->priority = scheduler->priority_level_max;
> + else
> + target->priority = new_priority;
> +
> + count = 1;
> + target->bumped = true;
> +
> + for (i = 0; i < target->num_deps; i++) {
> + if (!target->dep_list[i])
> + continue;
> +
> + if (target->dep_list[i]->bumped)
> + continue;
> +
> + count += i915_scheduler_priority_bump(scheduler,
> + target->dep_list[i],
> + bump);
> + }
> +
> + return count;
> +}
> +
> +/*
> + * Nodes are considered valid dependencies if they are queued on any engine
> + * or if they are in flight on a different engine. In flight on the same
> + * engine is no longer interesting for non-pre-emptive nodes as the engine
> + * serialises execution. For pre-empting nodes, all in flight dependencies
> + * are valid as they must not be jumped by the act of pre-empting.
> + *
> + * Anything that is neither queued nor flying is uninteresting.
> + */
> +static inline bool i915_scheduler_is_dependency_valid(
> + struct i915_scheduler_queue_entry *node, uint32_t idx)
> +{
> + struct i915_scheduler_queue_entry *dep;
> +
> + dep = node->dep_list[idx];
> + if (!dep)
> + return false;
> +
> + if (I915_SQS_IS_QUEUED(dep))
> + return true;
> +
> + if (I915_SQS_IS_FLYING(dep)) {
> + if (node->params.engine != dep->params.engine)
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static int i915_scheduler_pop_from_queue_locked(struct intel_engine_cs *engine,
> + struct i915_scheduler_queue_entry **pop_node)
> +{
> + struct drm_i915_private *dev_priv = to_i915(engine->dev);
> + struct i915_scheduler *scheduler = dev_priv->scheduler;
> + struct i915_scheduler_queue_entry *best = NULL;
> + struct i915_scheduler_queue_entry *node;
> + int ret;
> + int i;
> + bool any_queued = false;
> + bool has_local, has_remote, only_remote = false;
> +
> + assert_scheduler_lock_held(scheduler);
> +
> + *pop_node = NULL;
Looks like it's not needed since there is *pop_node = best below.
> + ret = -ENODATA;
> +
> + for_each_scheduler_node(node, engine->id) {
> + if (!I915_SQS_IS_QUEUED(node))
> + continue;
> + any_queued = true;
> +
> + has_local = false;
> + has_remote = false;
> + for (i = 0; i < node->num_deps; i++) {
> + if (!i915_scheduler_is_dependency_valid(node, i))
> + continue;
> +
> + if (node->dep_list[i]->params.engine == node->params.engine)
> + has_local = true;
> + else
> + has_remote = true;
Would it be worth breaking early from this loop if has_local && has_remote?
> + }
> +
> + if (has_remote && !has_local)
> + only_remote = true;
> +
> + if (!has_local && !has_remote) {
> + if (!best ||
> + (node->priority > best->priority))
> + best = node;
> + }
> + }
> +
> + if (best) {
> + list_del(&best->link);
> +
> + INIT_LIST_HEAD(&best->link);
list_del_init can replace the two lines above.
> + best->status = I915_SQS_POPPED;
> +
> + scheduler->counts[engine->id].queued--;
> +
> + ret = 0;
> + } else {
> + /* Can only get here if:
> + * (a) there are no buffers in the queue
> + * (b) all queued buffers are dependent on other buffers
> + * e.g. on a buffer that is in flight on a different engine
> + */
> + if (only_remote) {
> + /* The only dependent buffers are on another engine. */
> + ret = -EAGAIN;
> + } else if (any_queued) {
> + /* It seems that something has gone horribly wrong! */
> + WARN_ONCE(true, "Broken dependency tracking on engine %d!\n",
> + (int) engine->id);
> + }
> + }
> +
> + *pop_node = best;
> + return ret;
> +}
> +
> +/*
> + * NB: The driver mutex lock must be held before calling this function. It is
> + * only really required during the actual back end submission call. However,
> + * attempting to acquire a mutex while holding a spin lock is a Bad Idea.
> + * And releasing the one before acquiring the other leads to other code
> + * being run and interfering.
> + */
> +static int i915_scheduler_submit(struct intel_engine_cs *engine)
> +{
> + struct drm_i915_private *dev_priv = to_i915(engine->dev);
> + struct i915_scheduler *scheduler = dev_priv->scheduler;
> + struct i915_scheduler_queue_entry *node;
> + int ret, count = 0, flying;
> +
> + WARN_ON(!mutex_is_locked(&engine->dev->struct_mutex));
> +
> + spin_lock_irq(&scheduler->lock);
> +
> + WARN_ON(scheduler->flags[engine->id] & I915_SF_SUBMITTING);
> + scheduler->flags[engine->id] |= I915_SF_SUBMITTING;
> +
> + /* First time around, complain if anything unexpected occurs: */
> + ret = i915_scheduler_pop_from_queue_locked(engine, &node);
> + if (ret)
> + goto error;
> +
> + do {
> + WARN_ON(node->params.engine != engine);
> + WARN_ON(node->status != I915_SQS_POPPED);
> + count++;
> +
> + /*
> + * The call to pop above will have removed the node from the
> + * list. So add it back in and mark it as in flight.
> + */
> + i915_scheduler_node_fly(node);
> +
> + spin_unlock_irq(&scheduler->lock);
> + ret = dev_priv->gt.execbuf_final(&node->params);
> + spin_lock_irq(&scheduler->lock);
> +
> + /*
> + * Handle failed submission but first check that the
> + * watchdog/reset code has not nuked the node while we
> + * weren't looking:
> + */
> + if (ret && (node->status != I915_SQS_DEAD)) {
> + bool requeue = true;
> +
> + /*
> + * Oh dear! Either the node is broken or the engine is
> + * busy. So need to kill the node or requeue it and try
> + * again later as appropriate.
> + */
> +
> + switch (-ret) {
> + case ENODEV:
> + case ENOENT:
> + /* Fatal errors. Kill the node. */
> + requeue = false;
> + i915_scheduler_node_kill(scheduler, node);
> + break;
> +
> + case EAGAIN:
> + case EBUSY:
> + case EIO:
> + case ENOMEM:
> + case ERESTARTSYS:
> + case EINTR:
> + /* Supposedly recoverable errors. */
> + break;
> +
> + default:
> + /*
> + * Assume the error is recoverable and hope
> + * for the best.
> + */
> + MISSING_CASE(-ret);
> + break;
> + }
> +
> + if (requeue) {
> + i915_scheduler_node_requeue(scheduler, node);
> + /*
> + * No point spinning if the engine is currently
> + * unavailable so just give up and come back
> + * later.
> + */
> + break;
> + }
> + }
> +
> + /* Keep launching until the sky is sufficiently full. */
> + flying = i915_scheduler_count_flying(scheduler, engine);
> + if (flying >= scheduler->min_flying)
> + break;
> +
> + /* Grab another node and go round again... */
> + ret = i915_scheduler_pop_from_queue_locked(engine, &node);
> + } while (ret == 0);
> +
> + /* Don't complain about not being able to submit extra entries */
> + if (ret == -ENODATA)
> + ret = 0;
> +
> + /*
> + * Bump the priority of everything that was not submitted to prevent
> + * starvation of low priority tasks by a spamming high priority task.
> + */
> + i915_scheduler_priority_bump_clear(scheduler);
> + for_each_scheduler_node(node, engine->id) {
> + if (!I915_SQS_IS_QUEUED(node))
> + continue;
> +
> + i915_scheduler_priority_bump(scheduler, node,
> + scheduler->priority_level_bump);
> + }
bump_clear will iterate the queues for all engines and then you iterate the
current engine's queue again. Would this be equivalent:
	for (i = 0; i < I915_NUM_ENGINES; i++) {
		for_each_scheduler_node(node, i) {
			node->bumped = false;
			if (i == engine->id && I915_SQS_IS_QUEUED(node))
				i915_scheduler_priority_bump(scheduler, node,
							     scheduler->priority_level_bump);
		}
	}
Advantage is one loop fewer.
> +
> + /* On success, return the number of buffers submitted. */
> + if (ret == 0)
> + ret = count;
> +
> +error:
> + scheduler->flags[engine->id] &= ~I915_SF_SUBMITTING;
> + spin_unlock_irq(&scheduler->lock);
> + return ret;
> +}
> +
> +static void i915_generate_dependencies(struct i915_scheduler *scheduler,
> + struct i915_scheduler_queue_entry *node,
> + uint32_t engine)
> +{
> + struct i915_scheduler_obj_entry *this, *that;
> + struct i915_scheduler_queue_entry *test;
> + int i, j;
> + bool found;
> +
> + for_each_scheduler_node(test, engine) {
> + if (I915_SQS_IS_COMPLETE(test))
> + continue;
> +
> + /*
> + * Batches on the same engine for the same
> + * context must be kept in order.
> + */
> + found = (node->params.ctx == test->params.ctx) &&
> + (node->params.engine == test->params.engine);
> +
> + /*
> + * Batches working on the same objects must
> + * be kept in order.
> + */
> + for (i = 0; (i < node->num_objs) && !found; i++) {
> + this = node->objs + i;
> +
> + for (j = 0; j < test->num_objs; j++) {
> + that = test->objs + j;
> +
> + if (this->obj != that->obj)
> + continue;
> +
> + /* Only need to worry about writes */
> + if (this->read_only && that->read_only)
> + continue;
> +
> + found = true;
> + break;
> + }
> + }
> +
> + if (found) {
> + node->dep_list[node->num_deps] = test;
> + node->num_deps++;
> + }
> + }
> +}
> +
> +static int i915_scheduler_queue_execbuffer_bypass(struct i915_scheduler_queue_entry *qe)
> +{
> + struct drm_i915_private *dev_priv = to_i915(qe->params.dev);
> + struct i915_scheduler *scheduler = dev_priv->scheduler;
> + int ret;
> +
> + scheduler->flags[qe->params.engine->id] |= I915_SF_SUBMITTING;
> + ret = dev_priv->gt.execbuf_final(&qe->params);
> + scheduler->flags[qe->params.engine->id] &= ~I915_SF_SUBMITTING;
> +
> + /*
> + * Don't do any clean up on failure because the caller will
> + * do it all anyway.
> + */
> + if (ret)
> + return ret;
> +
> + /* Need to release any resources held by the node: */
> + qe->status = I915_SQS_COMPLETE;
> + i915_scheduler_clean_node(qe);
> +
> + return 0;
> +}
> +
> +static inline uint32_t i915_scheduler_count_incomplete(struct i915_scheduler *scheduler)
> +{
> + int e, incomplete = 0;
> +
> + for (e = 0; e < I915_NUM_ENGINES; e++)
> + incomplete += scheduler->counts[e].queued + scheduler->counts[e].flying;
> +
> + return incomplete;
> +}
> +
> +/**
> + * i915_scheduler_queue_execbuffer - Submit a batch buffer request to the
> + * scheduler.
> + * @qe: The batch buffer request to be queued.
> + * The expectation is the qe passed in is a local stack variable. This
> + * function will copy its contents into a freshly allocated list node. The
> + * new node takes ownership of said contents so the original qe should simply
> + * be discarded and not cleaned up (i.e. don't free memory it points to or
> + * dereference objects it holds). The node is added to the scheduler's queue
> + * and the batch buffer will be submitted to the hardware at some future
> + * point in time (which may be immediately, before returning or may be quite
> + * a lot later).
> + */
> +int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe)
> +{
> + struct drm_i915_private *dev_priv = to_i915(qe->params.dev);
> + struct i915_scheduler *scheduler = dev_priv->scheduler;
> + struct intel_engine_cs *engine = qe->params.engine;
> + struct i915_scheduler_queue_entry *node;
> + bool not_flying;
> + int i, e;
> + int incomplete;
> +
> + /* Bypass the scheduler and send the buffer immediately? */
> + if (1/*!i915.enable_scheduler*/)
> + return i915_scheduler_queue_execbuffer_bypass(qe);
> +
> + node = kmalloc(sizeof(*node), GFP_KERNEL);
> + if (!node)
> + return -ENOMEM;
> +
> + *node = *qe;
> + INIT_LIST_HEAD(&node->link);
> + node->status = I915_SQS_QUEUED;
> + node->stamp = jiffies;
> + i915_gem_request_reference(node->params.request);
> +
> + WARN_ON(node->params.request->scheduler_qe);
> + node->params.request->scheduler_qe = node;
> +
> + /*
> + * Need to determine the number of incomplete entries in the list as
> + * that will be the maximum size of the dependency list.
> + *
> + * Note that the allocation must not be made with the spinlock acquired
> + * as kmalloc can sleep. However, the unlock/relock is safe because no
> + * new entries can be queued up during the unlock as the i915 driver
> + * mutex is still held. Entries could be removed from the list but that
> + * just means the dep_list will be over-allocated which is fine.
> + */
> + spin_lock_irq(&scheduler->lock);
> + incomplete = i915_scheduler_count_incomplete(scheduler);
> +
> + /* Temporarily unlock to allocate memory: */
> + spin_unlock_irq(&scheduler->lock);
> + if (incomplete) {
> + node->dep_list = kmalloc_array(incomplete,
> + sizeof(*node->dep_list),
> + GFP_KERNEL);
> + if (!node->dep_list) {
> + kfree(node);
> + return -ENOMEM;
> + }
> + } else
> + node->dep_list = NULL;
> +
> + spin_lock_irq(&scheduler->lock);
> + node->num_deps = 0;
> +
> + if (node->dep_list) {
> + for (e = 0; e < I915_NUM_ENGINES; e++)
> + i915_generate_dependencies(scheduler, node, e);
> +
> + WARN_ON(node->num_deps > incomplete);
> + }
> +
> + node->priority = clamp(node->priority,
> + scheduler->priority_level_min,
> + scheduler->priority_level_max);
> +
> + if ((node->priority > 0) && node->num_deps) {
> + i915_scheduler_priority_bump_clear(scheduler);
> +
> + for (i = 0; i < node->num_deps; i++)
> + i915_scheduler_priority_bump(scheduler,
> + node->dep_list[i], node->priority);
> + }
> +
> + list_add_tail(&node->link, &scheduler->node_queue[engine->id]);
> +
> + not_flying = i915_scheduler_count_flying(scheduler, engine) <
> + scheduler->min_flying;
> +
> + scheduler->counts[engine->id].queued++;
> +
> + spin_unlock_irq(&scheduler->lock);
> +
> + if (not_flying)
> + i915_scheduler_submit(engine);
> +
> + return 0;
> +}
> +
> +/**
> + * i915_scheduler_notify_request - Notify the scheduler that the given
> + * request has completed on the hardware.
> + * @req: Request structure which has completed
> + * A sequence number has popped out of the hardware and the request handling
> + * code has mapped it back to a request and will mark that request complete.
> + * It also calls this function to notify the scheduler about the completion
> + * so the scheduler's node can be updated appropriately.
> + * Returns true if the request is scheduler managed, false if not. The return
> + * value is combined for all freshly completed requests and if any were true
> + * then i915_scheduler_wakeup() is called so the scheduler can do further
> + * processing (submit more work) at the end.
> + */
> +bool i915_scheduler_notify_request(struct drm_i915_gem_request *req)
> +{
> + struct drm_i915_private *dev_priv = to_i915(req->engine->dev);
> + struct i915_scheduler *scheduler = dev_priv->scheduler;
> + struct i915_scheduler_queue_entry *node = req->scheduler_qe;
> + unsigned long flags;
> +
> + if (!node)
> + return false;
> +
> + spin_lock_irqsave(&scheduler->lock, flags);
> +
> + WARN_ON(!I915_SQS_IS_FLYING(node));
> +
> + /* Node was in flight so mark it as complete. */
> + if (req->cancelled)
> + node->status = I915_SQS_DEAD;
> + else
> + node->status = I915_SQS_COMPLETE;
> +
> + scheduler->counts[req->engine->id].flying--;
> +
> + spin_unlock_irqrestore(&scheduler->lock, flags);
> +
> + return true;
> +}
> +
> +static int i915_scheduler_remove_dependent(struct i915_scheduler *scheduler,
> + struct i915_scheduler_queue_entry *remove)
> +{
> + struct i915_scheduler_queue_entry *node;
> + int i, r;
> + int count = 0;
> +
> + /*
> + * Ensure that a node is not being removed which is still dependent
> + * upon other (not completed) work. If that happens, it implies
> + * something has gone very wrong with the dependency tracking! Note
> + * that there is no need to worry if this node has been explicitly
> + * killed for some reason - it might be being killed before it got
> + * sent to the hardware.
> + */
> + if (remove->status != I915_SQS_DEAD) {
> + for (i = 0; i < remove->num_deps; i++)
> + if ((remove->dep_list[i]) &&
> + (!I915_SQS_IS_COMPLETE(remove->dep_list[i])))
> + count++;
> + WARN_ON(count);
> + }
> +
> + /*
> + * Remove this node from the dependency lists of any other node which
> + * might be waiting on it.
> + */
> + for (r = 0; r < I915_NUM_ENGINES; r++) {
> + for_each_scheduler_node(node, r) {
> + for (i = 0; i < node->num_deps; i++) {
> + if (node->dep_list[i] != remove)
> + continue;
> +
> + node->dep_list[i] = NULL;
Can the same node be listed in another node's dependency list multiple
times? If not, you could break after clearing it and not iterate the rest
of the list.
> + }
> + }
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * i915_scheduler_wakeup - wake the scheduler's worker thread
> + * @dev: DRM device
> + * Called at the end of seqno interrupt processing if any request has
> + * completed that corresponds to a scheduler node.
> + */
> +void i915_scheduler_wakeup(struct drm_device *dev)
> +{
> + /* XXX: Need to call i915_scheduler_remove() via work handler. */
> +}
> +
> +/**
> + * i915_scheduler_clean_node - free up any allocations/references
> + * associated with the given scheduler queue entry.
> + * @node: Queue entry structure which is complete
> + * After a given batch buffer completes on the hardware, all the information
> + * required to resubmit it is no longer required. However, the node entry
> + * itself might still be required for tracking purposes for a while longer.
> + * This function should be called as soon as the node is known to be complete
> + * so that these resources may be freed even though the node itself might
> + * hang around.
> + */
> +void i915_scheduler_clean_node(struct i915_scheduler_queue_entry *node)
> +{
> + if (!I915_SQS_IS_COMPLETE(node)) {
> + WARN(!node->params.request->cancelled,
> + "Cleaning active node: %d!\n", node->status);
> + return;
> + }
> +
> + if (node->params.batch_obj) {
> + /*
> + * The batch buffer must be unpinned before it is unreferenced
> + * otherwise the unpin fails with a missing vma!?
> + */
> + if (node->params.dispatch_flags & I915_DISPATCH_SECURE)
> + i915_gem_execbuf_release_batch_obj(node->params.batch_obj);
> +
> + node->params.batch_obj = NULL;
> + }
> +
> + /* And anything else owned by the node: */
> + if (node->params.cliprects) {
> + kfree(node->params.cliprects);
> + node->params.cliprects = NULL;
> + }
> +}
> +
> +static bool i915_scheduler_remove(struct i915_scheduler *scheduler,
> + struct intel_engine_cs *engine,
> + struct list_head *remove)
> +{
> + struct i915_scheduler_queue_entry *node, *node_next;
> + bool do_submit;
> +
> + spin_lock_irq(&scheduler->lock);
> +
> + INIT_LIST_HEAD(remove);
> + list_for_each_entry_safe(node, node_next, &scheduler->node_queue[engine->id], link) {
> + if (!I915_SQS_IS_COMPLETE(node))
> + break;
> +
> + list_del(&node->link);
> + list_add(&node->link, remove);
list_move
> +
> + /* Strip the dependency info while the mutex is still locked */
> + i915_scheduler_remove_dependent(scheduler, node);
> +
> + continue;
Could kill the continue. :)
> + }
> +
> + /*
> + * Release the interrupt reference count if there are no longer any
> + * nodes to worry about.
> + */
> + if (list_empty(&scheduler->node_queue[engine->id]) &&
> + (scheduler->flags[engine->id] & I915_SF_INTERRUPTS_ENABLED)) {
> + engine->irq_put(engine);
> + scheduler->flags[engine->id] &= ~I915_SF_INTERRUPTS_ENABLED;
> + }
> +
> + /* Launch more packets now? */
> + do_submit = (scheduler->counts[engine->id].queued > 0) &&
> + (scheduler->counts[engine->id].flying < scheduler->min_flying);
> +
> + spin_unlock_irq(&scheduler->lock);
> +
> + return do_submit;
> +}
> +
> +void i915_scheduler_process_work(struct intel_engine_cs *engine)
> +{
> + struct drm_i915_private *dev_priv = to_i915(engine->dev);
> + struct i915_scheduler *scheduler = dev_priv->scheduler;
> + struct i915_scheduler_queue_entry *node;
> + bool do_submit;
> + struct list_head remove;
Could do LIST_HEAD(remove); to declare an empty list and then you wouldn't
have to do INIT_LIST_HEAD in i915_scheduler_remove. Would probably be
better to group the logic together like that.
> +
> + if (list_empty(&scheduler->node_queue[engine->id]))
> + return;
> +
> + /* Remove completed nodes. */
> + do_submit = i915_scheduler_remove(scheduler, engine, &remove);
> +
> + if (!do_submit && list_empty(&remove))
> + return;
> +
> + /* Need to grab the pm lock outside of the mutex lock */
> + if (do_submit)
> + intel_runtime_pm_get(dev_priv);
> +
> + mutex_lock(&engine->dev->struct_mutex);
> +
> + if (do_submit)
> + i915_scheduler_submit(engine);
> +
> + while (!list_empty(&remove)) {
> + node = list_first_entry(&remove, typeof(*node), link);
> + list_del(&node->link);
> +
> + /* Free up all the DRM references */
> + i915_scheduler_clean_node(node);
> +
> + /* And anything else owned by the node: */
> + node->params.request->scheduler_qe = NULL;
> + i915_gem_request_unreference(node->params.request);
> + kfree(node->dep_list);
> + kfree(node);
> + }
> +
> + mutex_unlock(&engine->dev->struct_mutex);
> +
> + if (do_submit)
> + intel_runtime_pm_put(dev_priv);
> +}
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> new file mode 100644
> index 0000000..c895c4c
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -0,0 +1,113 @@
> +/*
> + * Copyright (c) 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +
> +#ifndef _I915_SCHEDULER_H_
> +#define _I915_SCHEDULER_H_
> +
> +enum i915_scheduler_queue_status {
> + /* Limbo: */
> + I915_SQS_NONE = 0,
> + /* Not yet submitted to hardware: */
> + I915_SQS_QUEUED,
> + /* Popped from queue, ready to fly: */
> + I915_SQS_POPPED,
> + /* Sent to hardware for processing: */
> + I915_SQS_FLYING,
> + /* Finished processing on the hardware: */
> + I915_SQS_COMPLETE,
> + /* Killed by watchdog or catastrophic submission failure: */
> + I915_SQS_DEAD,
> + /* Limit value for use with arrays/loops */
> + I915_SQS_MAX
> +};
> +
> +#define I915_SQS_IS_QUEUED(node) (((node)->status == I915_SQS_QUEUED))
> +#define I915_SQS_IS_FLYING(node) (((node)->status == I915_SQS_FLYING))
> +#define I915_SQS_IS_COMPLETE(node) (((node)->status == I915_SQS_COMPLETE) || \
> + ((node)->status == I915_SQS_DEAD))
> +
> +struct i915_scheduler_obj_entry {
> + struct drm_i915_gem_object *obj;
> + bool read_only;
> +};
> +
> +struct i915_scheduler_queue_entry {
> + /* Any information required to submit this batch buffer to the hardware */
> + struct i915_execbuffer_params params;
> +
> + /* -1023 = lowest priority, 0 = default, 1023 = highest */
> + int32_t priority;
> + bool bumped;
> +
> + /* Objects referenced by this batch buffer */
> + struct i915_scheduler_obj_entry *objs;
> + int num_objs;
> +
> + /* Batch buffers this one is dependent upon */
> + struct i915_scheduler_queue_entry **dep_list;
> + int num_deps;
> +
> + enum i915_scheduler_queue_status status;
> + unsigned long stamp;
> +
> + /* List of all scheduler queue entry nodes */
> + struct list_head link;
> +};
> +
> +struct i915_scheduler_node_states {
> + uint32_t flying;
> + uint32_t queued;
> +};
> +
> +struct i915_scheduler {
> + struct list_head node_queue[I915_NUM_ENGINES];
> + uint32_t flags[I915_NUM_ENGINES];
> + spinlock_t lock;
> +
> + /* Node counts: */
> + struct i915_scheduler_node_states counts[I915_NUM_ENGINES];
> +
> + /* Tuning parameters: */
> + int32_t priority_level_min;
> + int32_t priority_level_max;
> + int32_t priority_level_bump;
> + int32_t priority_level_preempt;
> + uint32_t min_flying;
> +};
> +
> +/* Flag bits for i915_scheduler::flags */
> +enum {
> + I915_SF_INTERRUPTS_ENABLED = (1 << 0),
> + I915_SF_SUBMITTING = (1 << 1),
> +};
> +
> +bool i915_scheduler_is_enabled(struct drm_device *dev);
> +int i915_scheduler_init(struct drm_device *dev);
> +void i915_scheduler_destroy(struct drm_i915_private *dev_priv);
> +void i915_scheduler_clean_node(struct i915_scheduler_queue_entry *node);
> +int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe);
> +bool i915_scheduler_notify_request(struct drm_i915_gem_request *req);
> +void i915_scheduler_wakeup(struct drm_device *dev);
> +
> +#endif /* _I915_SCHEDULER_H_ */
>
Regards,
Tvrtko