[Intel-gfx] [PATCH 4/5] drm/i915: Park the breadcrumbs signaler across a GPU reset
Mika Kuoppala
mika.kuoppala at linux.intel.com
Mon Feb 13 10:03:02 UTC 2017
Chris Wilson <chris at chris-wilson.co.uk> writes:
> The signal threads may be running concurrently with the GPU reset. The
> completion from the GPU run asynchronous with the reset and two threads
> may see different snapshots of the state, and the signaler may mark a
> request as complete as we try to reset it. We don't tolerate 2 different
> views of the same state and complain if we try to mark a request as
> failed if it is already complete. Disable the signal threads during
> reset to prevent this conflict (even though the conflict implies that
> the state we resetting to is invalid, we have already made our
> decision!).
>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=99671
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala at intel.com>
> Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 16 +++++++++++++++-
> drivers/gpu/drm/i915/intel_breadcrumbs.c | 3 +++
> 2 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 3c2c2296c1dc..730804ce9610 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -35,6 +35,7 @@
> #include "intel_frontbuffer.h"
> #include "intel_mocs.h"
> #include <linux/dma-fence-array.h>
> +#include <linux/kthread.h>
> #include <linux/reservation.h>
> #include <linux/shmem_fs.h>
> #include <linux/slab.h>
> @@ -2643,6 +2644,17 @@ int i915_gem_reset_prepare(struct drm_i915_private *dev_priv)
> for_each_engine(engine, dev_priv, id) {
> struct drm_i915_gem_request *request;
>
> + /* Prevent the signaler thread from updating the request
> + * state (by calling dma_fence_signal) as we are processing
> + * the reset. The write from the GPU of the seqno is
> + * asynchronous and the signaler thread may see a different
> + * value to us and declare the request complete, even though
> + * the reset routine have picked that request as the active
> + * (incomplete) request. This conflict is not handled
> + * gracefully!
> + */
> + kthread_park(engine->breadcrumbs.signaler);
> +
> /* Prevent request submission to the hardware until we have
> * completed the reset in i915_gem_reset_finish(). If a request
> * is completed by one engine, it may then queue a request
> @@ -2796,8 +2808,10 @@ void i915_gem_reset_finish(struct drm_i915_private *dev_priv)
>
> lockdep_assert_held(&dev_priv->drm.struct_mutex);
>
> - for_each_engine(engine, dev_priv, id)
> + for_each_engine(engine, dev_priv, id) {
> tasklet_enable(&engine->irq_tasklet);
> + kthread_unpark(engine->breadcrumbs.signaler);
> + }
> }
>
> static void nop_submit_request(struct drm_i915_gem_request *request)
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index 9fd002bcebb6..f5e05343110a 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -490,6 +490,9 @@ static int intel_breadcrumbs_signaler(void *arg)
> break;
>
> schedule();
> +
> + if (kthread_should_park())
> + kthread_parkme();
> }
> } while (1);
> __set_current_state(TASK_RUNNING);
> --
> 2.11.0
More information about the Intel-gfx
mailing list