[Intel-gfx] [PATCH] drm/i915: Force CPU synchronisation even if userspace requests ASYNC

Jason Ekstrand jason at jlekstrand.net
Tue Jul 11 00:01:20 UTC 2017


On Fri, Jul 7, 2017 at 10:17 AM, Chris Wilson <chris at chris-wilson.co.uk>
wrote:

> The goal here was to minimise doing any thing or any check inside the
> kernel that was not strictly required. For a userspace that assumes
> complete control over the cache domains, the kernel is usually using
> outdated information and may trigger clflushes where none were
> required.
>
> However, swapping is a situation where userspace has no knowledge of the
> domain transfer, and will leave the object in the CPU cache. The kernel
> must flush this out to the backing storage prior to use with the GPU. As
> we use an asynchronous task tracked by an implicit fence for this, we
> also need to cancel the ASYNC flag on the object so that the object will
> wait for the clflush to complete before being executed. This also absolves
> userspace of the responsibility imposed by commit 77ae9957897d ("drm/i915:
> Enable userspace to opt-out of implicit fencing") that its needed to ensure
> that the object was out of the CPU cache prior to use on the GPU.
>

Given that domain tracking is global, we can also run into interesting
issues if process A does a CPU map, writes to it, and then hands it to
process B which uses it from the GPU.  Without being very aggressive about
set_domain, we have no knowledge that this is a problem.


> Fixes: 77ae9957897d ("drm/i915: Enable userspace to opt-out of implicit
> fencing")
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101571
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Cc: Jason Ekstrand <jason at jlekstrand.net>
>

Chris and I discussed this for a while on Friday and I think I understand
the problem and I think this is the correct fix.

Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

That said, I don't think I'm really qualified to review as I don't
understand all of the details.

--Jason


> ---
>  drivers/gpu/drm/i915/i915_gem_clflush.c    |  7 ++++---
>  drivers/gpu/drm/i915/i915_gem_clflush.h    |  2 +-
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 10 ++++++----
>  3 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_clflush.c
> b/drivers/gpu/drm/i915/i915_gem_clflush.c
> index 152f16c11878..348b29a845c9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_clflush.c
> +++ b/drivers/gpu/drm/i915/i915_gem_clflush.c
> @@ -114,7 +114,7 @@ i915_clflush_notify(struct i915_sw_fence *fence,
>         return NOTIFY_DONE;
>  }
>
> -void i915_gem_clflush_object(struct drm_i915_gem_object *obj,
> +bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
>                              unsigned int flags)
>  {
>         struct clflush *clflush;
> @@ -128,7 +128,7 @@ void i915_gem_clflush_object(struct
> drm_i915_gem_object *obj,
>          */
>         if (!i915_gem_object_has_struct_page(obj)) {
>                 obj->cache_dirty = false;
> -               return;
> +               return false;
>         }
>
>         /* If the GPU is snooping the contents of the CPU cache,
> @@ -140,7 +140,7 @@ void i915_gem_clflush_object(struct
> drm_i915_gem_object *obj,
>          * tracking.
>          */
>         if (!(flags & I915_CLFLUSH_FORCE) && obj->cache_coherent)
> -               return;
> +               return false;
>
>         trace_i915_gem_object_clflush(obj);
>
> @@ -179,4 +179,5 @@ void i915_gem_clflush_object(struct
> drm_i915_gem_object *obj,
>         }
>
>         obj->cache_dirty = false;
> +       return true;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_gem_clflush.h
> b/drivers/gpu/drm/i915/i915_gem_clflush.h
> index 2455a7820937..f390247561b3 100644
> --- a/drivers/gpu/drm/i915/i915_gem_clflush.h
> +++ b/drivers/gpu/drm/i915/i915_gem_clflush.h
> @@ -28,7 +28,7 @@
>  struct drm_i915_private;
>  struct drm_i915_gem_object;
>
> -void i915_gem_clflush_object(struct drm_i915_gem_object *obj,
> +bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
>                              unsigned int flags);
>  #define I915_CLFLUSH_FORCE BIT(0)
>  #define I915_CLFLUSH_SYNC BIT(1)
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 2c76d6980855..aeacfe9a0878 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -1826,7 +1826,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
>         int err;
>
>         for (i = 0; i < count; i++) {
> -               const struct drm_i915_gem_exec_object2 *entry =
> &eb->exec[i];
> +               struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
>                 struct i915_vma *vma = exec_to_vma(entry);
>                 struct drm_i915_gem_object *obj = vma->obj;
>
> @@ -1842,12 +1842,14 @@ static int eb_move_to_gpu(struct i915_execbuffer
> *eb)
>                         eb->request->capture_list = capture;
>                 }
>
> +               if (unlikely(obj->cache_dirty && !obj->cache_coherent)) {
> +                       if (i915_gem_clflush_object(obj, 0))
> +                               entry->flags &= ~EXEC_OBJECT_ASYNC;
> +               }
> +
>                 if (entry->flags & EXEC_OBJECT_ASYNC)
>                         goto skip_flushes;
>
> -               if (unlikely(obj->cache_dirty && !obj->cache_coherent))
> -                       i915_gem_clflush_object(obj, 0);
> -
>                 err = i915_gem_request_await_object
>                         (eb->request, obj, entry->flags &
> EXEC_OBJECT_WRITE);
>                 if (err)
> --
> 2.13.2
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20170710/e07262cc/attachment.html>


More information about the Intel-gfx mailing list