[Intel-gfx] [PATCH] drm/i915: Force CPU synchronisation even if userspace requests ASYNC
Jason Ekstrand
jason at jlekstrand.net
Tue Jul 11 00:01:20 UTC 2017
On Fri, Jul 7, 2017 at 10:17 AM, Chris Wilson <chris at chris-wilson.co.uk>
wrote:
> The goal here was to minimise doing any thing or any check inside the
> kernel that was not strictly required. For a userspace that assumes
> complete control over the cache domains, the kernel is usually using
> outdated information and may trigger clflushes where none were
> required.
>
> However, swapping is a situation where userspace has no knowledge of the
> domain transfer, and will leave the object in the CPU cache. The kernel
> must flush this out to the backing storage prior to use with the GPU. As
> we use an asynchronous task tracked by an implicit fence for this, we
> also need to cancel the ASYNC flag on the object so that the object will
> wait for the clflush to complete before being executed. This also absolves
> userspace of the responsibility imposed by commit 77ae9957897d ("drm/i915:
> Enable userspace to opt-out of implicit fencing") that its needed to ensure
> that the object was out of the CPU cache prior to use on the GPU.
>
Given that domain tracking is global, we can also run into interesting
issues if process A does a CPU map, writes to it, and then hands it to
process B which uses it from the GPU. Without being very aggressive about
set_domain, we have no knowledge that this is a problem.
> Fixes: 77ae9957897d ("drm/i915: Enable userspace to opt-out of implicit
> fencing")
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101571
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Cc: Jason Ekstrand <jason at jlekstrand.net>
>
Chris and I discussed this for a while on Friday and I think I understand
the problem and I think this is the correct fix.
Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
That said, I don't think I'm really qualified to review as I don't
understand all of the details.
--Jason
> ---
> drivers/gpu/drm/i915/i915_gem_clflush.c | 7 ++++---
> drivers/gpu/drm/i915/i915_gem_clflush.h | 2 +-
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 10 ++++++----
> 3 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_clflush.c
> b/drivers/gpu/drm/i915/i915_gem_clflush.c
> index 152f16c11878..348b29a845c9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_clflush.c
> +++ b/drivers/gpu/drm/i915/i915_gem_clflush.c
> @@ -114,7 +114,7 @@ i915_clflush_notify(struct i915_sw_fence *fence,
> return NOTIFY_DONE;
> }
>
> -void i915_gem_clflush_object(struct drm_i915_gem_object *obj,
> +bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
> unsigned int flags)
> {
> struct clflush *clflush;
> @@ -128,7 +128,7 @@ void i915_gem_clflush_object(struct
> drm_i915_gem_object *obj,
> */
> if (!i915_gem_object_has_struct_page(obj)) {
> obj->cache_dirty = false;
> - return;
> + return false;
> }
>
> /* If the GPU is snooping the contents of the CPU cache,
> @@ -140,7 +140,7 @@ void i915_gem_clflush_object(struct
> drm_i915_gem_object *obj,
> * tracking.
> */
> if (!(flags & I915_CLFLUSH_FORCE) && obj->cache_coherent)
> - return;
> + return false;
>
> trace_i915_gem_object_clflush(obj);
>
> @@ -179,4 +179,5 @@ void i915_gem_clflush_object(struct
> drm_i915_gem_object *obj,
> }
>
> obj->cache_dirty = false;
> + return true;
> }
> diff --git a/drivers/gpu/drm/i915/i915_gem_clflush.h
> b/drivers/gpu/drm/i915/i915_gem_clflush.h
> index 2455a7820937..f390247561b3 100644
> --- a/drivers/gpu/drm/i915/i915_gem_clflush.h
> +++ b/drivers/gpu/drm/i915/i915_gem_clflush.h
> @@ -28,7 +28,7 @@
> struct drm_i915_private;
> struct drm_i915_gem_object;
>
> -void i915_gem_clflush_object(struct drm_i915_gem_object *obj,
> +bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
> unsigned int flags);
> #define I915_CLFLUSH_FORCE BIT(0)
> #define I915_CLFLUSH_SYNC BIT(1)
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 2c76d6980855..aeacfe9a0878 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -1826,7 +1826,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
> int err;
>
> for (i = 0; i < count; i++) {
> - const struct drm_i915_gem_exec_object2 *entry =
> &eb->exec[i];
> + struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
> struct i915_vma *vma = exec_to_vma(entry);
> struct drm_i915_gem_object *obj = vma->obj;
>
> @@ -1842,12 +1842,14 @@ static int eb_move_to_gpu(struct i915_execbuffer
> *eb)
> eb->request->capture_list = capture;
> }
>
> + if (unlikely(obj->cache_dirty && !obj->cache_coherent)) {
> + if (i915_gem_clflush_object(obj, 0))
> + entry->flags &= ~EXEC_OBJECT_ASYNC;
> + }
> +
> if (entry->flags & EXEC_OBJECT_ASYNC)
> goto skip_flushes;
>
> - if (unlikely(obj->cache_dirty && !obj->cache_coherent))
> - i915_gem_clflush_object(obj, 0);
> -
> err = i915_gem_request_await_object
> (eb->request, obj, entry->flags &
> EXEC_OBJECT_WRITE);
> if (err)
> --
> 2.13.2
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20170710/e07262cc/attachment.html>
More information about the Intel-gfx
mailing list