<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Jul 7, 2017 at 10:17 AM, Chris Wilson <span dir="ltr"><<a href="mailto:chris@chris-wilson.co.uk" target="_blank">chris@chris-wilson.co.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The goal here was to minimise doing any thing or any check inside the<br>
kernel that was not strictly required. For a userspace that assumes<br>
complete control over the cache domains, the kernel is usually using<br>
outdated information and may trigger clflushes where none were<br>
required.<br>
<br>
However, swapping is a situation where userspace has no knowledge of the<br>
domain transfer, and will leave the object in the CPU cache. The kernel<br>
must flush this out to the backing storage prior to use with the GPU. As<br>
we use an asynchronous task tracked by an implicit fence for this, we<br>
also need to cancel the ASYNC flag on the object so that the object will<br>
wait for the clflush to complete before being executed. This also absolves<br>
userspace of the responsibility imposed by commit 77ae9957897d ("drm/i915:<br>
Enable userspace to opt-out of implicit fencing") that its needed to ensure<br>
that the object was out of the CPU cache prior to use on the GPU.<br></blockquote><div><br></div><div>Given that domain tracking is global, we can also run into interesting issues if process A does a CPU map, writes to it, and then hands it to process B which uses it from the GPU. Without being very aggressive about set_domain, we have no knowledge that this is a problem.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Fixes: 77ae9957897d ("drm/i915: Enable userspace to opt-out of implicit fencing")<br>
Bugzilla: <a href="https://bugs.freedesktop.org/show_bug.cgi?id=101571" rel="noreferrer" target="_blank">https://bugs.freedesktop.org/<wbr>show_bug.cgi?id=101571</a><br>
Signed-off-by: Chris Wilson <<a href="mailto:chris@chris-wilson.co.uk">chris@chris-wilson.co.uk</a>><br>
Cc: Joonas Lahtinen <<a href="mailto:joonas.lahtinen@linux.intel.com">joonas.lahtinen@linux.intel.<wbr>com</a>><br>
Cc: Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>><br></blockquote><div><br></div><div>Chris and I discussed this for a while on Friday and I think I understand the problem and I think this is the correct fix.<br><br></div><div>Reviewed-by: Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>><br><br></div><div>That said, I don't think I'm really qualified to review as I don't understand all of the details.<br><br></div><div>--Jason<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
---<br>
drivers/gpu/drm/i915/i915_gem_<wbr>clflush.c | 7 ++++---<br>
drivers/gpu/drm/i915/i915_gem_<wbr>clflush.h | 2 +-<br>
drivers/gpu/drm/i915/i915_gem_<wbr>execbuffer.c | 10 ++++++----<br>
3 files changed, 11 insertions(+), 8 deletions(-)<br>
<br>
diff --git a/drivers/gpu/drm/i915/i915_<wbr>gem_clflush.c b/drivers/gpu/drm/i915/i915_<wbr>gem_clflush.c<br>
index 152f16c11878..348b29a845c9 100644<br>
--- a/drivers/gpu/drm/i915/i915_<wbr>gem_clflush.c<br>
+++ b/drivers/gpu/drm/i915/i915_<wbr>gem_clflush.c<br>
@@ -114,7 +114,7 @@ i915_clflush_notify(struct i915_sw_fence *fence,<br>
return NOTIFY_DONE;<br>
}<br>
<br>
-void i915_gem_clflush_object(struct drm_i915_gem_object *obj,<br>
+bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,<br>
unsigned int flags)<br>
{<br>
struct clflush *clflush;<br>
@@ -128,7 +128,7 @@ void i915_gem_clflush_object(struct drm_i915_gem_object *obj,<br>
*/<br>
if (!i915_gem_object_has_struct_<wbr>page(obj)) {<br>
obj->cache_dirty = false;<br>
- return;<br>
+ return false;<br>
}<br>
<br>
/* If the GPU is snooping the contents of the CPU cache,<br>
@@ -140,7 +140,7 @@ void i915_gem_clflush_object(struct drm_i915_gem_object *obj,<br>
* tracking.<br>
*/<br>
if (!(flags & I915_CLFLUSH_FORCE) && obj->cache_coherent)<br>
- return;<br>
+ return false;<br>
<br>
trace_i915_gem_object_clflush(<wbr>obj);<br>
<br>
@@ -179,4 +179,5 @@ void i915_gem_clflush_object(struct drm_i915_gem_object *obj,<br>
}<br>
<br>
obj->cache_dirty = false;<br>
+ return true;<br>
}<br>
diff --git a/drivers/gpu/drm/i915/i915_<wbr>gem_clflush.h b/drivers/gpu/drm/i915/i915_<wbr>gem_clflush.h<br>
index 2455a7820937..f390247561b3 100644<br>
--- a/drivers/gpu/drm/i915/i915_<wbr>gem_clflush.h<br>
+++ b/drivers/gpu/drm/i915/i915_<wbr>gem_clflush.h<br>
@@ -28,7 +28,7 @@<br>
struct drm_i915_private;<br>
struct drm_i915_gem_object;<br>
<br>
-void i915_gem_clflush_object(struct drm_i915_gem_object *obj,<br>
+bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,<br>
unsigned int flags);<br>
#define I915_CLFLUSH_FORCE BIT(0)<br>
#define I915_CLFLUSH_SYNC BIT(1)<br>
diff --git a/drivers/gpu/drm/i915/i915_<wbr>gem_execbuffer.c b/drivers/gpu/drm/i915/i915_<wbr>gem_execbuffer.c<br>
index 2c76d6980855..aeacfe9a0878 100644<br>
--- a/drivers/gpu/drm/i915/i915_<wbr>gem_execbuffer.c<br>
+++ b/drivers/gpu/drm/i915/i915_<wbr>gem_execbuffer.c<br>
@@ -1826,7 +1826,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)<br>
int err;<br>
<br>
for (i = 0; i < count; i++) {<br>
- const struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];<br>
+ struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];<br>
struct i915_vma *vma = exec_to_vma(entry);<br>
struct drm_i915_gem_object *obj = vma->obj;<br>
<br>
@@ -1842,12 +1842,14 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)<br>
eb->request->capture_list = capture;<br>
}<br>
<br>
+ if (unlikely(obj->cache_dirty && !obj->cache_coherent)) {<br>
+ if (i915_gem_clflush_object(obj, 0))<br>
+ entry->flags &= ~EXEC_OBJECT_ASYNC;<br>
+ }<br>
+<br>
if (entry->flags & EXEC_OBJECT_ASYNC)<br>
goto skip_flushes;<br>
<br>
- if (unlikely(obj->cache_dirty && !obj->cache_coherent))<br>
- i915_gem_clflush_object(obj, 0);<br>
-<br>
err = i915_gem_request_await_object<br>
(eb->request, obj, entry->flags & EXEC_OBJECT_WRITE);<br>
if (err)<br>
<span class="HOEnZb"><font color="#888888">--<br>
2.13.2<br>
<br>
</font></span></blockquote></div><br></div></div>