<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, May 9, 2017 at 4:49 PM, Chad Versace <span dir="ltr"><<a href="mailto:chadversary@chromium.org" target="_blank">chadversary@chromium.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I learned a lot by reviewing this patch. Before reviewing it, some parts<br>
of the external_semaphore spec still confused me. But reviewing this<br>
forced me to really learn the behavior of external semaphores.<br>
<br>
Other than some small security nits, the only real issue I found was<br>
the error behavior of vkQueueSubmit. Maybe I'm wrong and your error<br>
behavior is fine, though. Let's discuss.<br>
<br>
A little complaint against the current spec... The patch was hard to<br>
review because the 1.0.49 has incorrect language regarding external<br>
semaphores. I had to go review the unreleased spec to get to the truth.<br>
Anyway, on with review...<br>
<span class=""><br>
On Fri 14 Apr 2017, Jason Ekstrand wrote:<br>
</span><div><div class="h5">> ---<br>
> src/intel/vulkan/anv_batch_<wbr>chain.c | 96 ++++++++++++++++++++++++++++++<wbr>++++++--<br>
> src/intel/vulkan/anv_device.c | 25 ++++++++++<br>
> src/intel/vulkan/anv_gem.c | 36 ++++++++++++++<br>
> src/intel/vulkan/anv_private.h | 24 +++++++---<br>
> src/intel/vulkan/anv_queue.c | 73 +++++++++++++++++++++++++++--<br>
> 5 files changed, 240 insertions(+), 14 deletions(-)<br>
<br>
> diff --git a/src/intel/vulkan/anv_batch_<wbr>chain.c b/src/intel/vulkan/anv_batch_<wbr>chain.c<br>
> index 0529f22..ec37c81 100644<br>
> --- a/src/intel/vulkan/anv_batch_<wbr>chain.c<br>
> +++ b/src/intel/vulkan/anv_batch_<wbr>chain.c<br>
> @@ -1387,6 +1387,23 @@ setup_execbuf_for_cmd_buffer(<wbr>struct anv_execbuf *execbuf,<br>
> return VK_SUCCESS;<br>
> }<br>
><br>
> +static void<br>
> +setup_empty_execbuf(struct anv_execbuf *execbuf, struct anv_device *device)<br>
> +{<br>
> + anv_execbuf_add_bo(execbuf, &device->trivial_batch_bo, NULL, 0,<br>
> + &device->alloc);<br>
> +<br>
> + execbuf->execbuf = (struct drm_i915_gem_execbuffer2) {<br>
> + .buffers_ptr = (uintptr_t) execbuf->objects,<br>
> + .buffer_count = execbuf->bo_count,<br>
> + .batch_start_offset = 0,<br>
> + .batch_len = 8, /* GEN8_MI_BATCH_BUFFER_END and NOOP */<br>
<br>
</div></div>Instead of hard-coding a magic number here, I think<br>
.batch_len = device->trivial_batch_bo.next - device->trivial_batch_bo.<wbr>start,<br>
is better. With that, the comment never becomes stale.<br>
<div><div class="h5"><br>
> + .flags = I915_EXEC_HANDLE_LUT | I915_EXEC_RENDER,<br>
> + .rsvd1 = device->context_id,<br>
> + .rsvd2 = 0,<br>
> + };<br>
> +}<br>
> +<br>
> VkResult<br>
> anv_cmd_buffer_execbuf(struct anv_device *device,<br>
> struct anv_cmd_buffer *cmd_buffer,<br>
> @@ -1398,11 +1415,13 @@ anv_cmd_buffer_execbuf(struct anv_device *device,<br>
> struct anv_execbuf execbuf;<br>
> anv_execbuf_init(&execbuf);<br>
><br>
> + int in_fence = -1;<br>
> VkResult result = VK_SUCCESS;<br>
> for (uint32_t i = 0; i < num_in_semaphores; i++) {<br>
> ANV_FROM_HANDLE(anv_semaphore, semaphore, in_semaphores[i]);<br>
> - assert(semaphore->temporary.<wbr>type == ANV_SEMAPHORE_TYPE_NONE);<br>
> - struct anv_semaphore_impl *impl = &semaphore->permanent;<br>
> + struct anv_semaphore_impl *impl =<br>
> + semaphore->temporary.type != ANV_SEMAPHORE_TYPE_NONE ?<br>
> + &semaphore->temporary : &semaphore->permanent;<br>
><br>
> switch (impl->type) {<br>
> case ANV_SEMAPHORE_TYPE_BO:<br>
> @@ -1411,13 +1430,42 @@ anv_cmd_buffer_execbuf(struct anv_device *device,<br>
> if (result != VK_SUCCESS)<br>
> return result;<br>
> break;<br>
> +<br>
> + case ANV_SEMAPHORE_TYPE_SYNC_FILE:<br>
> + if (in_fence == -1) {<br>
> + in_fence = impl->fd;<br>
> + } else {<br>
> + int merge = anv_gem_sync_file_merge(<wbr>device, in_fence, impl->fd);<br>
> + if (merge == -1)<br>
> + return vk_error(VK_ERROR_INVALID_<wbr>EXTERNAL_HANDLE_KHX);<br>
<br>
</div></div>Because we immediately close the semaphore's fd, the spec does not allow<br>
us to return VK_ERROR_INVALID_EXTERNAL_<wbr>HANDLE_KHX here. We must either<br>
defer closing the fd until immediately before VK_SUCCESS, or return<br>
VK_ERROR_DEVICE_LOST.<br>
<br>
>From the 1.0.49 spec:<br>
<br>
If vkQueueSubmit fails, it may return VK_ERROR_OUT_OF_HOST_MEMORY or<br>
VK_ERROR_OUT_OF_DEVICE_MEMORY. If it does, the implementation must<br>
ensure that the state and contents of any resources or<br>
synchronization primitives referenced by the submitted command<br>
buffers and any semaphores referenced by pSubmits is unaffected by<br>
the call or its failure. If vkQueueSubmit fails in such a way that<br>
the implementation can not make that guarantee, the implementation<br>
must return VK_ERROR_DEVICE_LOST<br></blockquote><div><br></div><div>Yes, I wrote that text. :-) anv_QueueSubmit already turns all errors into VK_DEVICE_LOST so we can return as descriptive an error as we want here. Really, we should probably use the sync_file_info ioctl on import to see if it's a sync file and fail then.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><div class="h5"><br>
> +<br>
> + close(impl->fd);<br>
> + close(in_fence);<br>
> + in_fence = merge;<br>
> + }<br>
> +<br>
> + impl->fd = -1;<br>
> + break;<br>
> +<br>
> default:<br>
> break;<br>
> }<br>
> +<br>
> + /* Waiting on a semaphore with temporary state implicitly resets it back<br>
> + * to the permanent state.<br>
> + */<br>
> + if (semaphore->temporary.type != ANV_SEMAPHORE_TYPE_NONE) {<br>
> + assert(semaphore->temporary.<wbr>type == ANV_SEMAPHORE_TYPE_SYNC_FILE);<br>
> + semaphore->temporary.type = ANV_SEMAPHORE_TYPE_NONE;<br>
> + }<br>
> }<br>
><br>
> + bool need_out_fence = false;<br>
> for (uint32_t i = 0; i < num_out_semaphores; i++) {<br>
> ANV_FROM_HANDLE(anv_semaphore, semaphore, out_semaphores[i]);<br>
> + /* Out fences can't have temporary state because that would imply<br>
> + * that we imported a sync file and are trying to signal it.<br>
> + */<br>
> assert(semaphore->temporary.<wbr>type == ANV_SEMAPHORE_TYPE_NONE);<br>
> struct anv_semaphore_impl *impl = &semaphore->permanent;<br>
><br>
> @@ -1428,17 +1476,55 @@ anv_cmd_buffer_execbuf(struct anv_device *device,<br>
> if (result != VK_SUCCESS)<br>
> return result;<br>
> break;<br>
> +<br>
> + case ANV_SEMAPHORE_TYPE_SYNC_FILE:<br>
> + need_out_fence = true;<br>
> + break;<br>
> +<br>
> default:<br>
> break;<br>
> }<br>
> }<br>
><br>
> - result = setup_execbuf_for_cmd_buffer(&<wbr>execbuf, cmd_buffer);<br>
> - if (result != VK_SUCCESS)<br>
> - return result;<br>
> + if (cmd_buffer) {<br>
> + result = setup_execbuf_for_cmd_buffer(&<wbr>execbuf, cmd_buffer);<br>
> + if (result != VK_SUCCESS)<br>
> + return result;<br>
> + } else {<br>
> + setup_empty_execbuf(&execbuf, device);<br>
> + }<br>
<br>
</div></div>In the if-tree above, the patch has the same problem with error behavior<br>
as above. We need to defer updating the semaphores until VK_SUCCESS or<br>
return VK_ERROR_DEVICE_LOST.<span class=""><br></span></blockquote><div><br></div><div>See above comment.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">
> +<br>
> + if (in_fence != -1) {<br>
> + execbuf.execbuf.flags |= I915_EXEC_FENCE_IN;<br>
> + execbuf.execbuf.rsvd2 |= (uint32_t)in_fence;<br>
> + }<br>
> +<br>
> + if (need_out_fence)<br>
> + execbuf.execbuf.flags |= I915_EXEC_FENCE_OUT;<br>
><br>
> result = anv_device_execbuf(device, &execbuf.execbuf, execbuf.bos);<br>
><br>
> + /* Execbuf does not consume the in_fence. It's our job to close it. */<br>
<br>
</span>Weird. I didn't know that. I expected execbuf to transfer sync_file<br>
ownership.<span class=""><br></span></blockquote><div><br></div><div>So did I at first.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">
> + close(in_fence);<br>
> +<br>
> + if (result == VK_SUCCESS && need_out_fence) {<br>
> + int out_fence = execbuf.execbuf.rsvd2 >> 32;<br>
> + for (uint32_t i = 0; i < num_out_semaphores; i++) {<br>
> + ANV_FROM_HANDLE(anv_semaphore, semaphore, out_semaphores[i]);<br>
> + /* Out fences can't have temporary state because that would imply<br>
> + * that we imported a sync file and are trying to signal it.<br>
> + */<br>
> + assert(semaphore->temporary.<wbr>type == ANV_SEMAPHORE_TYPE_NONE);<br>
> + struct anv_semaphore_impl *impl = &semaphore->permanent;<br>
> +<br>
> + if (impl->type == ANV_SEMAPHORE_TYPE_SYNC_FILE) {<br>
> + assert(impl->fd == -1);<br>
> + impl->fd = dup(out_fence);<br>
> + }<br>
> + }<br>
<br>
</span>This loop looks right to me.<br>
<span class=""><br>
> + close(out_fence);<br>
> + }<br>
> +<br>
> anv_execbuf_finish(&execbuf, &device->alloc);<br>
><br>
> return result;<br>
> diff --git a/src/intel/vulkan/anv_device.<wbr>c b/src/intel/vulkan/anv_device.<wbr>c<br>
> index f6e77ab..2885bb6 100644<br>
> --- a/src/intel/vulkan/anv_device.<wbr>c<br>
> +++ b/src/intel/vulkan/anv_device.<wbr>c<br>
<br>
</span>[snip]<br>
<span class=""><br>
> +static void<br>
> +anv_device_init_trivial_<wbr>batch(struct anv_device *device)<br>
> +{<br>
> + anv_bo_init_new(&device-><wbr>trivial_batch_bo, device, 4096);<br>
> + void *map = anv_gem_mmap(device, device->trivial_batch_bo.gem_<wbr>handle,<br>
> + 0, 4096, 0);<br>
> +<br>
> + struct anv_batch batch;<br>
> + batch.start = batch.next = map;<br>
> + batch.end = map + 4096;<br>
<br>
</span>On a release build, the above leaves much of the batch initialized to<br>
undefined values; that includes the batch->alloc callback. It should<br>
instead use a designated initializer list.<span class=""><br></span></blockquote><div><br></div><div>Probably a good idea.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">
> +<br>
> + anv_batch_emit(&batch, GEN7_MI_BATCH_BUFFER_END, bbe);<br>
> + anv_batch_emit(&batch, GEN7_MI_NOOP, noop);<br>
> +<br>
> + if (!device->info.has_llc)<br>
> + anv_clflush_range(map, batch.next - map);<br>
> +<br>
> + anv_gem_munmap(map, device->trivial_batch_bo.size)<wbr>;<br>
> +}<br>
<br>
</span>Other than that, anv_device_init_trivial_batch looks good.<br>
<br>
[snip]<br>
<span class=""><br>
> diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c<br>
> index 1392bf4..ffdc5a1 100644<br>
> --- a/src/intel/vulkan/anv_gem.c<br>
> +++ b/src/intel/vulkan/anv_gem.c<br>
<br>
</span>[snip]<br>
<span class=""><br>
> @@ -400,3 +401,38 @@ anv_gem_fd_to_handle(struct anv_device *device, int fd)<br>
><br>
> return args.handle;<br>
> }<br>
> +<br>
> +#ifndef SYNC_IOC_MAGIC<br>
> +/* duplicated from linux/sync_file.h to avoid build-time depnedency<br>
> + * on new (v4.7) kernel headers. Once distro's are mostly using<br>
> + * something newer than v4.7 drop this and #include <linux/sync_file.h><br>
> + * instead.<br>
> + */<br>
> +struct sync_merge_data {<br>
> + char name[32];<br>
> + __s32 fd2;<br>
> + __s32 fence;<br>
> + __u32 flags;<br>
> + __u32 pad;<br>
> +};<br>
<br>
</span>This will break Android because it defines struct sync_merge_data<br>
differently. I don't want the patch blocked on that, though. I'll handle<br>
fixing Android later, after this patch lands.<br></blockquote><div><br></div><div>Now that we've pulled in the i915 headers, I can probably just pull in the sync_file header and drop this hunk entirely. Still doesn't solve the android problem though...<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Out of caution, though, please make<br>
vkGetPhysicalDeviceExternalSem<wbr>aphorePropertiesKHX report that FENCE_FD<br>
is unsupported on Android with an `#ifndef ANDROID`.<span class=""><br></span></blockquote><div><br></div><div>I can do that if you'd like.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">
> +<br>
> +#define SYNC_IOC_MAGIC '>'<br>
> +#define SYNC_IOC_MERGE _IOWR(SYNC_IOC_MAGIC, 3, struct sync_merge_data)<br>
> +#endif<br>
> +<br>
> +int<br>
> +anv_gem_sync_file_merge(<wbr>struct anv_device *device, int fd1, int fd2)<br>
> +{<br>
> + const char name[] = "anv merge fence";<br>
> + struct sync_merge_data args = {<br>
> + .fd2 = fd2,<br>
> + .fence = -1,<br>
> + };<br>
<br>
</span>For security future-proofing, I'd like to see here:<br>
<br>
STATIC_ASSERT(sizeof(<a href="http://args.name" rel="noreferrer" target="_blank">args.name</a><wbr>) >= sizeof(name);<span class=""><br></span></blockquote><div><br></div><div>Not a bad idea.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">
> + memcpy(<a href="http://args.name" rel="noreferrer" target="_blank">args.name</a>, name, sizeof(name));<br>
> +<br>
> + int ret = anv_ioctl(fd1, SYNC_IOC_MERGE, &args);<br>
> + if (ret == -1)<br>
> + return -1;<br>
> +<br>
> + return args.fence;<br>
> +}<br>
<br>
</span>[snip]<br>
</blockquote></div><br></div></div>