<div dir="ltr">Tested-by: Alex Smith <<a href="mailto:asmith@feralinteractive.com">asmith@feralinteractive.com</a>><div><br></div><div>Thanks! s/Queryool/QueryPool/ in the subject, btw.</div></div><br><div class="gmail_quote"><div dir="ltr">On Tue, 4 Dec 2018 at 15:52, Samuel Pitoiset <<a href="mailto:samuel.pitoiset@gmail.com">samuel.pitoiset@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Because WAIT_REG_MEM can only wait for a 32-bit value, it's not<br>
safe to use it for timestamp queries. If we only wait on the low<br>
32 bits of a timestamp query we could be unlucky and the GPU<br>
might hang.<br>
<br>
One possible fix is to emit a full end of pipe event and wait<br>
on a 32-bit value which is actually an availability bit. This<br>
bit is allocated at creation time and always cleared before<br>
emitting the EOP event.<br>
<br>
Bugzilla: <a href="https://bugs.freedesktop.org/show_bug.cgi?id=108925" rel="noreferrer" target="_blank">https://bugs.freedesktop.org/show_bug.cgi?id=108925</a><br>
Fixes: 5d6a560a29 ("radv: do not use the availability bit for timestamp queries")<br>
Signed-off-by: Samuel Pitoiset <<a href="mailto:samuel.pitoiset@gmail.com" target="_blank">samuel.pitoiset@gmail.com</a>><br>
---<br>
src/amd/vulkan/radv_query.c | 49 +++++++++++++++++++++++++++++++------<br>
1 file changed, 41 insertions(+), 8 deletions(-)<br>
<br>
diff --git a/src/amd/vulkan/radv_query.c b/src/amd/vulkan/radv_query.c<br>
index 550abe307a1..9bb6b660add 100644<br>
--- a/src/amd/vulkan/radv_query.c<br>
+++ b/src/amd/vulkan/radv_query.c<br>
@@ -1056,8 +1056,15 @@ VkResult radv_CreateQueryPool(<br>
pool->pipeline_stats_mask = pCreateInfo->pipelineStatistics;<br>
pool->availability_offset = pool->stride * pCreateInfo->queryCount;<br>
pool->size = pool->availability_offset;<br>
- if (pCreateInfo->queryType == VK_QUERY_TYPE_PIPELINE_STATISTICS)<br>
+ if (pCreateInfo->queryType == VK_QUERY_TYPE_PIPELINE_STATISTICS) {<br>
pool->size += 4 * pCreateInfo->queryCount;<br>
+ } else if (pCreateInfo->queryType == VK_QUERY_TYPE_TIMESTAMP) {<br>
+ /* Allocate one DWORD for the availability bit which is needed<br>
+ * for vkCmdCopyQueryPoolResults() because we can't perform a<br>
+ * WAIT_REG_MEM on a 64-bit value.<br>
+ */<br>
+ pool->size += 4;<br>
+ }<br>
<br>
pool->bo = device->ws->buffer_create(device->ws, pool->size,<br>
64, RADEON_DOMAIN_GTT, RADEON_FLAG_NO_INTERPROCESS_SHARING);<br>
@@ -1328,19 +1335,45 @@ void radv_CmdCopyQueryPoolResults(<br>
pool->availability_offset + 4 * firstQuery);<br>
break;<br>
case VK_QUERY_TYPE_TIMESTAMP:<br>
+ if (flags & VK_QUERY_RESULT_WAIT_BIT) {<br>
+ /* Emit a full end of pipe event because we can't<br>
+ * perform a WAIT_REG_MEM on a 64-bit value. If we only<br>
+ * do a WAIT_REG_MEM on the low 32 bits of a timestamp<br>
+ * query we could be unlucky and the GPU might hang.<br>
+ */<br>
+ enum chip_class chip = cmd_buffer->device->physical_device->rad_info.chip_class;<br>
+ bool is_mec = radv_cmd_buffer_uses_mec(cmd_buffer);<br>
+ uint64_t avail_va = va + pool->availability_offset;<br>
+<br>
+ /* Clear the availability bit before waiting on the end<br>
+ * of pipe event.<br>
+ */<br>
+ radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 3, 0));<br>
+ radeon_emit(cs, S_370_DST_SEL(V_370_MEM_ASYNC) |<br>
+ S_370_WR_CONFIRM(1) |<br>
+ S_370_ENGINE_SEL(V_370_ME));<br>
+ radeon_emit(cs, avail_va);<br>
+ radeon_emit(cs, avail_va >> 32);<br>
+ radeon_emit(cs, 0xdeadbeef);<br>
+<br>
+ /* Wait for all prior GPU work. */<br>
+ si_cs_emit_write_event_eop(cs, chip, is_mec,<br>
+ V_028A90_BOTTOM_OF_PIPE_TS, 0,<br>
+ EOP_DATA_SEL_VALUE_32BIT,<br>
+ avail_va, 0, 1,<br>
+ cmd_buffer->gfx9_eop_bug_va);<br>
+<br>
+ /* Wait on the timestamp value. */<br>
+ radv_cp_wait_mem(cs, WAIT_REG_MEM_EQUAL, avail_va,<br>
+ 1, 0xffffffff);<br>
+ }<br>
+<br>
for(unsigned i = 0; i < queryCount; ++i, dest_va += stride) {<br>
unsigned query = firstQuery + i;<br>
uint64_t local_src_va = va + query * pool->stride;<br>
<br>
MAYBE_UNUSED unsigned cdw_max = radeon_check_space(cmd_buffer->device->ws, cs, 19);<br>
<br>
-<br>
- if (flags & VK_QUERY_RESULT_WAIT_BIT) {<br>
- radv_cp_wait_mem(cs, WAIT_REG_MEM_NOT_EQUAL,<br>
- local_src_va,<br>
- TIMESTAMP_NOT_READY >> 32,<br>
- 0xffffffff);<br>
- }<br>
if (flags & VK_QUERY_RESULT_WITH_AVAILABILITY_BIT) {<br>
uint64_t avail_dest_va = dest_va + elem_size;<br>
<br>
-- <br>
2.19.2<br>
<br>
_______________________________________________<br>
mesa-dev mailing list<br>
<a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freedesktop.org</a><br>
<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br>
</blockquote></div>