[Mesa-dev] [PATCH v3 2/2] anv/query: Busy-wait for available query entries

Chris Wilson chris at chris-wilson.co.uk
Wed Apr 5 17:45:28 UTC 2017


On Wed, Apr 05, 2017 at 10:28:53AM -0700, Jason Ekstrand wrote:
> Before, we were just looking at whether or not the user wanted us to
> wait and waiting on the BO.  Some clients, such as the Serious engine,
> use a single query pool for hundreds of individual query results where
> the writes for those queries may be split across several command
> buffers.  In this scenario, the individual query we're looking for may
> become available long before the BO is idle so waiting on the query pool
> BO to be finished is wasteful. This commit makes us instead busy-loop on
> each query until it's available.
> 
> This significantly reduces pipeline bubbles and improves performance of
> The Talos Principle on medium settings (where the GPU isn't overloaded
> with drawing) by around 20% on my SkyLake gt4.
> ---
>  src/intel/vulkan/genX_query.c | 52 ++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 46 insertions(+), 6 deletions(-)
> 
> diff --git a/src/intel/vulkan/genX_query.c b/src/intel/vulkan/genX_query.c
> index 7ea9404..0d303a6 100644
> --- a/src/intel/vulkan/genX_query.c
> +++ b/src/intel/vulkan/genX_query.c
> @@ -131,6 +131,44 @@ cpu_write_query_result(void *dst_slot, VkQueryResultFlags flags,
>     }
>  }
>  
> +static bool
> +query_is_available(struct anv_device *device, volatile uint64_t *slot)
> +{
> +   if (!device->info.has_llc)
> +      __builtin_ia32_clflush(slot);
> +
> +   return slot[0];
> +}
> +
> +static VkResult
> +wait_for_available(struct anv_device *device,
> +                   struct anv_query_pool *pool, uint64_t *slot)
> +{
> +   while (true) {
> +      if (query_is_available(device, slot))
> +         return VK_SUCCESS;
> +
> +      VkResult result = anv_device_bo_busy(device, &pool->bo);

Ah, but you can use the simpler check here because you follow up with a
query_is_available() so you know whether or not the hang clobbered the
result.

If the query is not available but the bo is idle, you might then want to
check for a reset in case it was due to a lost device. GEM_BUSY is
lockless, but GEM_RESET_STATS currently takes the big struct_mutex and
so has non-deterministic and often quite large latencies.
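
A rough sketch of that ordering, in case it helps. This is not the anv
code: struct mock_device, mock_bo_busy() and mock_device_was_reset() are
hypothetical stand-ins for the device, the GEM_BUSY check and the
GEM_RESET_STATS query, and the clflush for non-LLC parts is left out. The
point is only that the loop spins on the cheap checks and touches the
expensive reset query at most once, after the bo has gone idle without
the slot being written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum wait_status { WAIT_AVAILABLE, WAIT_NOT_READY, WAIT_DEVICE_LOST };

/* Hypothetical mock of the device; none of these fields exist in anv. */
struct mock_device {
   volatile uint64_t *slot; /* slot[0] != 0 means the query is available */
   int busy_polls_left;     /* the BO reports busy for this many polls */
   bool gpu_writes_slot;    /* false models a hang that never wrote the slot */
   bool was_reset;          /* what the reset query would report */
   int reset_queries;       /* counts calls to the expensive path */
};

/* Stand-in for the cheap, lockless GEM_BUSY check. */
static bool
mock_bo_busy(struct mock_device *dev)
{
   if (dev->busy_polls_left == 0)
      return false;

   /* On the final busy poll the mock GPU retires and may write the slot. */
   if (--dev->busy_polls_left == 0 && dev->gpu_writes_slot)
      dev->slot[0] = 1;
   return true;
}

/* Stand-in for the expensive GEM_RESET_STATS query. */
static bool
mock_device_was_reset(struct mock_device *dev)
{
   dev->reset_queries++;
   return dev->was_reset;
}

static enum wait_status
wait_for_available(struct mock_device *dev)
{
   assert(dev != NULL && dev->slot != NULL);

   while (true) {
      if (dev->slot[0])
         return WAIT_AVAILABLE;

      if (mock_bo_busy(dev))
         continue; /* still executing, keep spinning on the cheap checks */

      /* The BO went idle; the write may have landed after the first check. */
      if (dev->slot[0])
         return WAIT_AVAILABLE;

      /* Idle but never written: only now pay for the reset query. */
      return mock_device_was_reset(dev) ? WAIT_DEVICE_LOST : WAIT_NOT_READY;
   }
}
```

With a mock whose GPU writes the slot, the loop returns WAIT_AVAILABLE
without ever calling the reset query; with a mock that goes idle leaving
the slot empty, the reset query runs exactly once and decides between
WAIT_DEVICE_LOST and WAIT_NOT_READY.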
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

