[Mesa-dev] [PATCH] anv/query: Use snooping on !LLC platforms
Chris Wilson
chris at chris-wilson.co.uk
Fri Apr 7 08:31:35 UTC 2017
On Thu, Apr 06, 2017 at 08:49:54PM -0700, Jason Ekstrand wrote:
> Commit b2c97bc789198427043cd902bc76e194e7e81c7d which made us start
> using a busy-wait for individual query results also messed up cache
> flushing on !LLC platforms. For one thing, I forgot the mfence after
> the clflush so memory access wasn't properly getting fenced.
Ah, I thought the saving grace was that a cacheline access from the same
cpu would be forced to wait. It wasn't completely broken, as the second
clflush to the same cacheline is a full barrier, and there was always a
second blocking clflush prior to the final check. :)
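A sketch of that old polling pattern, for illustration (hypothetical
helper, not the actual anv code; the point is that the clflush issued
on each iteration serializes against the clflush from the previous
one, so the final read sees an invalidated line even without mfence):

   static uint64_t poll_query_slot(volatile uint64_t *slot)
   {
      uint64_t value;
      do {
         /* Invalidate the (possibly stale) cacheline before reading.
          * clflush is only ordered against clflushes to the same
          * line, so iteration N's flush acts as the barrier for
          * iteration N-1's.
          */
         __builtin_ia32_clflush((const void *)slot);
         value = *slot;
      } while (value == 0);
      return value;
   }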
Besides, mfence is not as strong a barrier for clflush vs the gpu as the
cpu SDM implies (this affects byt and bsw; later Atoms are unknown, but
since they share other memory characteristics they are probably also
affected). In the kernel I resorted to doing a
for_each_cacheline { clflush } clflush; mfence to ensure the invalidate
was successful.
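Roughly this, as a userspace sketch (for_each_cacheline written out as
a plain loop; assumes 64-byte cachelines, which holds on these parts):

   #include <stddef.h>
   #include <stdint.h>
   #include <emmintrin.h> /* _mm_mfence() */

   #define CACHELINE 64

   static void clflush_invalidate_range(void *addr, size_t size)
   {
      char *p = (char *)((uintptr_t)addr & ~(uintptr_t)(CACHELINE - 1));
      char *end = (char *)addr + size;

      /* Flush every cacheline covering the range... */
      for (; p < end; p += CACHELINE)
         __builtin_ia32_clflush(p);

      /* ...then one extra trailing clflush (ordered against the
       * earlier flush of its line) plus mfence, to make sure the
       * invalidate has actually completed before the cpu reads.
       */
      __builtin_ia32_clflush(end - 1);
      _mm_mfence();
   }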
> More
> importantly, however, was that we were clflushing the whole query range
> and then waiting for individual queries and then trying to read the
> results without clflushing again. Getting the clflushing both correct
> and efficient is very subtle and painful. Instead, let's side-step the
> problem by just snooping.
> ---
> src/intel/vulkan/genX_query.c | 24 +++++++++++-------------
> 1 file changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/src/intel/vulkan/genX_query.c b/src/intel/vulkan/genX_query.c
> index 235da8b..2c70b4f 100644
> --- a/src/intel/vulkan/genX_query.c
> +++ b/src/intel/vulkan/genX_query.c
> @@ -90,6 +90,14 @@ VkResult genX(CreateQueryPool)(
> if (result != VK_SUCCESS)
> goto fail;
>
> + /* For query pools, we set the caching mode to I915_CACHING_CACHED. On LLC
> + * platforms, this does nothing. On non-LLC platforms, this means snooping
> + * which comes at a slight cost. However, the buffers aren't big, won't be
> + * written frequently, and trying to handle the flushing manually without
> + * doing too much flushing is extremely painful.
> + */
> + anv_gem_set_caching(device, pool->bo.gem_handle, I915_CACHING_CACHED);
> +
> pool->bo.map = anv_gem_mmap(device, pool->bo.gem_handle, 0, size, 0);
>
> *pQueryPool = anv_query_pool_to_handle(pool);
> @@ -132,11 +140,8 @@ cpu_write_query_result(void *dst_slot, VkQueryResultFlags flags,
> }
>
> static bool
> -query_is_available(struct anv_device *device, uint64_t *slot)
> +query_is_available(uint64_t *slot)
> {
> - if (!device->info.has_llc)
> - __builtin_ia32_clflush(slot);
> -
> return *(volatile uint64_t *)slot;
> }
>
> @@ -145,7 +150,7 @@ wait_for_available(struct anv_device *device,
> struct anv_query_pool *pool, uint64_t *slot)
> {
> while (true) {
> - if (query_is_available(device, slot))
> + if (query_is_available(slot))
> return VK_SUCCESS;
>
> int ret = anv_gem_busy(device, pool->bo.gem_handle);
> @@ -159,7 +164,7 @@ wait_for_available(struct anv_device *device,
> } else {
> assert(ret == 0);
> /* The BO is no longer busy. */
> - if (query_is_available(device, slot)) {
> + if (query_is_available(slot)) {
> return VK_SUCCESS;
> } else {
> VkResult status = anv_device_query_status(device);
> @@ -204,13 +209,6 @@ VkResult genX(GetQueryPoolResults)(
>
> void *data_end = pData + dataSize;
>
> - if (!device->info.has_llc) {
> - uint64_t offset = firstQuery * pool->stride;
> - uint64_t size = queryCount * pool->stride;
> - anv_invalidate_range(pool->bo.map + offset,
> - MIN2(size, pool->bo.size - offset));
> - }
> -
> VkResult status = VK_SUCCESS;
> for (uint32_t i = 0; i < queryCount; i++) {
> uint64_t *slot = pool->bo.map + (firstQuery + i) * pool->stride;
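For anyone following along, the set-caching call above is a thin
wrapper around the i915 set-caching ioctl; roughly this (a sketch, not
the verbatim anv helper; error handling and EINTR retry elided):

   #include <stdint.h>
   #include <sys/ioctl.h>
   #include <drm/i915_drm.h> /* drm_i915_gem_caching, I915_CACHING_* */

   static int gem_set_caching(int fd, uint32_t handle, uint32_t caching)
   {
      /* Ask the kernel to treat the BO's pages as cacheable; on !LLC
       * platforms that means snooped, so cpu reads stay coherent with
       * gpu writes without any manual clflushing.
       */
      struct drm_i915_gem_caching arg = {
         .handle = handle,
         .caching = caching, /* e.g. I915_CACHING_CACHED */
      };

      return ioctl(fd, DRM_IOCTL_I915_GEM_SET_CACHING, &arg);
   }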
Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
-Chris
--
Chris Wilson, Intel Open Source Technology Centre