[Mesa-dev] [PATCH] anv/query: Busy-wait for available query entries
Iago Toral
itoral at igalia.com
Thu Apr 6 06:24:09 UTC 2017
On Wed, 2017-04-05 at 09:31 -0700, Jason Ekstrand wrote:
> On Wed, Apr 5, 2017 at 12:24 AM, Iago Toral <itoral at igalia.com>
> wrote:
> > On Tue, 2017-04-04 at 19:21 -0700, Jason Ekstrand wrote:
> > > Before, we were just looking at whether or not the user wanted us
> > to
> > > wait and waiting on the BO. This instead makes us busy-loop on
> > each
> > > query until it's available. This reduces some of the pipeline
> > > bubbles
> > > we were getting and improves performance of The Talos Principle
> > on
> > > medium settings (where the GPU isn't overloaded) by around 20% on
> > my
> > > SkyLake gt4.
> >
> > Do you think busy looping is a good strategy in general? In
> > scenarios
> > where the GPU is busier, isn't this expected to have a negative
> > effect
> > on performance?
> The reason why this gets better performance is not because it's a
> busy wait rather than asking the kernel to wait. It's because it
> lets us wait on a single query rather than on the whole buffer.
> Talos packs hundreds of queries into a single pool and the uses of
> that pool may be split across multiple command buffers. Waiting for
> the whole BO causes us to basically wait for the GPU to be idle in
> order to get the one query result we want.
Ok, that makes a lot more sense :)
> >
> > > ---
> >
> > > src/intel/vulkan/genX_query.c | 72
> >
> > > ++++++++++++++++++++++++++++++> > +++++++++----
> >
> > > 1 file changed, 66 insertions(+), 6 deletions(-)
> >
> > >
> >
> > > diff --git a/src/intel/vulkan/genX_query.> > c
> >
> > > b/src/intel/vulkan/genX_query.> > c
> >
> > > index 7ea9404..ebf99d2 100644
> >
> > > --- a/src/intel/vulkan/genX_query.> > c
> >
> > > +++ b/src/intel/vulkan/genX_query.> > c
> >
> > > @@ -131,6 +131,64 @@ cpu_write_query_result(void *dst_slot,
> >
> > > VkQueryResultFlags flags,
> >
> > > }
> >
> > > }
> >
> > >
> >
> > > +#define NSEC_PER_SEC 1000000000
> >
> > > +
> >
> > > +static bool
> >
> > > +query_is_available(struct anv_device *device, uint64_t *slot)
> >
> > > +{
> >
> > > + if (!device->info.has_llc)
> >
> > > + __builtin_ia32_clflush(> > slot);
> >
> > > +
> >
> > > + return slot[0];
> >
> > > +}
> >
> > > +
> >
> > > +static VkResult
> >
> > > +wait_for_available(struct anv_device *device,
> >
> > > + struct anv_query_pool *pool, uint64_t *slot)
> >
> > > +{
> >
> > > + while (true) {
> >
> > > + struct timespec start;
> >
> > > + clock_gettime(CLOCK_> > MONOTONIC, &start);
> >
> > > +
> >
> > > + while (true) {
> >
> > > + if (!device->info.has_llc)
> >
> > > + __builtin_ia32_> > clflush(slot);
> >
> > > +
> >
> > > + if (query_is_available(device, slot))
> >
> > > + return VK_SUCCESS;
> >
> > > +
> >
> > > + struct timespec current;
> >
> > > + clock_gettime(CLOCK_> > MONOTONIC, ¤t);
> >
> > > +
> >
> > > + if (current.tv_nsec < start.tv_nsec) {
> >
> > > + current.tv_nsec += NSEC_PER_SEC;
> >
> > > + current.tv_sec -= 1;
> >
> > > + }
> >
> > > +
> >
> > > + /* If we've been looping for more than 1 ms, break out of
> >
> > > the busy
> >
> > > + * loop and ask the kernel if the buffer is actually busy.
> >
> > > + */
> >
> > > + if (current.tv_sec > start.tv_sec ||
> >
> >
> > This should be && instead of ||, right?
> > No. Thanks to the subtraction above, if current.tv_sec > start.tv_sec, then it's been at least one whole second.
Right, apparently I can't read properly any more...
> >
> > > + current.tv_nsec - start.tv_nsec > 1000000)
> >
> > > + break;
> >
> > > + }
> >
> > > +
> >
> > > + VkResult result = anv_device_wait(device, &pool->bo, 0);
> >
> > > + switch (result) {
> >
> > > + case VK_SUCCESS:
> >
> > > + /* The BO is no longer busy. If we haven't seen
> >
> > > availability yet,
> >
> > > + * then we never will.
> >
> > > + */
> >
> > > + return query_is_available(device, slot) ? VK_SUCCESS :
> >
> > > VK_NOT_READY;
> >
> >
> > By the comment above it sounds like returning VK_NOT_READY isn't what
> >
> > we want here. If we don't expect the query to ever be available at this
> >
> > point, returning VK_NOT_READY will only prompt the application to call
> >
> > us again... shouldn't we return an error in this case?
> > This case shouldn't happen. If the client is using the API correctly, all of the query results it's trying to get should already have Begin/End submitted to the GPU. However, if the client does something bogus, I don't want to land them in an infinite busy-loop. Returning VK_NOT_READY seemed reasonable, but we could make it some error or other.
If this can only happen in the case of incorrect API usage I guess it
doesn't really matter much.
Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
> >
> > > + case VK_TIMEOUT:
> >
> > > + /* The BO is still busy, keep waiting. */
> >
> > > + continue;
> >
> > > + default:
> >
> > > + return result;
> >
> > > + }
> >
> > > + }
> >
> > > +}
> >
> > > +
> >
> > > VkResult genX(GetQueryPoolResults)(
> >
> > > VkDevice > > _device,
> >
> > > VkQueryPool > > queryPool,
> >
> > > @@ -154,12 +212,6 @@ VkResult genX(GetQueryPoolResults)(
> >
> > > if (pData == NULL)
> >
> > > return VK_SUCCESS;
> >
> > >
> >
> > > - if (flags & VK_QUERY_RESULT_WAIT_BIT) {
> >
> > > - VkResult result = anv_device_wait(device, &pool->bo,
> >
> > > INT64_MAX);
> >
> > > - if (result != VK_SUCCESS)
> >
> > > - return result;
> >
> > > - }
> >
> > > -
> >
> > > void *data_end = pData + dataSize;
> >
> > >
> >
> > > if (!device->info.has_llc) {
> >
> > > @@ -176,6 +228,14 @@ VkResult genX(GetQueryPoolResults)(
> >
> > > /* Availability is always at the start of the slot */
> >
> > > bool available = slot[0];
> >
> > >
> >
> > > + if (!available && (flags & VK_QUERY_RESULT_WAIT_BIT)) {
> >
> > > + status = wait_for_available(device, pool, slot);
> >
> > > + if (status != VK_SUCCESS)
> >
> > > + return status;
> >
> > > +
> >
> > > + available = true;
> >
> > > + }
> >
> > > +
> >
> > > /* From the Vulkan 1.0.42 spec:
> >
> > > *
> >
> > > * "If VK_QUERY_RESULT_WAIT_BIT and
> >
> > > VK_QUERY_RESULT_PARTIAL_BIT are
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170406/e725774f/attachment.html>
More information about the mesa-dev
mailing list