[Mesa-dev] [PATCH 5/5] i965/query: Cache whether the batch references the query BO.

Eero Tamminen eero.t.tamminen at intel.com
Mon Dec 15 07:38:23 PST 2014


Hi,

On 12/13/2014 09:15 AM, Kenneth Graunke wrote:
> Chris Wilson noted that repeated calls to CheckQuery() would call
> drm_intel_bo_references(brw->batch.bo, query->bo) on each invocation,
> which is expensive.  Once we've flushed, we know that future batches
> won't reference query->bo, so there's no point in asking more than once.
>
> This patch adds a brw_query_object::flushed flag, which is a
> conservative estimate of whether the batch has been flushed.
>
> On the first call to CheckQuery() or WaitQuery(), we check if the
> batch references query->bo.  If not, it must have been flushed for
> some reason (such as being full).  We record that it was flushed.
> If it does reference query->bo, we explicitly flush, and record that
> we did so.
>
> Any subsequent checks will simply see that query->flushed is set,
> and skip the drm_intel_bo_references() call.
>
> Inspired by a patch from Chris Wilson.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86969
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Eero Tamminen <eero.t.tamminen at intel.com>
> ---
>   src/mesa/drivers/dri/i965/brw_context.h   |  3 +++
>   src/mesa/drivers/dri/i965/gen6_queryobj.c | 27 +++++++++++++++++++++++----
>   2 files changed, 26 insertions(+), 4 deletions(-)
>
> ickle's patch in bug #86969 eliminated all calls to drm_intel_bo_references
> in the query object code.  One consequence is that the first call to
> CheckQuery or WaitQuery would always flush the batch, even if it wasn't
> necessary (say, the batch had been flushed).
>
> Mine is a bit different - it still uses bo_references on the first call to
> CheckQuery/WaitQuery - but then records that the batch has been flushed, so
> we never have to call it again.
>
> Either approach should greatly reduce the overhead from repeatedly calling
> drm_intel_bo_references.
>
> Eero, would you be able to test this and see if it solves the issue you
> were seeing?  If so, could you provide some simple statistics showing the
> benefit?  Thanks!

Similarly to Chris' patch:

* This patch-series doesn't change the performance on my HSW GT3e test
   machine (game is GPU mem BW bound), but it reduces total CPU load
   clearly by halving user-space CPU usage.

* iowait (assumably for GPU) increase from few percents to ~1/4th,
   assumably because CPU is now used less i.e. GPU side optimizations
   should now be better visible.

* FPS fluctuation increased from 0.3 FPS to 0.8 FPS,
   while still keeping the same long term average FPS.

I looked more into the fluctuations.   Exactly every other frame was
long and every other a short one in both cases [1].  With smaller CPU
utilization, short frames were shorter and long ones longer, but
additionally, frame times had spread out more.

With the larger CPU utilization, CPU on which Witcher2 was currently
running, was always running at full frequency.  With smaller CPU
utilization, core didn't run at full speed the whole frame.  I think
latencies from that explain the larger frame time fluctuations and
therefore they're not an issue.


In summary:
- both your and Chris' patch reduce CPU consumption without affecting
   performance otherwise.  I.e. they do what is expected, reduce power
   usage and could provide better performance in more temperature
   limited machines (e.g. laptops).


	- Eero

[1] I've seen this pattern in so many different test-cases, that
     I don't think it to be related to what the game engine is
     doing (e.g. extra rendering on every other frame).

> This is available as the 'query-references' branch in my tree (~kwg/mesa).
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> index 1b8f0bb..a63c483 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -831,6 +831,9 @@ struct brw_query_object {
>
>      /** Last index in bo with query data for this object. */
>      int last_index;
> +
> +   /** True if we know the batch has been flushed since we ended the query. */
> +   bool flushed;
>   };
>
>   struct intel_sync_object {
> diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c
> index 0d51390..de71bb5 100644
> --- a/src/mesa/drivers/dri/i965/gen6_queryobj.c
> +++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c
> @@ -281,6 +281,27 @@ gen6_end_query(struct gl_context *ctx, struct gl_query_object *q)
>      default:
>         unreachable("Unrecognized query target in brw_end_query()");
>      }
> +
> +   /* The current batch contains the commands to handle EndQuery(),
> +    * but they won't actually execute until it is flushed.
> +    */
> +   query->flushed = false;
> +}
> +
> +/**
> + * Flush the batch if it still references the query object BO.
> + */
> +static void
> +flush_batch_if_needed(struct brw_context *brw, struct brw_query_object *query)
> +{
> +   /* If the batch doesn't reference the BO, it must have been flushed
> +    * (for example, due to being full).  Record that it's been flushed.
> +    */
> +   query->flushed = query->flushed ||
> +      !drm_intel_bo_references(brw->batch.bo, query->bo);
> +
> +   if (!query->flushed)
> +      intel_batchbuffer_flush(brw);
>   }
>
>   /**
> @@ -298,8 +319,7 @@ static void gen6_wait_query(struct gl_context *ctx, struct gl_query_object *q)
>       * still contributing to it, flush it now to finish that work so the
>       * result will become available (eventually).
>       */
> -   if (drm_intel_bo_references(brw->batch.bo, query->bo))
> -      intel_batchbuffer_flush(brw);
> +   flush_batch_if_needed(brw, query);
>
>      gen6_queryobj_get_results(ctx, query);
>   }
> @@ -328,8 +348,7 @@ static void gen6_check_query(struct gl_context *ctx, struct gl_query_object *q)
>       *      not ready yet on the first time it is queried.  This ensures that
>       *      the async query will return true in finite time.
>       */
> -   if (drm_intel_bo_references(brw->batch.bo, query->bo))
> -      intel_batchbuffer_flush(brw);
> +   flush_batch_if_needed(brw, query);
>
>      if (!drm_intel_bo_busy(query->bo)) {
>         gen6_queryobj_get_results(ctx, query);
>



More information about the mesa-dev mailing list