[Mesa-dev] [PATCH 23/27] i965: Start and stop OA counters as necessary.

Daniel Vetter daniel at ffwll.ch
Thu Nov 14 05:33:18 PST 2013


On Wed, Nov 13, 2013 at 09:05:48PM -0800, Kenneth Graunke wrote:
> We need to start OA at the beginning of each batch where monitors are
> active.  OACONTROL isn't part of the hardware context, so to avoid
> leaving counters enabled for other applications, we turn them off at the
> end of the batch too.
> 
> We also need to start them at BeginPerfMonitor time (unless they've
> already been started).  We stop them when the monitor last ends as well.
> 
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> Cc: Eric Anholt <eric at anholt.net>
> Cc: Carl Worth <cworth at cworth.org>
> Cc: Juha-Pekka Heikkilä <juha-pekka.heikkila at intel.com>
> ---
>  src/mesa/drivers/dri/i965/brw_context.h            |  2 +
>  .../drivers/dri/i965/brw_performance_monitor.c     | 50 ++++++++++++++++++++++
>  src/mesa/drivers/dri/i965/intel_batchbuffer.c      |  7 +++
>  src/mesa/drivers/dri/i965/intel_batchbuffer.h      |  3 +-
>  4 files changed, 61 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> index 77649cd..f395297 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -1627,6 +1627,8 @@ bool brw_render_target_supported(struct brw_context *brw,
>  /* brw_performance_monitor.c */
>  void brw_init_performance_monitors(struct brw_context *brw);
>  void brw_dump_perf_monitors(struct brw_context *brw);
> +void brw_perf_monitor_new_batch(struct brw_context *brw);
> +void brw_perf_monitor_finish_batch(struct brw_context *brw);
>  
>  /* intel_extensions.c */
>  extern void intelInitExtensions(struct gl_context *ctx);
> diff --git a/src/mesa/drivers/dri/i965/brw_performance_monitor.c b/src/mesa/drivers/dri/i965/brw_performance_monitor.c
> index 85c395b..707deb2 100644
> --- a/src/mesa/drivers/dri/i965/brw_performance_monitor.c
> +++ b/src/mesa/drivers/dri/i965/brw_performance_monitor.c
> @@ -766,6 +766,14 @@ brw_begin_perf_monitor(struct gl_context *ctx,
>        drm_intel_bo_unmap(monitor->oa_bo);
>  #endif
>  
> +      /* If the OA counters aren't already on, enable them. */
> +      if (brw->perfmon.oa_users == 0) {
> +         /* Ensure the OACONTROL enable and snapshot land in the same batch. */
> +         int space = (MI_REPORT_PERF_COUNT_BATCH_DWORDS + 3) * 4;
> +         intel_batchbuffer_require_space(brw, space, RENDER_RING);
> +         start_oa_counters(brw);
> +      }
> +
>        /* Take a starting OA counter snapshot. */
>        emit_mi_report_perf_count(brw, monitor->oa_bo, 0, REPORT_ID);
>  
> @@ -801,6 +809,9 @@ brw_end_perf_monitor(struct gl_context *ctx,
>                                  SECOND_SNAPSHOT_OFFSET_IN_BYTES, REPORT_ID);
>  
>        --brw->perfmon.oa_users;
> +
> +      if (brw->perfmon.oa_users == 0)
> +         stop_oa_counters(brw);
>     }
>  
>     if (monitor_needs_statistics_registers(brw, m)) {
> @@ -925,6 +936,45 @@ brw_delete_perf_monitor(struct gl_context *ctx, struct gl_perf_monitor_object *m
>  
>  /******************************************************************************/
>  
> +/**
> + * Called at the start of every render ring batch.
> + *
> + * Enable the OA counters if required.
> + */
> +void
> +brw_perf_monitor_new_batch(struct brw_context *brw)
> +{
> +   assert(brw->batch.ring == RENDER_RING);
> +   assert(brw->gen < 6 || brw->batch.used == 0);
> +
> +   if (brw->perfmon.oa_users == 0)
> +      return;
> +
> +   if (brw->gen >= 6)
> +      start_oa_counters(brw);
> +}

Quick question for my understanding: Does this mean that at the end of
each batch we're guaranteed that OA_CTL is again 0?

Some of the considerations that lead to this:
- OA counters are a pretty great information leak. Atm we don't care one
  bit due to lack of ppgtt, but once we have that and once we also have
  the paranoid CS checker we could enforce this in the kernel.

- Usually perf counters requires some priviledges. Atm we can't enforce
  this (and don't care) but should we prep for this with a feature flag
  and an execbuf flag? The execbuf flag might help a bit implementing the
  next point in the kernel, but I don't think it's needed. Without flags
  plan B would be to noop-out OA/perfcounter stuff with the CS checker.

- What about OA users on other rings? Iirc that stuff might be useful for
  opencl running on the VCS ring, too. Otoh we don't have that need right
  now, and the kernel could always just pass an OA token between rings and
  synchronize with semaphores. Without the execbuf flag we'd simply have
  to rely on the CS checker.

Anyway just a few questions I wanted to throw out there for consideration,
current approach looks fine to me. I've read through the entire pile of
patches, so

Acked-by: Daniel Vetter <daniel.vetter at ffwll.ch>

on the series. Whatever that's worth ;-)

Cheers, Daniel
> +
> +/**
> + * Called at the end of every render ring batch.
> + *
> + * Disable the OA counters.
> + *
> + * This relies on there being enough space in BATCH_RESERVED.
> + */
> +void
> +brw_perf_monitor_finish_batch(struct brw_context *brw)
> +{
> +   assert(brw->batch.ring == RENDER_RING);
> +
> +   if (brw->perfmon.oa_users == 0)
> +      return;
> +
> +   if (brw->gen >= 6)
> +      stop_oa_counters(brw);
> +}
> +
> +/******************************************************************************/
> +
>  void
>  brw_init_performance_monitors(struct brw_context *brw)
>  {
> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> index b50b6c7..e3f826d 100644
> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> @@ -187,6 +187,9 @@ intel_batchbuffer_emit_render_ring_prelude(struct brw_context *brw)
>      * what that batch contributed.  Emit state packets to write them to a BO.
>      */
>     brw_emit_query_begin(brw);
> +
> +   /* We may also need to enable OA counters. */
> +   brw_perf_monitor_new_batch(brw);
>  }
>  
>  /**
> @@ -247,6 +250,10 @@ brw_finish_batch(struct brw_context *brw)
>      */
>     brw_emit_query_end(brw);
>  
> +   /* We may also need to disable OA counters. */
> +   if (brw->batch.ring == RENDER_RING)
> +      brw_perf_monitor_finish_batch(brw);
> +
>     if (brw->curbe.curbe_bo) {
>        drm_intel_gem_bo_unmap_gtt(brw->curbe.curbe_bo);
>        drm_intel_bo_unreference(brw->curbe.curbe_bo);
> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
> index 38149f3..d14127e 100644
> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
> @@ -19,8 +19,9 @@ extern "C" {
>   * - Optional MI_NOOP for ensuring the batch length is qword aligned (4 bytes)
>   * - Any state emitted by vtbl->finish_batch():
>   *   - Gen4-5 record ending occlusion query values (4 * 4 = 16 bytes)
> + *   - Disabling OA counters on Gen6+ (3 DWords = 12 bytes)
>   */
> -#define BATCH_RESERVED 24
> +#define BATCH_RESERVED 36
>  
>  struct intel_batchbuffer;
>  
> -- 
> 1.8.3.2
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


More information about the mesa-dev mailing list