[PATCH 00/23] drm/etnaviv: support performance counters

Christian Gmeiner christian.gmeiner at gmail.com
Tue Sep 12 15:11:32 UTC 2017


In a perfect world we would be able to read GPU registers of interest
via the command stream with a 'read-register' command/packet. Perf
counters must be read in sync with the GPU so that the values can be
related to a draw command. As Vivante GPUs do not provide this
functionality, we need to emulate it in software.


We need to support three different kinds of perf registers:

1) normal register
  This is the easiest case: we simply read the register and we
  are done.

2) debug register
  We need to configure the mux register and then read the debug register value.

3) pipeline register
  We need to 'iterate' over all pixel pipes and sum up the values. The 'iteration'
  is done by selecting the pipe of interest via HI_CLOCK_CONTROL_DEBUG_PIXEL_PIPE.
  The mux register needs to be configured as well.
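
To make the three cases above more concrete, here is a minimal sketch in
plain C. It runs against a made-up register model (struct fake_gpu), not
against real hardware. Only the HI_CLOCK_CONTROL_DEBUG_PIXEL_PIPE idea is
taken from this series; all struct, field and function names as well as
the number of pipes and mux settings are assumptions for illustration.

```c
#include <stdint.h>

/* Hypothetical model, not the real Vivante register layout. */
#define NR_PIXEL_PIPES	2
#define NR_MUX_SETTINGS	4

struct fake_gpu {
	uint32_t normal_reg;	/* plain counter register */
	uint32_t mux;		/* debug mux selection */
	uint32_t pipe_sel;	/* stands in for HI_CLOCK_CONTROL_DEBUG_PIXEL_PIPE */
	uint32_t debug[NR_PIXEL_PIPES][NR_MUX_SETTINGS];
};

/* 1) normal register: a single read is enough */
static uint32_t read_normal(const struct fake_gpu *gpu)
{
	return gpu->normal_reg;
}

/* 2) debug register: program the mux, then read the debug value.
 *    The caller must pass signal < NR_MUX_SETTINGS.
 */
static uint32_t read_debug(struct fake_gpu *gpu, uint32_t signal)
{
	gpu->mux = signal;
	return gpu->debug[gpu->pipe_sel][gpu->mux];
}

/* 3) pipeline register: select each pixel pipe and sum up the values */
static uint32_t read_pipeline(struct fake_gpu *gpu, uint32_t signal)
{
	uint32_t pipe, sum = 0;

	for (pipe = 0; pipe < NR_PIXEL_PIPES; pipe++) {
		gpu->pipe_sel = pipe;
		sum += read_debug(gpu, signal);
	}

	/* switch back to the first pipe once we are done */
	gpu->pipe_sel = 0;

	return sum;
}
```

The pipeline case reuses the debug read, so the only extra work is the
pipe selection and putting the selection back afterwards.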


Letting userspace do all of this on its own feels quite error-prone and not
future-proof. That's why the kernel exports all performance domains and their
signals to userspace via two new ioctls. This way the kernel knows all performance
counters and how to sample them.

At the moment all performance domains and signals get exported to all GPU pipe types,
but that can be changed in follow-up patches.
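
As a rough illustration of the enumeration idea, the sketch below keeps a
static table of domains and signals and offers two iterator-style lookups,
one per query. This is not the uapi from this series: the struct layouts,
function names and signal names are assumptions, loosely modelled on the
query names listed further down.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-kernel tables, for illustration only. */
struct pm_signal {
	const char *name;
};

struct pm_domain {
	const char *name;
	const struct pm_signal *signals;
	uint16_t nr_signals;
};

static const struct pm_signal hi_signals[] = {
	{ "TOTAL_CYCLES" },
	{ "IDLE_CYCLES" },
};

static const struct pm_signal pe_signals[] = {
	{ "PIXELS_RENDERED_2D" },
};

static const struct pm_domain domains[] = {
	{ "HI", hi_signals, 2 },
	{ "PE", pe_signals, 1 },
};

#define NR_DOMAINS (sizeof(domains) / sizeof(domains[0]))

/* Returns the domain at index iter, or NULL past the end of the table. */
static const struct pm_domain *query_domain(uint32_t iter)
{
	return iter < NR_DOMAINS ? &domains[iter] : NULL;
}

/* Returns signal iter of the given domain, or NULL if either is invalid. */
static const struct pm_signal *query_signal(uint32_t domain, uint32_t iter)
{
	const struct pm_domain *dom = query_domain(domain);

	if (!dom || iter >= dom->nr_signals)
		return NULL;

	return &dom->signals[iter];
}
```

Userspace would walk the domains until the lookup fails and do the same
for the signals of each domain, so it never needs to hard-code register
details.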

struct drm_etnaviv_gem_submit was extended to include so-called performance monitor
requests (pmrs). A request defines what domain and signal should be sampled (pre/post
draw cmdbuffer) and where to store the result.
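
A pmr could be pictured roughly as follows. This is only a sketch under
stated assumptions: the field names, the flag values, the plain pointer
used as destination and the fake sampler are all made up here and do not
describe the actual uapi struct from this series.

```c
#include <stdint.h>

/* Hypothetical flag values: sample before or after the draw cmdbuffer. */
#define PMR_PRE		0x1
#define PMR_POST	0x2

struct pmr {
	uint32_t flags;		/* PMR_PRE or PMR_POST */
	uint8_t domain;		/* which performance domain */
	uint16_t signal;	/* which signal inside the domain */
	uint32_t *dest;		/* where to store the result (assumption:
				 * a plain pointer instead of a buffer
				 * object reference)
				 */
};

typedef uint32_t (*sample_fn)(uint8_t domain, uint16_t signal);

/* Fake sampler: a counter that grows on every read. */
static uint32_t fake_cycles;

static uint32_t fake_sample(uint8_t domain, uint16_t signal)
{
	(void)domain;
	fake_cycles += 10 + signal;
	return fake_cycles;
}

/* Process all requests matching the given phase, return how many ran. */
static unsigned int process_pmrs(const struct pmr *pmrs, unsigned int nr,
				 uint32_t phase, sample_fn sample)
{
	unsigned int i, done = 0;

	for (i = 0; i < nr; i++) {
		if (!(pmrs[i].flags & phase))
			continue;

		*pmrs[i].dest = sample(pmrs[i].domain, pmrs[i].signal);
		done++;
	}

	return done;
}
```

Calling process_pmrs() once with PMR_PRE before the draw and once with
PMR_POST after it gives two samples whose difference can be attributed
to that draw.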

The whole series can be found here:
https://github.com/austriancoder/linux/tree/perfmon-v4

The libdrm and mesa branches used to test this feature can be found here:
https://github.com/austriancoder/libdrm/commits/perfmon-v4
https://github.com/austriancoder/mesa/commits/perfmon-v4

GALLIUM_HUD=help will report the following query names:
    fps
    cpu
    cpu0
    cpu1
    cpu2
    cpu3
    prims-emitted
    draw-calls
    rs-operations
    hi-total-cyles
    hi-idle-cyles
    hi-axi-cycles-read-request-stalled
    hi-axi-cycles-write-request-stalled
    hi-axi-cycles-write-data-stalled
    pe-pixel-count-killed-by-color-pipe
    pe-pixel-count-killed-by-depth-pipe
    pe-pixel-count-drawn-by-color-pipe
    pe-pixel-count-drawn-by-depth-pipe
    pe-pixels-rendered-2d
    sh-shader-cycles
    sh-ps-inst-counter
    sh-rendered-pixel-counter
    sh-vs-inst-counter
    sh-rendered-vertice-counter
    sh-vtx-branch-inst-counter
    sh-vtx-texld-inst-counter
    sh-plx-branch-inst-counter
    sh-plx-texld-inst-counter
    pa-input-vtx-counter
    pa-input-prim-counter
    pa-putput-prim-counter
    pa-depth-clipped-counter
    pa-trivial-rejected-counter
    pa-culled-counter
    se-culled-triangle-count
    se-culled-lines-count
    ra-valid-pixel-count
    ra-total-quad-count
    ra-valid-quad-count-after-early-z
    ra-total-primitive-count
    ra-pipe-cache-miss-counter
    ra-prefetch-cache-miss-counter
    ra-pculled-quad-count
    tx-total-bilinear-requests
    tx-total-trilinear-requests
    tx-total-discarded-texutre-requests
    tx-total-texutre-requests
    tx-mem-read-count
    tx-mem-read-in-8b-count
    tx-cache-miss-count
    tx-cache-hit-texel-count
    tx-cache-miss-texel-count
    mc-total-read-req-8b-from-pipeline
    mc-total-read-req-8b-from-ip
    mc-total-write-req-8b-from-pipeline

Changes v1 -> v2:
 - reworked events
 - reworked uapi
 - reworked enumeration of domains and signals
 - process sync point with a work item to keep irq as fast as possible
 - prevent GPU hang when reading pixel pipeline perf values
 - all SH perf counters are accessed via perf_reg_read(..)

Changes v2 -> v3:
 - reworked alloc_event(..)
 - fixed pmr flag validation

Changes v3 -> v4:
 - cherry picked the correct commits (patches 03 and 04)

Happy reviewing!

Christian Gmeiner (23):
  drm/etnaviv: use bitmap to keep track of events
  drm/etnaviv: make it possible to allocate multiple events
  drm/etnaviv: add infrastructure to query perf counter
  drm/etnaviv: add uapi for perfmon feature
  drm/etnaviv: add internal representation of perfmon_request
  drm/etnaviv: extend etnaviv_gpu_cmdbuf_new(..) with nr_pmrs
  drm/etnaviv: add performance monitor request validation
  drm/etnaviv: copy pmrs from userspace
  drm/etnaviv: add performance monitor request processing
  drm/etnaviv: add 'sync point' support
  drm/etnaviv: clear alloced event
  drm/etnaviv: use 'sync points' for performance monitor requests
  drm/etnaviv: add HI perf domain
  drm/etnaviv: add PE perf domain
  drm/etnaviv: add SH perf domain
  drm/etnaviv: add PA perf domain
  drm/etnaviv: add SE perf domain
  drm/etnaviv: add RA perf domain
  drm/etnaviv: add TX perf domain
  drm/etnaviv: add MC perf domain
  drm/etnaviv: need to disable clock gating when doing profiling
  drm/etnaviv: enable debug registers on demand
  drm/etnaviv: submit supports performance monitor requests

 drivers/gpu/drm/etnaviv/Makefile             |   3 +-
 drivers/gpu/drm/etnaviv/etnaviv_buffer.c     |  36 +++
 drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c     |  15 +-
 drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h     |   6 +-
 drivers/gpu/drm/etnaviv/etnaviv_drv.c        |  39 ++-
 drivers/gpu/drm/etnaviv/etnaviv_drv.h        |   1 +
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c |  69 +++-
 drivers/gpu/drm/etnaviv/etnaviv_gpu.c        | 216 ++++++++++---
 drivers/gpu/drm/etnaviv/etnaviv_gpu.h        |  13 +-
 drivers/gpu/drm/etnaviv/etnaviv_perfmon.c    | 451 +++++++++++++++++++++++++++
 drivers/gpu/drm/etnaviv/etnaviv_perfmon.h    |  48 +++
 include/uapi/drm/etnaviv_drm.h               |  43 ++-
 12 files changed, 883 insertions(+), 57 deletions(-)
 create mode 100644 drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
 create mode 100644 drivers/gpu/drm/etnaviv/etnaviv_perfmon.h

-- 
2.13.5


