[PATCH 00/23] drm/etnaviv: support performance counters
Christian Gmeiner
christian.gmeiner at gmail.com
Tue Sep 12 15:11:32 UTC 2017
In a perfect world we would be able to read GPU registers of interest
via the command stream with a 'read-register' command/packet. Perf
counters must be read in sync with the GPU so that their values can be
related to a draw command. As Vivante GPUs do not provide this
functionality, we need to emulate it in software.
We need to support three different kinds of perf registers:
1) normal register
This is the easiest case: we simply read the register and we
are done.
2) debug register
We need to configure the mux register and then read the debug register value.
3) pipeline register
We need to 'iterate' over all pixel pipes and sum up the values. The 'iteration'
is done by selecting the pipe of interest via HI_CLOCK_CONTROL_DEBUG_PIXEL_PIPE.
The mux register needs to be configured as well.
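The three read strategies above can be sketched against a simulated register
file. This is only an illustration, not the code from the series: struct
fake_gpu, the REG_* offsets and the read_*() helpers are made-up names, and
REG_PIPE_SEL merely stands in for the pipe selection done via
HI_CLOCK_CONTROL_DEBUG_PIXEL_PIPE.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register offsets of a simulated register file. */
enum {
	REG_COUNTER  = 0,	/* plain counter register */
	REG_MUX      = 1,	/* selects the signal shown by REG_DEBUG */
	REG_DEBUG    = 2,	/* debug register, depends on mux and pipe */
	REG_PIPE_SEL = 3,	/* selects the active pixel pipe */
	NUM_REGS     = 4,
};

#define MAX_PIPES	4
#define MAX_SIGNALS	8

struct fake_gpu {
	uint32_t regs[NUM_REGS];
	unsigned int nr_pipes;
	/* per-pipe, per-signal values the "hardware" would report */
	uint32_t signals[MAX_PIPES][MAX_SIGNALS];
};

uint32_t gpu_read(const struct fake_gpu *gpu, unsigned int reg)
{
	assert(reg < NUM_REGS);

	if (reg == REG_DEBUG) {
		uint32_t pipe = gpu->regs[REG_PIPE_SEL] % MAX_PIPES;
		uint32_t sig = gpu->regs[REG_MUX] % MAX_SIGNALS;

		return gpu->signals[pipe][sig];
	}

	return gpu->regs[reg];
}

void gpu_write(struct fake_gpu *gpu, unsigned int reg, uint32_t val)
{
	assert(reg < NUM_REGS);
	gpu->regs[reg] = val;
}

/* 1) normal register: a single read. */
uint32_t read_normal(const struct fake_gpu *gpu, unsigned int reg)
{
	return gpu_read(gpu, reg);
}

/* 2) debug register: program the mux, then read the debug register. */
uint32_t read_debug(struct fake_gpu *gpu, uint32_t signal)
{
	gpu_write(gpu, REG_MUX, signal);
	return gpu_read(gpu, REG_DEBUG);
}

/* 3) pipeline register: select each pixel pipe in turn and sum up. */
uint32_t read_pipeline(struct fake_gpu *gpu, uint32_t signal)
{
	uint32_t sum = 0;
	unsigned int pipe;

	for (pipe = 0; pipe < gpu->nr_pipes && pipe < MAX_PIPES; pipe++) {
		gpu_write(gpu, REG_PIPE_SEL, pipe);
		sum += read_debug(gpu, signal);
	}

	/* switch back to pipe 0 so later debug reads see a known state */
	gpu_write(gpu, REG_PIPE_SEL, 0);

	return sum;
}
```

Case 3 is built on top of case 2 here, which mirrors the description: a
pipeline read is a debug read repeated once per pixel pipe.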
Letting userspace do all of this on its own feels quite error-prone and not
future-proof. That's why the kernel exports all performance domains and their
signals to userspace via two new ioctls. This way the kernel knows all performance
counters and how to sample them.
At the moment all performance domains and signals get exported to all GPU pipe types,
but that can be changed in follow-up patches.
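To illustrate the idea of enumerating domains and signals through two queries,
here is a rough userspace-style sketch without any real ioctl. All names
(pm_query_domain(), pm_query_signal(), PM_ITER_END, the struct layouts and the
domain/signal names in the table) are assumptions made up for this example and
are not the uapi added by the series.

```c
#include <stdint.h>
#include <stdio.h>

#define PM_ITER_END	0xffff	/* hypothetical "no more entries" marker */

struct pm_domain {
	const char *name;
	const char *const *signals;
	uint16_t nr_signals;
};

/* Illustrative table only; the real domains live in the kernel. */
static const char *const hi_signals[] = { "TOTAL_CYCLES", "IDLE_CYCLES" };
static const char *const pe_signals[] = { "PIXELS_RENDERED_2D" };

static const struct pm_domain domains[] = {
	{ "HI", hi_signals, 2 },
	{ "PE", pe_signals, 1 },
};

#define NR_DOMAINS	(sizeof(domains) / sizeof(domains[0]))

struct pm_query_domain {
	uint16_t iter;		/* in: entry to query, out: next entry */
	uint16_t id;
	uint16_t nr_signals;
	char name[64];
};

struct pm_query_signal {
	uint16_t domain;	/* in: domain id to look at */
	uint16_t iter;		/* in: entry to query, out: next entry */
	uint16_t id;
	char name[64];
};

/* First query: describe one domain and advance the iterator. */
int pm_query_domain(struct pm_query_domain *q)
{
	const struct pm_domain *dom;

	if (q->iter >= NR_DOMAINS)
		return -1;

	dom = &domains[q->iter];
	q->id = q->iter;
	q->nr_signals = dom->nr_signals;
	snprintf(q->name, sizeof(q->name), "%s", dom->name);
	q->iter = (q->iter + 1u < NR_DOMAINS) ? q->iter + 1u : PM_ITER_END;

	return 0;
}

/* Second query: describe one signal of a domain and advance the iterator. */
int pm_query_signal(struct pm_query_signal *q)
{
	const struct pm_domain *dom;

	if (q->domain >= NR_DOMAINS)
		return -1;

	dom = &domains[q->domain];
	if (q->iter >= dom->nr_signals)
		return -1;

	q->id = q->iter;
	snprintf(q->name, sizeof(q->name), "%s", dom->signals[q->iter]);
	q->iter = (q->iter + 1u < dom->nr_signals) ? q->iter + 1u : PM_ITER_END;

	return 0;
}

/* Walk all domains and all of their signals, as a client would. */
unsigned int pm_count_all_signals(void)
{
	struct pm_query_domain dq = { .iter = 0 };
	unsigned int total = 0;

	while (dq.iter != PM_ITER_END && pm_query_domain(&dq) == 0) {
		struct pm_query_signal sq = { .domain = dq.id, .iter = 0 };

		while (sq.iter != PM_ITER_END && pm_query_signal(&sq) == 0)
			total++;
	}

	return total;
}
```

With an iterator of this kind userspace does not need to know the number of
domains or signals up front, so new counters can be added on the kernel side
without touching the client.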
struct drm_etnaviv_gem_submit was extended to include so-called performance monitor
requests (pmrs). A request defines what domain and signal should be sampled (pre/post
draw cmdbuffer) and where to store the result.
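A request of this kind could look roughly like the sketch below. The field
names, the PMR_PROCESS_* flags and the checks in pmr_validate() are assumptions
for illustration only; the actual layout and validation rules are defined by
the uapi and validation patches of the series.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags: sample before and/or after the draw cmdbuffer. */
#define PMR_PROCESS_PRE		0x1
#define PMR_PROCESS_POST	0x2
#define PMR_VALID_FLAGS		(PMR_PROCESS_PRE | PMR_PROCESS_POST)

/* Hypothetical request: what to sample, when, and where to store it. */
struct pmr {
	uint32_t flags;		/* PMR_PROCESS_* */
	uint8_t  domain;	/* performance domain to sample */
	uint16_t signal;	/* signal within that domain */
	uint32_t read_offset;	/* byte offset into the result buffer */
};

/*
 * Reject requests with no or unknown flags, unknown domain/signal ids,
 * or a destination that is unaligned or outside the result buffer.
 * Returns 0 if the request is acceptable, -1 otherwise.
 */
int pmr_validate(const struct pmr *r, uint8_t nr_domains,
		 uint16_t nr_signals, size_t buf_size)
{
	if (r->flags == 0 || (r->flags & ~(uint32_t)PMR_VALID_FLAGS))
		return -1;

	if (r->domain >= nr_domains || r->signal >= nr_signals)
		return -1;

	if (r->read_offset % sizeof(uint32_t))
		return -1;

	if (buf_size < sizeof(uint32_t) ||
	    r->read_offset > buf_size - sizeof(uint32_t))
		return -1;

	return 0;
}

/* Store a sampled 32-bit value at the location named by the request. */
void pmr_store(const struct pmr *r, uint32_t *buf, uint32_t value)
{
	buf[r->read_offset / sizeof(uint32_t)] = value;
}
```

Validating every request before the submit is queued keeps the later
processing step simple, since it can then write the sampled value without
further checks.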
The whole series can be found here:
https://github.com/austriancoder/linux/tree/perfmon-v4
The libdrm and mesa branches used to test this feature can be found here:
https://github.com/austriancoder/libdrm/commits/perfmon-v4
https://github.com/austriancoder/mesa/commits/perfmon-v4
GALLIUM_HUD=help reports the following query names:
fps
cpu
cpu0
cpu1
cpu2
cpu3
prims-emitted
draw-calls
rs-operations
hi-total-cyles
hi-idle-cyles
hi-axi-cycles-read-request-stalled
hi-axi-cycles-write-request-stalled
hi-axi-cycles-write-data-stalled
pe-pixel-count-killed-by-color-pipe
pe-pixel-count-killed-by-depth-pipe
pe-pixel-count-drawn-by-color-pipe
pe-pixel-count-drawn-by-depth-pipe
pe-pixels-rendered-2d
sh-shader-cycles
sh-ps-inst-counter
sh-rendered-pixel-counter
sh-vs-inst-counter
sh-rendered-vertice-counter
sh-vtx-branch-inst-counter
sh-vtx-texld-inst-counter
sh-plx-branch-inst-counter
sh-plx-texld-inst-counter
pa-input-vtx-counter
pa-input-prim-counter
pa-putput-prim-counter
pa-depth-clipped-counter
pa-trivial-rejected-counter
pa-culled-counter
se-culled-triangle-count
se-culled-lines-count
ra-valid-pixel-count
ra-total-quad-count
ra-valid-quad-count-after-early-z
ra-total-primitive-count
ra-pipe-cache-miss-counter
ra-prefetch-cache-miss-counter
ra-pculled-quad-count
tx-total-bilinear-requests
tx-total-trilinear-requests
tx-total-discarded-texutre-requests
tx-total-texutre-requests
tx-mem-read-count
tx-mem-read-in-8b-count
tx-cache-miss-count
tx-cache-hit-texel-count
tx-cache-miss-texel-count
mc-total-read-req-8b-from-pipeline
mc-total-read-req-8b-from-ip
mc-total-write-req-8b-from-pipeline
Changes v1 -> v2:
- reworked events
- reworked uapi
- reworked enumeration of domains and signals
- process sync point with a work item to keep irq as fast as possible
- prevent GPU hang when reading pixel pipeline perf values
- all SH perf counters are accessed via perf_reg_read(..)
Changes v2 -> v3:
- reworked alloc_event(..)
- fixed pmr flag validation
Changes v3 -> v4:
- cherry picked the correct commits (patches 03 and 04)
Happy reviewing!
Christian Gmeiner (23):
drm/etnaviv: use bitmap to keep track of events
drm/etnaviv: make it possible to allocate multiple events
drm/etnaviv: add infrastructure to query perf counter
drm/etnaviv: add uapi for perfmon feature
drm/etnaviv: add internal representation of perfmon_request
drm/etnaviv: extend etnaviv_gpu_cmdbuf_new(..) with nr_pmrs
drm/etnaviv: add performance monitor request validation
drm/etnaviv: copy pmrs from userspace
drm/etnaviv: add performance monitor request processing
drm/etnaviv: add 'sync point' support
drm/etnaviv: clear alloced event
drm/etnaviv: use 'sync points' for performance monitor requests
drm/etnaviv: add HI perf domain
drm/etnaviv: add PE perf domain
drm/etnaviv: add SH perf domain
drm/etnaviv: add PA perf domain
drm/etnaviv: add SE perf domain
drm/etnaviv: add RA perf domain
drm/etnaviv: add TX perf domain
drm/etnaviv: add MC perf domain
drm/etnaviv: need to disable clock gating when doing profiling
drm/etnaviv: enable debug registers on demand
drm/etnaviv: submit supports performance monitor requests
drivers/gpu/drm/etnaviv/Makefile | 3 +-
drivers/gpu/drm/etnaviv/etnaviv_buffer.c | 36 +++
drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.c | 15 +-
drivers/gpu/drm/etnaviv/etnaviv_cmdbuf.h | 6 +-
drivers/gpu/drm/etnaviv/etnaviv_drv.c | 39 ++-
drivers/gpu/drm/etnaviv/etnaviv_drv.h | 1 +
drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 69 +++-
drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 216 ++++++++++---
drivers/gpu/drm/etnaviv/etnaviv_gpu.h | 13 +-
drivers/gpu/drm/etnaviv/etnaviv_perfmon.c | 451 +++++++++++++++++++++++++++
drivers/gpu/drm/etnaviv/etnaviv_perfmon.h | 48 +++
include/uapi/drm/etnaviv_drm.h | 43 ++-
12 files changed, 883 insertions(+), 57 deletions(-)
create mode 100644 drivers/gpu/drm/etnaviv/etnaviv_perfmon.c
create mode 100644 drivers/gpu/drm/etnaviv/etnaviv_perfmon.h
--
2.13.5