[Intel-gfx] [PATCH v3 00/11] Enable Gen 7 Observation Architecture

Robert Bragg robert at sixbynine.org
Mon Aug 15 14:41:17 UTC 2016


Mostly just a rebase on a more recent nightly, except for an update to how
POLLIN events are reported to reduce the CPU overhead that was otherwise seen
while running gputop.

The problem seen with poll was largely a fault with gputop having multiple
redundant 200ms, but out-of-phase, timers tracked in its mainloop resulting in
excessive poll wake ups due to timeouts. This was fixed, but still it
highlighted that the per-fd code that runs after a poll wait wakes (regardless
of the cause of the wake up) can easily become hot and i915 mmio reads here can
quickly jump to the top of a cpu profile.

The main value of the i915-perf poll implementation is for throttling more than
for having a quick notification of new data, so it works nicely to only rely on
the hrtimer callback that's polling for new data to be the thing that flags
POLLIN events and check for that after the wait without any mmio.


Just to plug gputop as a tool for visualising these metrics with, I'm now
automatically publishing a demo of the interface via Travis to
http://gputop.github.io which is also usable with local targets if you point a
browser at E.g.: http://gputop.github.io?target=localhost while you have the
gputop server running. Apologies that currently the demo site on its own (not
connected to real hardware) doesn't show much since it doesn't have any fake
metric values to graph, though it is possible to at least browse the different
platform metric sets and selecting individual counters will show the
description + normalization equation.  Having the web UI hosted on github
hopefully lowers the bar to trying it out since it avoids needing to set up
Emscripten first as a build dependency.

Regards,
- Robert

Robert Bragg (11):
  drm/i915: Add i915 perf infrastructure
  drm/i915: rename OACONTROL GEN7_OACONTROL
  drm/i915: return EACCES for check_cmd() failures
  drm/i915: don't whitelist oacontrol in cmd parser
  drm/i915: Add 'render basic' Haswell OA unit config
  drm/i915: Enable i915 perf stream for Haswell OA unit
  drm/i915: advertise available metrics via sysfs
  drm/i915: Add dev.i915.perf_event_paranoid sysctl option
  drm/i915: add oa_event_min_timer_exponent sysctl
  drm/i915: Add more Haswell OA metric sets
  drm/i915: Add a kerneldoc summary for i915_perf.c

 drivers/gpu/drm/i915/Makefile           |    4 +
 drivers/gpu/drm/i915/i915_cmd_parser.c  |   40 +-
 drivers/gpu/drm/i915/i915_drv.c         |    6 +
 drivers/gpu/drm/i915/i915_drv.h         |  157 +++
 drivers/gpu/drm/i915/i915_gem_context.c |   22 +-
 drivers/gpu/drm/i915/i915_oa_hsw.c      |  659 +++++++++++++
 drivers/gpu/drm/i915/i915_oa_hsw.h      |   38 +
 drivers/gpu/drm/i915/i915_perf.c        | 1615 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h         |  340 ++++++-
 drivers/gpu/drm/i915/intel_ringbuffer.c |    7 +-
 include/uapi/drm/i915_drm.h             |  133 +++
 11 files changed, 2979 insertions(+), 42 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_oa_hsw.c
 create mode 100644 drivers/gpu/drm/i915/i915_oa_hsw.h
 create mode 100644 drivers/gpu/drm/i915/i915_perf.c

-- 
2.9.2



More information about the Intel-gfx mailing list