[Intel-gfx] [PATCH 00/15] Framework to collect command stream gpu metrics using i915 perf

sourab.gupta at intel.com sourab.gupta at intel.com
Fri Nov 4 09:30:29 UTC 2016


From: Sourab Gupta <sourab.gupta at intel.com>

Refloating the series rebased on Robert's latest patchset. Since Robert's
patches are being reviewed and this patch series extends his framework to
enable multiple concurrent streams to capture command stream based metrics,
it would be good to keep this work in perspective.
Looking to receive feedback on the series (and possibly r-b's :))

This series adds a framework for collecting gpu performance metrics
associated with the command stream of a particular engine. These metrics
include OA reports, timestamps, mmio metrics, etc., and are collected
around batchbuffer boundaries.
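
To illustrate what "collected around batchbuffer boundaries" means for a
consumer, here is a minimal userspace sketch that pairs a sample taken at the
start of a batch with one taken at its end and computes the elapsed GPU ticks.
The struct layout, the field names and the 32-bit timestamp width are
assumptions made for illustration only; they are not the sample format defined
by these patches.

```c
#include <stdint.h>

/* Hypothetical sample captured at one batchbuffer boundary. */
struct cs_sample {
	uint32_t ctx_id;	/* context the batch belongs to */
	uint32_t gpu_ts;	/* raw GPU timestamp, assumed 32-bit, may wrap */
};

/*
 * GPU ticks elapsed between the begin and end samples of one batch.
 * Unsigned subtraction is modulo 2^32, so a single counter wrap between
 * the two samples still yields the right delta.
 */
static uint32_t cs_batch_ticks(const struct cs_sample *begin,
			       const struct cs_sample *end)
{
	return end->gpu_ts - begin->gpu_ts;
}
```

A batch running longer than one full wrap of the counter cannot be measured
this way.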

This work utilizes the underlying infrastructure introduced in Robert Bragg's
patches for collecting periodic OA counter snapshots (based on Haswell):
https://patchwork.freedesktop.org/series/14505/

This patch set is based on the Gen8+ version of Robert's patches, which can be
found here: https://github.com/rib/linux/tree/wip/rib/oa-next

In the previous posting of this series
(https://patchwork.freedesktop.org/series/6154/), based on Chris's suggestion,
I experimented with the cross timestamp framework to retrieve tightly coupled
device/system timestamps. In our case, this framework gives us correlated
pairs of gpu+system time, which can be used over a period of time to correct
the frequency of the timestamp clock, and thus lets us accurately report the
system time (_MONO_RAW) that userspace requests. The results are generally
observed to be noticeably better with cross timestamps, and the frequency
delta gradually tapers down to 0 with increasing correction periods.
Using the cross timestamp framework, though, requires a
cyclecounter/timecounter abstraction for the timestamp clocksource, and
further requires a few changes in the kernel timekeeping/clocksource code. I
am looking for feedback on the use of this framework and the changes involved.
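
As a rough illustration of the frequency correction described above, the
sketch below takes two correlated (gpu ticks, system ns) pairs, derives the
effective tick period as a fixed-point mult/shift value, and uses it to map a
later raw GPU timestamp onto system time. This is plain userspace C and not
the code in these patches; struct xts_pair, estimate_mult() and
gpu_ticks_to_sys_ns() are hypothetical names, and the 64-bit GPU counter is an
assumption.

```c
#include <stdint.h>

/* Hypothetical: one correlated reading of GPU timestamp and system time. */
struct xts_pair {
	uint64_t gpu_ticks;	/* raw GPU timestamp counter value */
	uint64_t sys_ns;	/* system time (e.g. monotonic raw) in ns */
};

/*
 * Estimate the GPU tick period in ns, scaled by 2^shift, from two
 * correlated pairs (a taken before b). Returns 0 if no ticks elapsed.
 * Note: (dns << shift) must fit in 64 bits, which bounds the usable
 * interval between the two pairs for a given shift.
 */
static uint64_t estimate_mult(const struct xts_pair *a,
			      const struct xts_pair *b, unsigned int shift)
{
	uint64_t dticks = b->gpu_ticks - a->gpu_ticks;
	uint64_t dns = b->sys_ns - a->sys_ns;

	if (dticks == 0)
		return 0;
	return (dns << shift) / dticks;
}

/*
 * Convert a raw GPU timestamp to system time, relative to the most
 * recent correlated pair, using the estimated mult/shift.
 */
static uint64_t gpu_ticks_to_sys_ns(const struct xts_pair *base,
				    uint64_t mult, unsigned int shift,
				    uint64_t gpu_ticks)
{
	uint64_t delta = gpu_ticks - base->gpu_ticks;

	return base->sys_ns + ((delta * mult) >> shift);
}
```

Re-running estimate_mult() over successive pairs is what lets the estimated
frequency converge over time; the longer the interval between pairs, the
smaller the effect of per-sample jitter on the estimate.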

These patches can be found for viewing at
https://github.com/sourabgu/linux/tree/oa-19oct

Sourab Gupta (15):
  drm/i915: Add ctx getparam ioctl parameter to retrieve ctx unique id
  drm/i915: Expose OA sample source to userspace
  drm/i915: Framework for capturing command stream based OA reports
  drm/i915: flush periodic samples, in case of no pending CS sample
    requests
  drm/i915: Handle the overflow condition for command stream buf
  drm/i915: Populate ctx ID for periodic OA reports
  drm/i915: Add support for having pid output with OA report
  drm/i915: Add support for emitting execbuffer tags through OA counter
    reports
  drm/i915: Extend i915 perf framework for collecting timestamps on all
    gpu engines
  drm/i915: Extract raw GPU timestamps from OA reports to forward in
    perf samples
  drm/i915: Support opening multiple concurrent perf streams
  time: Expose current clocksource in use by timekeeping framework
  time: export clocks_calc_mult_shift
  drm/i915: Mechanism to forward clock monotonic raw time in perf
    samples
  drm/i915: Support for capturing MMIO register values

 drivers/gpu/drm/i915/i915_drv.c            |    2 +
 drivers/gpu/drm/i915/i915_drv.h            |  112 +-
 drivers/gpu/drm/i915/i915_gem_context.c    |    3 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    6 +
 drivers/gpu/drm/i915/i915_perf.c           | 1911 ++++++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_reg.h            |    6 +
 include/linux/timekeeping.h                |    5 +
 include/uapi/drm/i915_drm.h                |   79 ++
 kernel/time/clocksource.c                  |    1 +
 kernel/time/timekeeping.c                  |   12 +
 10 files changed, 1910 insertions(+), 227 deletions(-)

-- 
1.9.1


