[PATCH 00/14] i915 perf support for command stream based OA, GPU and workload metrics capture
Kamble, Sagar A
sagar.a.kamble at intel.com
Thu Jul 27 09:29:26 UTC 2017
Hi Lionel,
I have only posted this series to trybot so far, while I address the time-sync-related functionality.
Based on my analysis so far, we will not need the changes to kernel/time/timekeeping.c that Sourab made in one of the patches.
I have reworked that functionality and am currently testing it. I will post an update in a couple of days.
If required, I can send just the other patches, addressing your comments so far, to intel-gfx by tomorrow.
Thanks
Sagar
-----Original Message-----
From: Landwerlin, Lionel G
Sent: Monday, July 24, 2017 5:15 PM
To: Kamble, Sagar A <sagar.a.kamble at intel.com>; Intel-gfx-trybot at lists.freedesktop.org
Cc: Auld, Matthew <matthew.auld at intel.com>; Chris Wilson <chris at chris-wilson.co.uk>
Subject: Re: [PATCH 00/14] i915 perf support for command stream based OA, GPU and workload metrics capture
Hi Sagar,
I've started looking into your series.
It doesn't appear in the main intel-gfx mailing list (Intel-gfx at lists.freedesktop.org).
Would you mind sending it there next time?
Thanks!
-
Lionel
On 14/07/17 19:51, Sagar Arun Kamble wrote:
> This series is prepared from the two series below, posted by Sourab in March.
> 1. https://patchwork.freedesktop.org/series/21351/ - Collect command stream
> based OA reports using i915 perf
> 2. https://patchwork.freedesktop.org/series/21352/ - Collect command stream
> based GPU metrics for all engines using i915 perf
>
> This series addresses most of the review comments on the two series above.
> The major change is moving the stream structure and information from
> dev_priv to per-engine structures. The intent of this series is restated
> below, reframed from the cover letters of the earlier series.
>
> This series adds a framework for:
> 1. Collection of OA reports associated with the render command stream,
> which are collected around batchbuffer boundaries.
> 2. Collection of other metadata such as ctx_id, pid, tag etc. with the
> samples, so that the samples collected can be associated with the
> corresponding process/workload.
> 3. Collection of GPU performance metrics associated with the command
> stream of a particular engine. These metrics include timestamps of
> work submission and completion on engines, mmio metrics, etc. These
> metrics are collected around batchbuffer boundaries.
>
> There are a couple of patches which add support for using the
> cross-timestamp framework for retrieving tightly coupled device/system timestamps.
> In our case, this framework enables us to have correlated pairs of
> gpu+system time which can be used over a period of time to correct the
> frequency of the timestamp clock, and thus enables us to accurately send
> system time (_MONO_RAW) as requested to userspace. The results are
> generally observed to be considerably better with the use of cross timestamps,
> and the frequency delta gradually tapers down to 0 with increasing correction periods.
> The use of the cross-timestamp framework, though, requires us to have a
> cyclecounter/timecounter abstraction for the timestamp clocksource,
> and further requires a few changes in the kernel timekeeping/clocksource code.
>
> Pending issues to be addressed in this series:
> 1. cross-timestamp sync patches need to be reworked as requested by kernel
> maintainers.
> 2. Some of the data types being collected through these patches could instead
> be collected in userspace; whether to do so is yet to be finalized.
> 3. Add support in the perf IGT tests for verifying CS based perf functionality.
>
> Cc: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
> Cc: Matthew Auld <matthew.auld at intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>
> Sourab Gupta (14):
> drm/i915: Add ctx getparam ioctl parameter to retrieve ctx unique id
> drm/i915: Expose OA sample source to userspace
> drm/i915: Framework for capturing command stream based OA reports and
> ctx id info.
> drm/i915: Flush periodic samples, in case of no pending CS sample
> requests
> drm/i915: Inform userspace about command stream OA buf overflow
> drm/i915: Populate ctx ID for periodic OA reports
> drm/i915: Add support for having pid output with OA report
> drm/i915: Add support for emitting execbuffer tags through OA counter
> reports
> drm/i915: Add support for collecting timestamps on all gpu engines
> drm/i915: Extract raw GPU timestamps from OA reports to forward in
> perf samples
> drm/i915: Async check for streams data availability with hrtimer
> rescheduling
> time: Expose current clocksource in use by timekeeping framework
> drm/i915: Mechanism to forward clock monotonic raw time in perf
> samples
> drm/i915: Support for capturing MMIO register values
>
> drivers/gpu/drm/i915/i915_drv.c | 15 +
> drivers/gpu/drm/i915/i915_drv.h | 194 ++-
> drivers/gpu/drm/i915/i915_gem.c | 1 +
> drivers/gpu/drm/i915/i915_gem_context.c | 3 +
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 11 +
> drivers/gpu/drm/i915/i915_perf.c | 2022 ++++++++++++++++++++++++----
> drivers/gpu/drm/i915/i915_reg.h | 6 +
> drivers/gpu/drm/i915/intel_engine_cs.c | 4 +
> drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +
> drivers/gpu/drm/i915/intel_ringbuffer.h | 8 +
> include/linux/timekeeping.h | 5 +
> include/uapi/drm/i915_drm.h | 76 ++
> kernel/time/timekeeping.c | 12 +
> 13 files changed, 2110 insertions(+), 249 deletions(-)
>
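
The frequency correction described in the quoted cover letter can be illustrated with a small userspace sketch. This is only an illustration of the arithmetic, not code from the series or from i915: the names CrossTimestamp, estimate_freq_hz and gpu_to_sys_ns are hypothetical, the 12 MHz nominal frequency in the usage note is an assumed value, and counter wrap-around is deliberately ignored. Given two correlated (GPU ticks, system ns) pairs, it estimates the effective timestamp frequency and uses it to convert a later GPU timestamp to system time.

```python
from dataclasses import dataclass

NSEC_PER_SEC = 1_000_000_000


@dataclass(frozen=True)
class CrossTimestamp:
    """One correlated reading (hypothetical type, for illustration only)."""
    gpu_ticks: int  # raw GPU timestamp counter value
    sys_ns: int     # system time read at the same instant, in nanoseconds


def estimate_freq_hz(first: CrossTimestamp, last: CrossTimestamp) -> float:
    """Estimate the effective GPU timestamp frequency from two correlated pairs.

    A longer interval between the pairs averages out read jitter, which is
    why the estimate improves with increasing correction periods.
    """
    dticks = last.gpu_ticks - first.gpu_ticks
    dns = last.sys_ns - first.sys_ns
    if dticks <= 0 or dns <= 0:
        # Wrap-around and reordered samples are not handled in this sketch.
        raise ValueError("pairs must be strictly increasing")
    return dticks * NSEC_PER_SEC / dns


def gpu_to_sys_ns(anchor: CrossTimestamp, freq_hz: float, gpu_ticks: int) -> int:
    """Convert a GPU timestamp to system time, relative to a known pair."""
    delta_ns = (gpu_ticks - anchor.gpu_ticks) * NSEC_PER_SEC / freq_hz
    return anchor.sys_ns + round(delta_ns)
```

As a usage example, take an assumed nominal frequency of 12,000,000 Hz and a counter that actually runs 10 ppm fast. Two pairs taken one second apart, CrossTimestamp(1000, 5_000_000) and CrossTimestamp(12_001_120, 1_005_000_000), give an estimate of 12,000,120 Hz. Converting a timestamp 6,000,060 ticks after the second pair then lands 500 ms later in system time, whereas the nominal frequency would place it 5 us too late.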