[Intel-gfx] [RFC 4/4] drm/i915/perf: Send system clock monotonic time in perf samples
Chris Wilson
chris at chris-wilson.co.uk
Wed Nov 15 12:31:03 UTC 2017
Quoting Sagar Arun Kamble (2017-11-15 12:13:54)
> From: Sourab Gupta <sourab.gupta at intel.com>
>
> Currently, we have the ability to only forward the GPU timestamps in the
> samples (which are generated via OA reports). This limits the ability to
> correlate these samples with the system events.
>
> An ability is therefore needed to report timestamps in different clock
> domains, such as CLOCK_MONOTONIC, in the perf samples to be of more
> practical use to the userspace. This ability becomes important
> when we want to correlate/plot GPU events/samples with other system events
> on the same timeline (e.g. vblank events, or timestamps when work was
> submitted to kernel, etc.)
>
> The patch here proposes a mechanism to achieve this. The correlation
> between gpu time and system time is established using the timestamp clock
> associated with the command stream, abstracted as timecounter/cyclecounter
> to retrieve gpu/system time correlated values.
>
> v2: Added i915_driver_init_late() function to capture the new late init
> phase for perf (Chris)
>
> v3: Removed cross-timestamp changes.
>
> Signed-off-by: Sourab Gupta <sourab.gupta at intel.com>
> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble at intel.com>
> Cc: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Sourab Gupta <sourab.gupta at intel.com>
> Cc: Matthew Auld <matthew.auld at intel.com>
> ---
> drivers/gpu/drm/i915/i915_perf.c | 27 +++++++++++++++++++++++++++
> include/uapi/drm/i915_drm.h | 7 +++++++
> 2 files changed, 34 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 3b721d7..94ee924 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -336,6 +336,7 @@
>
> #define SAMPLE_OA_REPORT BIT(0)
> #define SAMPLE_GPU_TS BIT(1)
> +#define SAMPLE_SYSTEM_TS BIT(2)
>
> /**
> * struct perf_open_properties - for validated properties given to open a stream
> @@ -622,6 +623,7 @@ static int append_oa_sample(struct i915_perf_stream *stream,
> struct drm_i915_perf_record_header header;
> u32 sample_flags = stream->sample_flags;
> u64 gpu_ts = 0;
> + u64 system_ts = 0;
>
> header.type = DRM_I915_PERF_RECORD_SAMPLE;
> header.pad = 0;
> @@ -647,6 +649,23 @@ static int append_oa_sample(struct i915_perf_stream *stream,
>
> if (copy_to_user(buf, &gpu_ts, I915_PERF_TS_SAMPLE_SIZE))
> return -EFAULT;
> + buf += I915_PERF_TS_SAMPLE_SIZE;
> + }
> +
> + if (sample_flags & SAMPLE_SYSTEM_TS) {
> + gpu_ts = get_gpu_ts_from_oa_report(stream, report);
Scope your variables. That stops us from being confused as to where else
gpu_ts or system_ts may be reused. For instance, I first thought you were
using SAMPLE_GPU_TS to initialise gpu_ts.
> + /*
> + * XXX: timecounter_cyc2time considers time backwards if delta
> + * timestamp is more than half the max ns time covered by
> + * counter. It will be ~35min for 36 bit counter. If this much
> + * sampling duration is needed we will have to update tc->nsec
> + * by explicitly reading the timecounter (timecounter_read)
> + * before this duration.
> + */
> + system_ts = timecounter_cyc2time(&stream->tc, gpu_ts);
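To make the limitation in the XXX comment concrete, here is a simplified userspace model of the half-range rule. It is only a sketch: struct sketch_tc and sketch_cyc2time() are hypothetical names, and the kernel's mult/shift conversion is replaced by an assumed fixed period_ns per tick. The length of the window in real time depends on the timestamp frequency, which this model takes as a parameter rather than assuming a value.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct timecounter. */
struct sketch_tc {
	uint64_t mask;       /* e.g. (1ULL << 36) - 1 for a 36-bit counter */
	uint64_t cycle_last; /* counter value at the last update */
	uint64_t nsec;       /* system time (ns) at the last update */
	uint64_t period_ns;  /* assumed ns per counter tick */
};

/*
 * Convert a raw counter value to system time. A delta larger than half
 * the counter range is interpreted as a timestamp taken *before*
 * cycle_last, which is why cycle_last/nsec must be refreshed at least
 * once per half range if sampling runs that long.
 */
static uint64_t sketch_cyc2time(const struct sketch_tc *tc, uint64_t cycles)
{
	uint64_t delta = (cycles - tc->cycle_last) & tc->mask;

	if (delta > tc->mask / 2) {
		/* Treated as a step backwards in time. */
		delta = (tc->cycle_last - cycles) & tc->mask;
		return tc->nsec - delta * tc->period_ns;
	}

	/* Treated as a step forwards in time. */
	return tc->nsec + delta * tc->period_ns;
}
```

With a 36-bit mask, a timestamp one tick past the halfway point is reported as earlier than the reference time rather than later.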
> +
> + if (copy_to_user(buf, &system_ts, I915_PERF_TS_SAMPLE_SIZE))
> + return -EFAULT;
Advance buf here as well (buf += I915_PERF_TS_SAMPLE_SIZE), as the
SAMPLE_GPU_TS branch above now does.
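Roughly what both comments amount to, as a userspace sketch rather than the actual patch: each timestamp is declared inside the branch that uses it, and buf is advanced after every copy. memcpy() stands in for copy_to_user(), and sketch_append_ts(), sketch_system_time() and TS_SAMPLE_SIZE are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SAMPLE_GPU_TS    (1u << 1)
#define SAMPLE_SYSTEM_TS (1u << 2)
#define TS_SAMPLE_SIZE   sizeof(uint64_t)

/* Hypothetical stand-in for the GPU-to-system conversion: fixed offset. */
static uint64_t sketch_system_time(uint64_t gpu_ts)
{
	return gpu_ts + 1000;
}

/* Append the requested timestamps to buf; returns the bytes written. */
static size_t sketch_append_ts(uint8_t *buf, uint32_t sample_flags,
			       uint64_t report_ts)
{
	uint8_t *start = buf;

	if (sample_flags & SAMPLE_GPU_TS) {
		/* Scoped to this branch, so it cannot leak into the next. */
		uint64_t gpu_ts = report_ts;

		memcpy(buf, &gpu_ts, TS_SAMPLE_SIZE);
		buf += TS_SAMPLE_SIZE;
	}

	if (sample_flags & SAMPLE_SYSTEM_TS) {
		/* Derived here, independently of the SAMPLE_GPU_TS branch. */
		uint64_t system_ts = sketch_system_time(report_ts);

		memcpy(buf, &system_ts, TS_SAMPLE_SIZE);
		buf += TS_SAMPLE_SIZE; /* keep buf valid for whatever follows */
	}

	return (size_t)(buf - start);
}
```

Scoping the variables this way makes it obvious that the system timestamp does not depend on SAMPLE_GPU_TS being set, and the trailing advance means a later sample type can be appended without revisiting this branch.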