[Intel-gfx] [RFC 3/4] drm/i915/perf: Extract raw GPU timestamps from OA reports

Sagar Arun Kamble sagar.a.kamble at intel.com
Thu Dec 21 08:50:27 UTC 2017



On 12/7/2017 1:25 AM, Lionel Landwerlin wrote:
> On 15/11/17 12:13, Sagar Arun Kamble wrote:
>> From: Sourab Gupta <sourab.gupta at intel.com>
>>
>> The OA reports contain the least significant 32 bits of the gpu 
>> timestamp.
>> This patch enables retrieval of the timestamp field from OA reports, to
>> forward as 64 bit raw gpu timestamps in the perf samples.
>>
>> v2: Rebase w.r.t new timecounter support.
>>
>> Signed-off-by: Sourab Gupta <sourab.gupta at intel.com>
>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble at intel.com>
>> Cc: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> Cc: Sourab Gupta <sourab.gupta at intel.com>
>> Cc: Matthew Auld <matthew.auld at intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h  |  2 ++
>>   drivers/gpu/drm/i915/i915_perf.c | 26 +++++++++++++++++++++++++-
>>   2 files changed, 27 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index e08bc85..5534cd2 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2151,6 +2151,8 @@ struct i915_perf_stream {
>>        */
>>       struct i915_oa_config *oa_config;
>>   +    u64 last_gpu_ts;
>> +
>>       /**
>>        * System time correlation variables.
>>        */
>> diff --git a/drivers/gpu/drm/i915/i915_perf.c 
>> b/drivers/gpu/drm/i915/i915_perf.c
>> index f7e748c..3b721d7 100644
>> --- a/drivers/gpu/drm/i915/i915_perf.c
>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>> @@ -575,6 +575,26 @@ static int append_oa_status(struct 
>> i915_perf_stream *stream,
>>   }
>>     /**
>> + * get_gpu_ts_from_oa_report - Retrieve absolute gpu timestamp from 
>> OA report
>> + *
>> + * Note: We are assuming that we're updating last_gpu_ts frequently 
>> enough so
>> + * that it's never possible to see multiple overflows before we compare
>> + * sample_ts to last_gpu_ts. Since this is significantly large duration
>> + * (~6min for 80ns ts base), we can safely assume so.
>> + */
>> +static u64 get_gpu_ts_from_oa_report(struct i915_perf_stream *stream,
>> +                     const u8 *report)
>> +{
>> +    u32 sample_ts = *(u32 *)(report + 4);
>> +    u32 delta;
>> +
>> +    delta = sample_ts - (u32)stream->last_gpu_ts;
>> +    stream->last_gpu_ts += delta;
>> +
>> +    return stream->last_gpu_ts;
>> +}
>> +
>> +/**
>>    * append_oa_sample - Copies single OA report into userspace read() 
>> buffer.
>>    * @stream: An i915-perf stream opened for OA metrics
>>    * @buf: destination buffer given by userspace
>> @@ -622,7 +642,9 @@ static int append_oa_sample(struct 
>> i915_perf_stream *stream,
>>       }
>>         if (sample_flags & SAMPLE_GPU_TS) {
>> -        /* Timestamp to be populated from OA report */
>> +        /* Timestamp populated from OA report */
>> +        gpu_ts = get_gpu_ts_from_oa_report(stream, report);
>> +
>>           if (copy_to_user(buf, &gpu_ts, I915_PERF_TS_SAMPLE_SIZE))
>>               return -EFAULT;
>>       }
>
> I think everything above this line should be merged int patch 2.
> It's better to have a single functional patch.
Yes. Will merge.
>
>> @@ -2421,6 +2443,8 @@ static u64 i915_cyclecounter_read(const struct 
>> cyclecounter *cc)
>>                       GEN7_TIMESTAMP_UDW);
>>       intel_runtime_pm_put(dev_priv);
>>   +    stream->last_gpu_ts = ts_count;
>
> This doesn't look right. You're already adding a delta in 
> get_gpu_ts_from_oa_report().
> This will produce incorrect timestamps. Since at the moment we won't 
> allow opening without PROP_SAMPLE_OA, I would just drop this line.
>
Yes. makes sense. will remove.
>> +
>>       return ts_count;
>>   }
>
>



More information about the Intel-gfx mailing list