[Intel-gfx] [RFC 0/4] GPU/CPU timestamps correlation for relating OA samples with system events

Fri Dec 22 10:16:17 UTC 2017

On 22/12/17 09:30, Sagar Arun Kamble wrote:
>
>
>
> On 12/21/2017 6:29 PM, Lionel Landwerlin wrote:
>> Some more findings I made while playing with this series & GPUTop.
>> Turns out the 2ms drift per second is due to timecounter. Adding the
>> delta this way:
>>
>> https://github.com/djdeath/linux/commit/7b002cb360483e331053aec0f98433a5bd5c5c3f#diff-9b74bd0cfaa90b601d80713c7bd56be4R607
>>
>> Eliminates the drift.
> I see two important changes: 1. approximation of the start time during
> init_timecounter, 2. overflow handling in delta accumulation.
> With these incorporated, I guess timecounter should also work in the
> same fashion.

I think the arithmetic in timecounter is inherently lossy and that's why 
we're seeing a drift. Could we be using it wrong?

In the patch above, I think there is still a drift because we can lose
the fractional part of a nanosecond at every delta we add.
But that loss is at most a fraction of a nanosecond multiplied by the
number of reports over a period of time.
With a report every 1us, that is 10^6 reports per second each losing
less than 1ns, so it should stay below 1ms of drift over 1s.
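
To put numbers on that, here is a minimal Python sketch of the truncation
loss. It is only a model, not the kernel's timecounter code: the 12MHz
timestamp frequency, the 7-tick report spacing and the helper names are
assumptions picked for illustration.

```python
NSEC_PER_SEC = 1_000_000_000
FREQ_HZ = 12_000_000  # assumption: illustrative timestamp frequency


def ticks_to_ns(ticks, freq_hz):
    """Convert ticks to nanoseconds, dropping the fractional part."""
    return ticks * NSEC_PER_SEC // freq_hz


def accumulated_ns(deltas, freq_hz):
    """Convert each per-report delta on its own, then sum the results."""
    return sum(ticks_to_ns(d, freq_hz) for d in deltas)


def truncation_drift_ns(deltas, freq_hz):
    """Nanoseconds lost by per-delta conversion vs. one whole conversion."""
    return ticks_to_ns(sum(deltas), freq_hz) - accumulated_ns(deltas, freq_hz)


# 120000 reports spaced 7 ticks apart: each conversion loses 1/3 ns.
print(truncation_drift_ns([7] * 120_000, FREQ_HZ))  # 40000
```

The loss per report is always below 1ns, so the total is bounded by the
number of reports expressed in nanoseconds.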

We can probably do better by always computing the clock from the entire
delta since the start point rather than from the sum of per-report
deltas.
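
A rough sketch of that idea, in Python and not taken from the patch:
accumulate raw ticks (which is exact), then convert the whole tick count
to nanoseconds in one go. The 32-bit timestamp width, the 12MHz frequency
and the class name are assumptions.

```python
NSEC_PER_SEC = 1_000_000_000
TS_MASK = (1 << 32) - 1  # assumption: 32-bit timestamps in the reports


class TotalDeltaClock:
    """Hypothetical helper: converts the total tick delta, not each step."""

    def __init__(self, start_ts, start_ns, freq_hz):
        self.last_ts = start_ts & TS_MASK
        self.start_ns = start_ns
        self.freq_hz = freq_hz
        self.total_ticks = 0  # exact integer count, never truncated

    def update(self, ts):
        """Feed the next raw timestamp, return the correlated time in ns."""
        ts &= TS_MASK
        # Masked subtraction keeps the delta correct across a counter wrap.
        self.total_ticks += (ts - self.last_ts) & TS_MASK
        self.last_ts = ts
        # Single conversion of the whole delta: at most 1ns of error total.
        return self.start_ns + self.total_ticks * NSEC_PER_SEC // self.freq_hz


clock = TotalDeltaClock(start_ts=0, start_ns=0, freq_hz=12_000_000)
for i in range(1, 120_001):
    now_ns = clock.update(7 * i)
print(now_ns)  # 70000000
```

The error no longer grows with the number of reports, since only the
final conversion can truncate.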

>> Timelines of perf i915 tracepoints & OA reports now make a lot more 
>> sense.
>>
>> There is still the issue that reading the CPU clock & the RCS 
>> timestamp is inherently not atomic. So there is a delta there.
>> I think we should add a new i915 perf record type to express the
>> delta that we measure this way:
>>
>> https://github.com/djdeath/linux/commit/7b002cb360483e331053aec0f98433a5bd5c5c3f#diff-9b74bd0cfaa90b601d80713c7bd56be4R2475
>>
>> So that userspace knows there might be a global offset between the 2 
>> times and is able to present it.
> Agreed on this. The delta ns1-ns0 can be interpreted as the max drift.
>> Measurements on my KBL system were on the order of tens of
>> microseconds (~30us).
>> I guess we might be able to set up the correlation point better
>> (masking interrupts?) to reduce the delta.
> We are already using spin_lock. Do you mean NMI?

I don't actually know much on this point.
If spin_lock is the best we can do, then that's it :)
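
For reference, the ns0/ns1 bracketing discussed above boils down to
something like the following Python sketch. read_device_ts is a
hypothetical callable standing in for the RCS timestamp read; the real
code runs in the kernel under the spin_lock, so this only shows the
shape of the measurement.

```python
import time


def correlate(read_device_ts, read_cpu_ns=time.monotonic_ns):
    """Sample the CPU clock on both sides of the device timestamp read.

    Returns (device_ts, ns0, window_ns). The device timestamp was taken
    somewhere inside [ns0, ns0 + window_ns], so window_ns (ns1 - ns0)
    bounds the offset between the two clocks at the correlation point.
    """
    ns0 = read_cpu_ns()
    device_ts = read_device_ts()
    ns1 = read_cpu_ns()
    return device_ts, ns0, ns1 - ns0


# Usage with fake clocks standing in for the real reads:
fake_cpu = iter([100, 130])
print(correlate(lambda: 42, lambda: next(fake_cpu)))  # (42, 100, 30)
```

Userspace could then present window_ns as the uncertainty on the
correlation.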

>>
>> Thanks,
>>
>> -
>> Lionel
>>
>>
>> On 07/12/17 00:57, Robert Bragg wrote:
>>>
>>>
>>> On Thu, Dec 7, 2017 at 12:48 AM, Robert Bragg
>>> <robert at sixbynine.org> wrote:
>>>
>>>
>>>     at least from what I wrote back then it looks like I was seeing
>>>     a drift of a few milliseconds per second on SKL. I vaguely
>>>     recall it being much worse given the frequency constants we had
>>>     for Haswell.
>>>
>>>
>>> Sorry I didn't actually re-read my own message properly before 
>>> referencing it :) Apparently the 2ms per second drift was for 
>>> Haswell, so presumably not quite so bad for SKL.
>>>
>>> - Robert
>>>
>>>
>>>
>>
>>
>
