<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 12/7/2017 6:18 AM, Robert Bragg
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAMou1-0pQf1k+-Hcu9mupV19DL6h7DsFm4s-g+5fZHUTL5AM1A@mail.gmail.com">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Wed, Nov 15, 2017 at 12:13 PM,
Sagar Arun Kamble <span dir="ltr">&lt;<a
href="mailto:sagar.a.kamble@intel.com" target="_blank"
moz-do-not-send="true">sagar.a.kamble@intel.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">We can compute system
time corresponding to GPU timestamp by taking a<br>
reference point (CPU monotonic time, GPU timestamp) and
then adding<br>
delta time computed using timecounter/cyclecounter support
in kernel.<br>
We have to configure cyclecounter with the GPU timestamp
frequency.<br>
Earlier approach that was based on cross-timestamp is not
needed. It<br>
was being used to approximate the frequency based on
invalid assumptions<br>
(possibly drift was being seen in the time due to
precision issue).<br>
The precision of time from GPU clocks is already in ns and
timecounter<br>
takes care of it as verified over variable durations.<br>
</blockquote>
<div><br>
</div>
<div>Hi Sagar,</div>
<div><br>
</div>
<div>I have some doubts about this analysis...<br>
</div>
<div><br>
</div>
<div>The intent behind Sourab's original approach was to be
able to determine the frequency at runtime empirically
because the constants we have aren't particularly
accurate. Without a perfectly stable frequency that's
known very precisely then an interpolated correlation will
inevitably drift. I think the nature of HW implies we
can't expect to have either of those. Then the general
idea had been to try and use existing kernel
infrastructure for a problem which isn't unique to GPU
clocks.</div>
</div>
</div>
</div>
</blockquote>
Hi Robert,<br>
<br>
Testing on SKL shows the timestamps drift by only about 10us when
sampled in the kernel over about 30 minutes.<br>
Verified with changes from <a
href="https://github.com/sakamble/i915-timestamp-support/commits/drm-tip">https://github.com/sakamble/i915-timestamp-support/commits/drm-tip</a><br>
Note that since we are sampling the counter through debugfs, the read
overhead is likely adding to the measured drift, so an adjustment
might be needed.<br>
With OA reports, however, we only have to worry about the initial
timecounter setup, where we need an accurate pair of system time and
GPU timestamp clock counts.<br>
I think the timestamp clock is highly stable and we don't need logic
to determine the frequency at runtime. I will try to get confirmation
from the HW team as well.<br>
<br>
If we do need to determine the frequency, Sourab's approach needs to
be refined:<br>
1. It can be implemented entirely in i915, because all we need is a
pair of system time and GPU clock readings taken over different
durations.<br>
2. The crosstimestamp framework usage in that approach is incorrect:
ideally we should be passing the ART counter and the GPU counter.
Instead we were hacking it to pass the TSC clock.<br>
Quoting Thomas from <a class="moz-txt-link-freetext" href="https://patchwork.freedesktop.org/patch/144298/">https://patchwork.freedesktop.org/patch/144298/</a>
<br>
<address>
get_device_system_crosststamp() is for timestamps taken via a
clock which
is directly correlated with the timekeeper clocksource.
</address>
<address>ART and TSC are correlated via: TSC = (ART * scale) +
offset<br>
get_device_system_crosststamp() invokes the device function which
reads
ART, which is converted to CLOCK_MONOTONIC_RAW by the conversion
above,<br>
and
then uses interpolation to map the CLOCK_MONOTONIC_RAW value to
CLOCK_MONOTONIC.<br>
The device function does not know anything about TSC. All it knows
about is
ART.
</address>
I do not know whether the GPU timestamp clock is correlated with TSC
the way ART is for ethernet drivers, or whether i915 can read ART as
those drivers do.<br>
3. I have seen precision issues in the calculations in
i915_perf_clock_sync_work, and it uses MONO_RAW, which might cause
jumps in time.<br>
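To make the relation quoted in point 2 concrete, here is a minimal
sketch of TSC = (ART * scale) + offset. The function name and the
numbers are made up purely for illustration; they are not values read
from any hardware, and this is not the kernel's implementation.<br>

```python
from fractions import Fraction


def art_to_tsc(art, scale, offset):
    """Apply TSC = (ART * scale) + offset, truncating to whole cycles."""
    return int(art * scale) + offset


# Hypothetical values, chosen only to show the shape of the conversion.
scale = Fraction(168, 2)   # ratio of TSC ticks per ART tick (assumed)
offset = 1000              # constant offset in TSC ticks (assumed)

print(art_to_tsc(24_000_000, scale, offset))
```

The point is only that the device side supplies ART and the conversion
to TSC happens elsewhere, which is why passing TSC directly was a hack.<br>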
<blockquote type="cite"
cite="mid:CAMou1-0pQf1k+-Hcu9mupV19DL6h7DsFm4s-g+5fZHUTL5AM1A@mail.gmail.com">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div><br>
</div>
<div>That's not to say that a more limited, simpler solution
based on frequent re-correlation wouldn't be more than
welcome if tracking an accurate frequency is too awkward
for now</div>
</div>
</div>
</div>
</blockquote>
Adjusting the timecounter time can be another option if we confirm
that the GPU timestamp frequency is stable.<br>
<blockquote type="cite"
cite="mid:CAMou1-0pQf1k+-Hcu9mupV19DL6h7DsFm4s-g+5fZHUTL5AM1A@mail.gmail.com">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>, but I think some things need to be considered in that
case:</div>
<div><br>
</div>
<div>- It would be good to quantify the kind of drift seen
in practice to know how frequently it's necessary to
re-synchronize. It sounds like you've done this ("as
verified over variable durations") so I'm curious what
kind of drift you saw. I'd imagine you would see a
significant drift over, say, one second and it might not
take much longer for the drift to even become clearly
visible to the user when plotted in a UI. For reference I
once updated the arb_timer_query test in piglit to give
some insight into this drift (<a
href="https://lists.freedesktop.org/archives/piglit/2016-September/020673.html"
moz-do-not-send="true">https://lists.freedesktop.org/archives/piglit/2016-September/020673.html</a>)
and at least from what I wrote back then it looks like I
was seeing a drift of a few milliseconds per second on
SKL. I vaguely recall it being much worse given the
frequency constants we had for Haswell.<br>
</div>
<div><br>
</div>
</div>
</div>
</div>
</blockquote>
On SKL I have seen a very small drift, less than 10us over a period
of 30 minutes.<br>
Verified with changes from <a moz-do-not-send="true"
href="https://github.com/sakamble/i915-timestamp-support/commits/drm-tip">https://github.com/sakamble/i915-timestamp-support/commits/drm-tip</a><br>
<br>
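For comparing the two observations, a rough sketch that expresses
drift as parts per million of elapsed time. The 10us over 30 minutes
figure is the SKL measurement above; the 1 ms per second input is only
an illustrative stand-in for "a few milliseconds per second", not a
measured value.<br>

```python
def drift_ppm(drift_seconds, elapsed_seconds):
    """Accumulated drift expressed in parts per million of elapsed time."""
    return drift_seconds / elapsed_seconds * 1_000_000


# About 10us over 30 minutes, the SKL figure mentioned above.
skl_ppm = drift_ppm(10e-6, 30 * 60)
# Illustrative stand-in for "a few milliseconds per second": 1 ms in 1 s.
example_ppm = drift_ppm(1e-3, 1.0)

print(f"SKL measurement: {skl_ppm:.4f} ppm")
print(f"1 ms/s example: {example_ppm:.0f} ppm")
```
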
The 36-bit counter will overflow in about 95min at 12MHz. The
timecounter framework treats a counter value whose delta from
timecounter init is more than half of the total time covered by the
counter as time in the past, so the current approach works for less
than 45min.<br>
We will need to add overflow watchdog support, as other drivers do,
which simply reinitializes the timecounter before 45min have
elapsed.<br>
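The overflow figures can be checked with a few lines of arithmetic.
This is a sketch that assumes exactly 36 usable counter bits and a
frequency of exactly 12 MHz; the constant and function names are just
for illustration.<br>

```python
# Rough wrap-time arithmetic for a free-running counter.
# Assumptions (not measured values): 36 usable bits, exactly 12 MHz.
COUNTER_BITS = 36
FREQ_HZ = 12_000_000


def wrap_seconds(bits, freq_hz):
    """Seconds until a counter of the given width wraps at freq_hz."""
    return (2 ** bits) / freq_hz


full_wrap_min = wrap_seconds(COUNTER_BITS, FREQ_HZ) / 60
# Deltas beyond half the counter range are treated as time in the past,
# so the usable window is half the wrap period.
usable_min = full_wrap_min / 2

print(f"full wrap: {full_wrap_min:.1f} min")
print(f"usable window: {usable_min:.1f} min")
```

Under these assumptions the full wrap comes out at roughly 95.4min and
the usable window at roughly 47.7min, so a watchdog period somewhat
below 45min leaves a small margin.<br>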
<br>
<blockquote type="cite"
cite="mid:CAMou1-0pQf1k+-Hcu9mupV19DL6h7DsFm4s-g+5fZHUTL5AM1A@mail.gmail.com">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>- What guarantees will be promised about monotonicity
of correlated system timestamps? Will it be guaranteed
that sequential reports must have monotonically increasing
timestamps? That might be fiddly if the gpu + system clock
are periodically re-correlated, so it might be good to be
clear in documentation that the correlation is best-effort
only for the sake of implementation simplicity. That would
still be good for a lot of UIs I think and there's freedom
for the driver to start simple and potentially improve
later by measuring the gpu clock frequency empirically.<br>
</div>
<div><br>
</div>
</div>
</div>
</div>
</blockquote>
If we rely on the timecounter alone, without correlation to determine
the frequency, setting the init time to the MONOTONIC system time
should take care of the monotonicity of the correlated times.<br>
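As a toy model of that idea (not the kernel timecounter code): keep one
reference pair of system time and GPU cycles, and convert later GPU
timestamps by scaling the cycle delta with a fixed frequency. The class
name, the 12 MHz frequency and the 36-bit width are assumptions for
illustration, and the model only handles timestamps taken after the
reference.<br>

```python
NSEC_PER_SEC = 1_000_000_000


class GpuTimeModel:
    """Toy model: one (system ns, GPU cycles) reference plus a scaled delta."""

    def __init__(self, ref_ns, ref_cycles, freq_hz, bits):
        self.ref_ns = ref_ns
        self.ref_cycles = ref_cycles
        self.freq_hz = freq_hz
        self.modulus = 2 ** bits

    def cycles_to_ns(self, cycles):
        # Modular subtraction tolerates one counter wrap since the reference.
        delta = (cycles - self.ref_cycles) % self.modulus
        return self.ref_ns + delta * NSEC_PER_SEC // self.freq_hz


# Assumed parameters: 12 MHz timestamp clock, 36-bit counter.
model = GpuTimeModel(ref_ns=5_000, ref_cycles=100, freq_hz=12_000_000, bits=36)
one_second_later = model.cycles_to_ns(100 + 12_000_000)
print(one_second_later - 5_000)
```

Since the conversion is a non-decreasing function of the cycle delta,
increasing GPU timestamps map to non-decreasing system times as long as
the reference is not moved and the counter has not wrapped past the
reference.<br>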
<br>
Regards,<br>
Sagar<br>
<blockquote type="cite"
cite="mid:CAMou1-0pQf1k+-Hcu9mupV19DL6h7DsFm4s-g+5fZHUTL5AM1A@mail.gmail.com">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>Currently only one correlated pair of timestamps is
read when enabling the stream and so a relatively long
time is likely to pass before the stream is disabled
(seconds, minutes while a user is running a system
profiler) . It seems very likely to me that these clocks
are going to drift significantly without introducing some
form of periodic re-synchronization based on some
understanding of the drift that's seen.<br>
</div>
<div> <br>
</div>
<div>Br,</div>
<div>- Robert</div>
<div><br>
</div>
<div><br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<br>
This series adds base timecounter/cyclecounter changes and
changes to<br>
get GPU and CPU timestamps in OA samples.<br>
<br>
Sagar Arun Kamble (1):<br>
drm/i915/perf: Add support to correlate GPU timestamp
with system time<br>
<br>
Sourab Gupta (3):<br>
drm/i915/perf: Add support for collecting 64 bit
timestamps with OA<br>
reports<br>
drm/i915/perf: Extract raw GPU timestamps from OA
reports<br>
drm/i915/perf: Send system clock monotonic time in perf
samples<br>
<br>
drivers/gpu/drm/i915/i915_drv.<wbr>h | 11 ++++<br>
drivers/gpu/drm/i915/i915_<wbr>perf.c | 124
++++++++++++++++++++++++++++++<wbr>++++++++-<br>
drivers/gpu/drm/i915/i915_reg.<wbr>h | 6 ++<br>
include/uapi/drm/i915_drm.h | 14 +++++<br>
4 files changed, 154 insertions(+), 1 deletion(-)<br>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
<br>
</body>
</html>