<html> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> </head> <body text="#000000" bgcolor="#FFFFFF"> <p><br> </p> <br> <div class="moz-cite-prefix">On 12/7/2017 6:18 AM, Robert Bragg wrote:<br> </div> <blockquote type="cite" cite="mid:CAMou1-0pQf1k+-Hcu9mupV19DL6h7DsFm4s-g+5fZHUTL5AM1A@mail.gmail.com"> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <div dir="ltr"><br> <div class="gmail_extra"><br> <div class="gmail_quote">On Wed, Nov 15, 2017 at 12:13 PM, Sagar Arun Kamble <span dir="ltr">&lt;<a href="mailto:sagar.a.kamble@intel.com" target="_blank" moz-do-not-send="true">sagar.a.kamble@intel.com</a>&gt;</span> wrote:<br> <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">We can compute system time corresponding to GPU timestamp by taking a<br> reference point (CPU monotonic time, GPU timestamp) and then adding<br> delta time computed using timecounter/cyclecounter support in kernel.<br> We have to configure cyclecounter with the GPU timestamp frequency.<br> Earlier approach that was based on cross-timestamp is not needed. It<br> was being used to approximate the frequency based on invalid assumptions<br> (possibly drift was being seen in the time due to precision issue).<br> The precision of time from GPU clocks is already in ns and timecounter<br> takes care of it as verified over variable durations.<br> </blockquote> <div><br> </div> <div>Hi Sagar,</div> <div><br> </div> <div>I have some doubts about this analysis...<br> </div> <div><br> </div> <div>The intent behind Sourab's original approach was to be able to determine the frequency at runtime empirically because the constants we have aren't particularly accurate. Without a perfectly stable frequency that's known very precisely then an interpolated correlation will inevitably drift. I think the nature of HW implies we can't expect to have either of those. 
Then the general idea had been to try and use existing kernel infrastructure for a problem which isn't unique to GPU clocks.</div> </div> </div> </div> </blockquote> Hi Robert,<br> <br> Testing on SKL shows that the timestamps drift by only about 10 us when sampled in the kernel over about 30 minutes.<br> Verified with the changes from <a href="https://github.com/sakamble/i915-timestamp-support/commits/drm-tip">https://github.com/sakamble/i915-timestamp-support/commits/drm-tip</a><br> Note that since we are sampling the counter through debugfs, the read overhead is likely adding to the drift, so an adjustment might be needed.<br> With OA reports, however, we only have to worry about the initial timecounter setup, where we need an accurate pair of system time and GPU timestamp clock counts.<br> I think the timestamp clock is highly stable and we don't need logic to determine the frequency at runtime. I will try to get confirmation from the HW team as well.<br> <br> If we do need to determine the frequency, Sourab's approach needs to be refined as follows:<br> 1. It can be implemented entirely in i915, because all we need is pairs of system time and GPU clock counts over different durations.<br> 2. The crosstimestamp framework usage in that approach is incorrect: ideally we should be sending the ART counter and the GPU counter. Instead we were<br> hacking it to send the TSC clock.<br> Quoting Thomas from <a class="moz-txt-link-freetext" href="https://patchwork.freedesktop.org/patch/144298/">https://patchwork.freedesktop.org/patch/144298/</a> <br> <address> get_device_system_crosststamp() is for timestamps taken via a clock which is directly correlated with the timekeeper clocksource. 
</address> <address>ART and TSC are correlated via: TSC = (ART * scale) + offset<br> get_device_system_crosststamp() invokes the device function which reads ART, which is converted to CLOCK_MONOTONIC_RAW by the conversion above,<br> and then uses interpolation to map the CLOCK_MONOTONIC_RAW value to CLOCK_MONOTONIC.<br> The device function does not know anything about TSC. All it knows about is ART. </address> I am not aware whether the GPU timestamp clock is correlated with TSC the way ART is for ethernet drivers, or whether i915 can read ART as ethernet drivers do.<br> 3. I have seen precision issues in the calculations in i915_perf_clock_sync_work, and in the usage of MONO_RAW, which might jump time.<br> <blockquote type="cite" cite="mid:CAMou1-0pQf1k+-Hcu9mupV19DL6h7DsFm4s-g+5fZHUTL5AM1A@mail.gmail.com"> <div dir="ltr"> <div class="gmail_extra"> <div class="gmail_quote"> <div><br> </div> <div>That's not to say that a more limited, simpler solution based on frequent re-correlation wouldn't be more than welcome if tracking an accurate frequency is too awkward for now</div> </div> </div> </div> </blockquote> Adjusting the timecounter time can be another option if we confirm that the GPU timestamp frequency is stable.<br> <blockquote type="cite" cite="mid:CAMou1-0pQf1k+-Hcu9mupV19DL6h7DsFm4s-g+5fZHUTL5AM1A@mail.gmail.com"> <div dir="ltr"> <div class="gmail_extra"> <div class="gmail_quote"> <div>, but I think some things need to be considered in that case:</div> <div><br> </div> <div>- It would be good to quantify the kind of drift seen in practice to know how frequently it's necessary to re-synchronize. It sounds like you've done this ("as verified over variable durations") so I'm curious what kind of drift you saw. I'd imagine you would see a significant drift over, say, one second and it might not take much longer for the drift to even become clearly visible to the user when plotted in a UI. 
For reference I once updated the arb_timer_query test in piglit to give some insight into this drift (<a href="https://lists.freedesktop.org/archives/piglit/2016-September/020673.html" moz-do-not-send="true">https://lists.freedesktop.org/archives/piglit/2016-September/020673.html</a>) and at least from what I wrote back then it looks like I was seeing a drift of a few milliseconds per second on SKL. I vaguely recall it being much worse given the frequency constants we had for Haswell.<br> </div> <div><br> </div> </div> </div> </div> </blockquote> On SKL I have seen a very small drift of less than 10 us over a period of 30 minutes.<br> Verified with the changes from <a moz-do-not-send="true" href="https://github.com/sakamble/i915-timestamp-support/commits/drm-tip">https://github.com/sakamble/i915-timestamp-support/commits/drm-tip</a><br> <br> The 36-bit counter will overflow in about 95 min at 12 MHz, and the timecounter framework treats<br> a counter value whose delta from timecounter init is more than half of the total time covered by the counter as a time in the past, so the current approach works for less than 45 min.<br> We will need to add overflow watchdog support, as other drivers do, which simply reinitializes the timecounter before 45 min have elapsed.<br> <br> <blockquote type="cite" cite="mid:CAMou1-0pQf1k+-Hcu9mupV19DL6h7DsFm4s-g+5fZHUTL5AM1A@mail.gmail.com"> <div dir="ltr"> <div class="gmail_extra"> <div class="gmail_quote"> <div>- What guarantees will be promised about monotonicity of correlated system timestamps? Will it be guaranteed that sequential reports must have monotonically increasing timestamps? That might be fiddly if the gpu + system clock are periodically re-correlated, so it might be good to be clear in documentation that the correlation is best-effort only for the sake of implementation simplicity. 
That would still be good for a lot of UIs I think and there's freedom for the driver to start simple and potentially improve later by measuring the gpu clock frequency empirically.<br> </div> <div><br> </div> </div> </div> </div> </blockquote> If we rely on the timecounter alone, without correlation to determine the frequency, setting the init time to the MONOTONIC system time should take care of the monotonicity of the correlated times.<br> <br> Regards,<br> Sagar<br> <blockquote type="cite" cite="mid:CAMou1-0pQf1k+-Hcu9mupV19DL6h7DsFm4s-g+5fZHUTL5AM1A@mail.gmail.com"> <div dir="ltr"> <div class="gmail_extra"> <div class="gmail_quote"> <div>Currently only one correlated pair of timestamps is read when enabling the stream and so a relatively long time is likely to pass before the stream is disabled (seconds, minutes while a user is running a system profiler). It seems very likely to me that these clocks are going to drift significantly without introducing some form of periodic re-synchronization based on some understanding of the drift that's seen.<br> </div> <div> <br> </div> <div>Br,</div> <div>- Robert</div> <div><br> </div> <div><br> </div> <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> This series adds base timecounter/cyclecounter changes and changes to<br> get GPU and CPU timestamps in OA samples.<br> <br> Sagar Arun Kamble (1):<br> drm/i915/perf: Add support to correlate GPU timestamp with system time<br> <br> Sourab Gupta (3):<br> drm/i915/perf: Add support for collecting 64 bit timestamps with OA<br> reports<br> drm/i915/perf: Extract raw GPU timestamps from OA reports<br> drm/i915/perf: Send system clock monotonic time in perf samples<br> <br> drivers/gpu/drm/i915/i915_drv.<wbr>h | 11 ++++<br> drivers/gpu/drm/i915/i915_<wbr>perf.c | 124 ++++++++++++++++++++++++++++++<wbr>++++++++-<br> drivers/gpu/drm/i915/i915_reg.<wbr>h | 6 ++<br> include/uapi/drm/i915_drm.h | 14 +++++<br> 4 files changed, 154 
insertions(+), 1 deletion(-)<br> <span class="gmail-HOEnZb"><font color="#888888"><br> --<br> 1.9.1<br> </font></span></blockquote> </div> <br> </div> </div> </blockquote> <br> </body> </html>