    <div class="moz-cite-prefix">On 12/7/2017 6:18 AM, Robert Bragg
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAMou1-0pQf1k+-Hcu9mupV19DL6h7DsFm4s-g+5fZHUTL5AM1A@mail.gmail.com">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <div dir="ltr"><br>
        <div class="gmail_extra"><br>
          <div class="gmail_quote">On Wed, Nov 15, 2017 at 12:13 PM,
            Sagar Arun Kamble <span dir="ltr"><<a
                href="mailto:sagar.a.kamble@intel.com" target="_blank"
                moz-do-not-send="true">sagar.a.kamble@intel.com</a>></span>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex">We can compute system
              time corresponding to GPU timestamp by taking a<br>
              reference point (CPU monotonic time, GPU timestamp) and
              then adding<br>
              delta time computed using timecounter/cyclecounter support
              in kernel.<br>
              We have to configure cyclecounter with the GPU timestamp
              frequency.<br>
              Earlier approach that was based on cross-timestamp is not
              needed. It<br>
              was being used to approximate the frequency based on
              invalid assumptions<br>
              (possibly drift was being seen in the time due to
              precision issue).<br>
              The precision of time from GPU clocks is already in ns and
              timecounter<br>
              takes care of it as verified over variable durations.<br>
            </blockquote>
>
> Hi Sagar,
>
> I have some doubts about this analysis...
>
> The intent behind Sourab's original approach was to be able to
> determine the frequency at runtime empirically, because the constants
> we have aren't particularly accurate. Without a perfectly stable
> frequency that's known very precisely, an interpolated correlation
> will inevitably drift, and I think the nature of the HW implies we
> can't expect to have either of those. The general idea, then, had been
> to try and use existing kernel infrastructure for a problem which
> isn't unique to GPU clocks.

Hi Robert,

Testing on SKL shows the timestamps drift by only about 10us over
roughly 30 minutes of sampling done in the kernel.
Verified with the changes from
https://github.com/sakamble/i915-timestamp-support/commits/drm-tip

Note that since we are sampling the counter from debugfs, the overhead
of the read is likely adding to the drift, so some adjustment might be
needed. But with OA reports we only have to worry about the initial
timecounter setup, where we need one accurate pair of system time and
GPU timestamp clock counts.
I think the timestamp clock is highly stable and we don't need logic to
determine the frequency at runtime. Will try to get confirmation from
the HW team as well.
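
For reference, a minimal sketch of the timecounter setup I have in mind,
assuming a hypothetical i915_read_gpu_ts() register read and the 12 MHz,
36-bit timestamp counter (none of these names are actual i915 symbols):

#include <linux/clocksource.h>
#include <linux/ktime.h>
#include <linux/timecounter.h>

#define GPU_TS_HZ	12000000u		/* timestamp counter frequency */
#define GPU_TS_MASK	GENMASK_ULL(35, 0)	/* 36-bit counter */

extern u64 i915_read_gpu_ts(void);	/* hypothetical raw register read */

static u64 gpu_ts_read(const struct cyclecounter *cc)
{
	return i915_read_gpu_ts() & GPU_TS_MASK;
}

static struct cyclecounter gpu_cc = {
	.read = gpu_ts_read,
	.mask = GPU_TS_MASK,
};
static struct timecounter gpu_tc;

static void gpu_timecounter_init(void)
{
	/*
	 * Derive mult/shift so 12 MHz cycles convert to ns with minimal
	 * rounding error over an hour-long conversion window.
	 */
	clocks_calc_mult_shift(&gpu_cc.mult, &gpu_cc.shift,
			       GPU_TS_HZ, NSEC_PER_SEC, 3600);

	/*
	 * Start the timecounter at the current CLOCK_MONOTONIC time so
	 * converted GPU timestamps are directly comparable with system
	 * monotonic timestamps.
	 */
	timecounter_init(&gpu_tc, &gpu_cc, ktime_get_ns());
}

A raw GPU timestamp from an OA report would then map to system time with
timecounter_cyc2time(&gpu_tc, report_ts).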

If we do need to determine the frequency, Sourab's approach needs to be
refined as follows:
1. It can be implemented entirely in i915, because what we need is a
pair of (system time, GPU clock) samples taken over different durations
(see the frequency-measurement sketch after this list).
2. The crosstimestamp framework usage in that approach is incorrect:
ideally we should be passing the ART counter and the GPU counter, but
instead we were hacking it to pass the TSC clock.
Quoting Thomas from https://patchwork.freedesktop.org/patch/144298/

    get_device_system_crosststamp() is for timestamps taken via a clock
    which is directly correlated with the timekeeper clocksource.

    ART and TSC are correlated via: TSC = (ART * scale) + offset

    get_device_system_crosststamp() invokes the device function which
    reads ART, which is converted to CLOCK_MONOTONIC_RAW by the
    conversion above, and then uses interpolation to map the
    CLOCK_MONOTONIC_RAW value to CLOCK_MONOTONIC. The device function
    does not know anything about TSC. All it knows about is ART.

I am not aware whether the GPU timestamp clock is correlated with TSC
the way ART is for the ethernet drivers, or whether i915 can read ART
the way the ethernet drivers do.
3. I have seen precision issues in the calculations in
i915_perf_clock_sync_work, and in its usage of MONO_RAW, which might
jump in time.
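
To illustrate point 1, a sketch of how the frequency could be estimated
entirely within i915, with no crosstimestamp framework involved: sample
a (CLOCK_MONOTONIC_RAW, GPU counter) pair twice and divide. It reuses
the hypothetical helpers from the sketch above; the latency between the
system-time read and the register read bounds the precision, so a longer
window gives a better estimate.

#include <linux/delay.h>
#include <linux/math64.h>

static u32 gpu_ts_measure_freq(void)
{
	u64 t0, t1, c0, c1, cycles;

	t0 = ktime_get_raw_ns();	/* RAW: not slewed by NTP mid-window */
	c0 = i915_read_gpu_ts();
	msleep(2000);			/* longer window -> smaller relative error */
	t1 = ktime_get_raw_ns();
	c1 = i915_read_gpu_ts();

	cycles = (c1 - c0) & GPU_TS_MASK;	/* tolerate one 36-bit wrap */

	/*
	 * frequency in Hz = elapsed cycles / elapsed seconds. Keep the
	 * window under ~25 min so cycles * NSEC_PER_SEC fits in a u64.
	 */
	return div64_u64(cycles * NSEC_PER_SEC, t1 - t0);
}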
    <blockquote type="cite"
cite="mid:CAMou1-0pQf1k+-Hcu9mupV19DL6h7DsFm4s-g+5fZHUTL5AM1A@mail.gmail.com">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div><br>
            </div>
            <div>That's not to say that a more limited, simpler solution
              based on frequent re-correlation wouldn't be more than
              welcome if tracking an accurate frequency is too awkward
              for now</div>
          </div>
        </div>
      </div>
    </blockquote>

Adjusting the timecounter time periodically can be another option, if we
confirm that the GPU timestamp frequency is stable.
    <blockquote type="cite"
cite="mid:CAMou1-0pQf1k+-Hcu9mupV19DL6h7DsFm4s-g+5fZHUTL5AM1A@mail.gmail.com">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>, but I think some things need to be considered in that
              case:</div>
            <div><br>
            </div>
            <div>- It would be good to quantify the kind of drift seen
              in practice to know how frequently it's necessary to
              re-synchronize. It sounds like you've done this ("as
              verified over variable durations") so I'm curious what
              kind of drift you saw. I'd imagine you would see a
              significant drift over, say, one second and it might not
              take much longer for the drift to even become clearly
              visible to the user when plotted in a UI. For reference I
              once updated the arb_timer_query test in piglit to give
              some insight into this drift (<a
href="https://lists.freedesktop.org/archives/piglit/2016-September/020673.html"
                moz-do-not-send="true">https://lists.freedesktop.org/archives/piglit/2016-September/020673.html</a>)
              and at least from what I wrote back then it looks like I
              was seeing a drift of a few milliseconds per second on
              SKL. I vaguely recall it being much worse given the
              frequency constants we had for Haswell.<br>
            </div>
            <div><br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>

On SKL I have seen a very small drift, of less than 10us, over a period
of 30 minutes.
Verified with the changes from
https://github.com/sakamble/i915-timestamp-support/commits/drm-tip

The 36-bit counter will overflow in about 95 minutes at 12 MHz
(2^36 cycles / 12 MHz ~= 5726 s), and the timecounter framework treats
a counter value whose delta from timecounter init exceeds half the
total time covered by the counter as a time in the past, so the current
approach works for less than ~45 minutes.
We will need to add overflow-watchdog support, like other drivers have,
which just refreshes the timecounter before those 45 minutes elapse
(see the sketch below).
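
A sketch of such a watchdog, loosely modeled on other timecounter users
and reusing the hypothetical names above; rather than a full
timecounter_init(), a periodic timecounter_read() is enough to keep the
cycle deltas unambiguous. The period here is a quarter of the ~95 min
wrap, comfortably under the half-range limit:

#include <linux/workqueue.h>

#define GPU_TS_WRAP_JIFFIES	(HZ * (GPU_TS_MASK / GPU_TS_HZ))	/* ~5726 s */

static struct delayed_work gpu_ts_overflow_work;

static void gpu_ts_overflow_fn(struct work_struct *work)
{
	/*
	 * timecounter_read() folds the cycles elapsed since the last
	 * read into the accumulated nanoseconds, so later deltas never
	 * span more than half the 36-bit range.
	 */
	timecounter_read(&gpu_tc);
	schedule_delayed_work(&gpu_ts_overflow_work, GPU_TS_WRAP_JIFFIES / 4);
}

static void gpu_ts_overflow_start(void)
{
	INIT_DELAYED_WORK(&gpu_ts_overflow_work, gpu_ts_overflow_fn);
	schedule_delayed_work(&gpu_ts_overflow_work, GPU_TS_WRAP_JIFFIES / 4);
}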

> - What guarantees will be promised about the monotonicity of
> correlated system timestamps? Will it be guaranteed that sequential
> reports must have monotonically increasing timestamps? That might be
> fiddly if the GPU + system clocks are periodically re-correlated, so
> it might be good to be clear in the documentation that the correlation
> is best-effort only, for the sake of implementation simplicity. That
> would still be good for a lot of UIs, I think, and there's freedom for
> the driver to start simple and potentially improve later by measuring
> the GPU clock frequency empirically.

If we rely on the timecounter alone, without correlating to determine
the frequency, then initializing the timecounter with the MONOTONIC
system time (as in the setup sketch above) should take care of the
monotonicity of the correlated times.

Regards,
Sagar
    <blockquote type="cite"
cite="mid:CAMou1-0pQf1k+-Hcu9mupV19DL6h7DsFm4s-g+5fZHUTL5AM1A@mail.gmail.com">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>Currently only one correlated pair of timestamps is
              read when enabling the stream and so a relatively long
              time is likely to pass before the stream is disabled
              (seconds, minutes while a user is running a system
              profiler) . It seems very likely to me that these clocks
              are going to drift significantly without introducing some
              form of periodic re-synchronization based on some
              understanding of the drift that's seen.<br>
            </div>
            <div>  <br>
            </div>
            <div>Br,</div>
            <div>- Robert</div>
            <div><br>
            </div>
            <div><br>
            </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex">
              <br>
              This series adds base timecounter/cyclecounter changes and
              changes to<br>
              get GPU and CPU timestamps in OA samples.<br>
              <br>
              Sagar Arun Kamble (1):<br>
                drm/i915/perf: Add support to correlate GPU timestamp
              with system time<br>
              <br>
              Sourab Gupta (3):<br>
                drm/i915/perf: Add support for collecting 64 bit
              timestamps with OA<br>
                  reports<br>
                drm/i915/perf: Extract raw GPU timestamps from OA
              reports<br>
                drm/i915/perf: Send system clock monotonic time in perf
              samples<br>
              <br>
               drivers/gpu/drm/i915/i915_drv.<wbr>h  |  11 ++++<br>
               drivers/gpu/drm/i915/i915_<wbr>perf.c | 124
              ++++++++++++++++++++++++++++++<wbr>++++++++-<br>
               drivers/gpu/drm/i915/i915_reg.<wbr>h  |   6 ++<br>
               include/uapi/drm/i915_drm.h      |  14 +++++<br>
               4 files changed, 154 insertions(+), 1 deletion(-)<br>
              <span class="gmail-HOEnZb"><font color="#888888"><br>
                  --<br>
                  1.9.1<br>
                  <br>
                  ______________________________<wbr>_________________<br>
                  Intel-gfx mailing list<br>
                  <a href="mailto:Intel-gfx@lists.freedesktop.org"
                    moz-do-not-send="true">Intel-gfx@lists.freedesktop.<wbr>org</a><br>
                  <a
                    href="https://lists.freedesktop.org/mailman/listinfo/intel-gfx"
                    rel="noreferrer" target="_blank"
                    moz-do-not-send="true">https://lists.freedesktop.org/<wbr>mailman/listinfo/intel-gfx</a><br>
                </font></span></blockquote>