<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">On 04/04/17 18:45, Robert Bragg wrote:<br>
    </div>
    <blockquote
cite="mid:CAMou1-0aUnyNH4GCsxooOgG1rGASYt=cSTXpLTVKKKU2rDm-rQ@mail.gmail.com"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <div dir="ltr"><br>
        <div class="gmail_extra">On Tue, Apr 4, 2017 at 10:20 AM, Lionel
          Landwerlin <span dir="ltr"><<a moz-do-not-send="true"
              href="mailto:lionel.g.landwerlin@intel.com"
              target="_blank">lionel.g.landwerlin@intel.com</a>></span>
          wrote:<br>
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex">
              <div bgcolor="#FFFFFF"><span class="gmail-">
                  <div
                    class="gmail-m_-5216121689716086367moz-cite-prefix">On
                    03/04/17 21:04, Robert Bragg wrote:<br>
                  </div>
                  <blockquote type="cite">
                    <div dir="auto">
                      <div><br>
                        <div class="gmail_extra"><br>
                          <div class="gmail_quote">On Mar 30, 2017
                            16:16, "Lionel Landwerlin" <<a
                              moz-do-not-send="true"
                              href="mailto:lionel.g.landwerlin@intel.com"
                              target="_blank">lionel.g.landwerlin@intel.com</a><wbr>>
                            wrote:<br type="attribution">
                            <blockquote
                              class="gmail-m_-5216121689716086367quote"
                              style="margin:0px 0px 0px
                              0.8ex;border-left:1px solid
                              rgb(204,204,204);padding-left:1ex">While
                              exercising reading report with moderate
                              load, we might have to<br>
                              wait for all the reports to land in the OA
                              buffer, otherwise we might<br>
                              miss some reports. That means we need to
                              keep on reading the OA stream<br>
                              until the last report we read has a
                              timestamp older that the timestamp<br>
                              recorded by the MI_REPORT_PERF_COUNT at
                              the end of the performance<br>
                              query.<br>
                            </blockquote>
                          </div>
                        </div>
                      </div>
                      <div dir="auto"><br>
                      </div>
                      <div dir="auto">The expectation is that since we
                        have received the second MI_REPORT_PERF_COUNT
                        report that implies any periodic/automatic
                        reports that the could possibly relate to a
                        query must have been written to the OABUFFER.
                        Since we read() as much as is available from
                        i915 perf when we come to finish processing a
                        query then we shouldn't miss anything.</div>
                      <div dir="auto"><br>
                      </div>
                      <div dir="auto">We shouldn't synchronously wait in
                        a busy loop until we've found a report that has
                        a timestamp >= our end MI_RPC because that
                        could be a really significant amount of time
                        (e.g. 50+ milliseconds on HSW). NB. the periodic
                        samples are just to account for counter
                        overflow. On Gen8+ the loop would likely block
                        until the next context switch happens.</div>
                    </div>
                  </blockquote>
                  <br>
                </span> Right, so if we don't get all the reports until
                the second MI_REPORT_PERF_COUNT, we might be dealing
                with a kernel issue.<br>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>Okey, so the realisation we've come to talking about
              this offline is that the kernel's handling of the OA
              unit's tail pointer race condition breaks this assumption
              :-/<br>
              <br>
            </div>
            <div>Even though any necessary reports must have landed in
              the OABUFFER, the kernel will throttle/delay access to the
              most recent reports in case the HW tail pointer has got
              ahead of what's visible to the CPU in memory.<br>
              <br>
            </div>
            <div>To handle this without blocking in a busy loop as above
              I guess we probably need to drive our i915 perf reads()
              when the application asks if a query's results are ready
              via brw_is_perf_query_ready().<br>
              <br>
              It also isn't enough for brw_wait_perf_query to rely on
              drm_intel_bo_wait_rendering() with the MI_RPC bo.<br>
              <br>
            </div>
            <div>We might want to bump the periodic sampling frequency
              to a small multiple of 5 milliseconds to at least reduce
              the worst-case latency for results becoming ready. It
              wouldn't be that worthwhile sampling with a higher
              frequency since the kernel only does tail pointer age
              checks at 200Hz currently.<br>
            </div>
            <div> </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex">
              <div bgcolor="#FFFFFF"> <br>
                I'm still suspicious of the code in mesa. I think in
                some cases (in particular with many applications
                running) it might loop forever.<br>
              </div>
            </blockquote>
            <div><br>
            </div>
            <div>Hope it's ok, but maybe there's an issue, can you maybe
              elaborate on what details looks suspicious to you?<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Missed your question :/<br>
    <br>
    Basically we might end up reading forever if too many context
    switches happen.<br>
    (experienced with igt with a bunch of other applications using the
    GPU)<br>
    <br>
    It feels like another good argument for checking the timestamps.<br>
    <br>
    <blockquote
cite="mid:CAMou1-0aUnyNH4GCsxooOgG1rGASYt=cSTXpLTVKKKU2rDm-rQ@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div> <br>
            </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex">
              <div bgcolor="#FFFFFF"> <br>
                Also the igt tests using roughly the same loop while
                (read() && errno != EINTR) are pretty reliable
                at not finding all the reports we need from to the
                stream.<br>
                So I would expect the same thing from mesa.</div>
            </blockquote>
            <div><br>
            </div>
            <div>Right, I wrote the igt code first and when that looked
              to be working I ported it to Mesa.<br>
              <br>
            </div>
            <div>The issue with the tail pointer race handling by the
              kernel sounds like a likely explanation for seeing issues
              here in both mesa and igt.<br>
              <br>
            </div>
            <div> </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex">
              <div bgcolor="#FFFFFF">
                <div>
                  <div class="gmail-h5"><br>
                    <br>
                    <blockquote type="cite">
                      <div dir="auto">
                        <div dir="auto"><br>
                        </div>
                        <div dir="auto">Was this proposal based on a
                          code inspection or did you maybe hit a problem
                          correctly measuring something?</div>
                        <div dir="auto"><br>
                        </div>
                        <div dir="auto">Maybe if based on inspection we
                          can find somewhere to put a comment to clarify
                          the assumptions above?</div>
                        <div dir="auto"><br>
                        </div>
                        <div dir="auto">Br,</div>
                        <div dir="auto">- Robert</div>
                        <div dir="auto"><br>
                        </div>
                        <div dir="auto">
                          <div class="gmail_extra">
                            <div class="gmail_quote">
                              <blockquote
                                class="gmail-m_-5216121689716086367quote"
                                style="margin:0px 0px 0px
                                0.8ex;border-left:1px solid
                                rgb(204,204,204);padding-left:1ex"> <br>
                                Signed-off-by: Lionel Landwerlin <<a
                                  moz-do-not-send="true"
                                  href="mailto:lionel.g.landwerlin@intel.com"
                                  target="_blank">lionel.g.landwerlin@intel.com</a><wbr>><br>
                                Cc: Robert Bragg <<a
                                  moz-do-not-send="true"
                                  href="mailto:robert@sixbynine.org"
                                  target="_blank">robert@sixbynine.org</a>><br>
                                ---<br>
                                 src/mesa/drivers/dri/i965/<wbr>brw_performance_query.c
                                | 51 ++++++++++++++---------<br>
                                 1 file changed, 32 insertions(+), 19
                                deletions(-)<br>
                                <br>
                                diff --git
                                a/src/mesa/drivers/dri/i965/br<wbr>w_performance_query.c
                                b/src/mesa/drivers/dri/i965/br<wbr>w_performance_query.c<br>
                                index 4a94e4b3cc..076c59e633 100644<br>
                                --- a/src/mesa/drivers/dri/i965/br<wbr>w_performance_query.c<br>
                                +++ b/src/mesa/drivers/dri/i965/br<wbr>w_performance_query.c<br>
                                @@ -647,38 +647,50 @@
                                discard_all_queries(struct brw_context
                                *brw)<br>
                                 }<br>
                                <br>
                                 static bool<br>
                                -read_oa_samples(struct brw_context
                                *brw)<br>
                                +read_oa_samples_until(struct
                                brw_context *brw,<br>
                                +                      uint32_t
                                start_timestamp,<br>
                                +                      uint32_t
                                end_timestamp)<br>
                                 {<br>
                                -   while (1) {<br>
                                +   uint32_t last_timestamp =
                                start_timestamp;<br>
                                +<br>
                                +   while (last_timestamp <
                                end_timestamp) {<br>
                                       struct brw_oa_sample_buf *buf =
                                get_free_sample_buf(brw);<br>
                                +      uint32_t offset;<br>
                                       int len;<br>
                                <br>
                                       while ((len =
                                read(brw->perfquery.oa_stream_<wbr>fd,
                                buf->buf,<br>
                                -                       
                                 sizeof(buf->buf))) < 0 &&
                                errno == EINTR)<br>
                                +                       
                                 sizeof(buf->buf))) < 0 &&<br>
                                +             (errno == EINTR || errno
                                == EAGAIN))<br>
                                          ;<br>
                                <br>
                                       if (len <= 0) {<br>
                                         
                                exec_list_push_tail(&brw->perf<wbr>query.free_sample_buffers,
                                &buf->link);<br>
                                <br>
                                -         if (len < 0) {<br>
                                -            if (errno == EAGAIN)<br>
                                -               return true;<br>
                                -            else {<br>
                                -               DBG("Error reading i915
                                perf samples: %m\n");<br>
                                -               return false;<br>
                                -            }<br>
                                -         } else {<br>
                                +         if (len < 0)<br>
                                +            DBG("Error reading i915
                                perf samples: %m\n");<br>
                                +         else<br>
                                             DBG("Spurious EOF reading
                                i915 perf samples\n");<br>
                                -            return false;<br>
                                -         }<br>
                                +<br>
                                +         return false;<br>
                                       }<br>
                                <br>
                                       buf->len = len;<br>
                                     
                                 exec_list_push_tail(&brw->per<wbr>fquery.sample_buffers,
                                &buf->link);<br>
                                +<br>
                                +      /* Go through the reports and
                                update the last timestamp. */<br>
                                +      while (offset < buf->len) {<br>
                                +         const struct
                                drm_i915_perf_record_header *header =<br>
                                +            (const struct
                                drm_i915_perf_record_header *)
                                &buf->buf[offset];<br>
                                +         uint32_t *report = (uint32_t
                                *) (header + 1);<br>
                                +<br>
                                +         if (header->type ==
                                DRM_I915_PERF_RECORD_SAMPLE)<br>
                                +            last_timestamp = report[1];<br>
                                +<br>
                                +         offset += header->size;<br>
                                +      }<br>
                                    }<br>
                                <br>
                                -   unreachable("not reached");<br>
                                -   return false;<br>
                                +   return true;<br>
                                 }<br>
                                <br>
                                 /**<br>
                                @@ -709,10 +721,6 @@
                                accumulate_oa_reports(struct brw_context
                                *brw,<br>
                                <br>
                                    assert(o->Ready);<br>
                                <br>
                                -   /* Collect the latest periodic OA
                                reports from i915 perf */<br>
                                -   if (!read_oa_samples(brw))<br>
                                -      goto error;<br>
                                -<br>
                                    drm_intel_bo_map(obj-><a
                                  moz-do-not-send="true"
                                  href="http://oa.bo" rel="noreferrer"
                                  target="_blank">oa.bo</a>, false);<br>
                                    query_buffer =
                                obj->oa.bo->virtual;<br>
                                <br>
                                @@ -728,6 +736,11 @@
                                accumulate_oa_reports(struct brw_context
                                *brw,<br>
                                       goto error;<br>
                                    }<br>
                                <br>
                                +   /* Collect the latest periodic OA
                                reports from i915 perf */<br>
                                +   if (!read_oa_samples_until(brw,
                                start[1], end[1]))<br>
                                +      goto error;<br>
                                +<br>
                                +<br>
                                    /* See if we have any periodic
                                reports to accumulate too... */<br>
                                <br>
                                    /* N.B. The oa.samples_head was set
                                when the query began and<br>
                                <font color="#888888">--<br>
                                  2.11.0<br>
                                  <br>
                                </font></blockquote>
                            </div>
                            <br>
                          </div>
                        </div>
                      </div>
                    </blockquote>
                    <p><br>
                    </p>
                  </div>
                </div>
              </div>
            </blockquote>
          </div>
          <br>
        </div>
      </div>
    </blockquote>
    <p><br>
    </p>
  </body>
</html>