<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    On 5/27/2021 01:53, Tvrtko Ursulin wrote:<br>
    <blockquote type="cite"
      cite="mid:018044c0-d238-2060-99a4-a477d225781e@linux.intel.com">
      On 26/05/2021 19:45, John Harrison wrote:
      <br>
      <blockquote type="cite">On 5/26/2021 01:40, Tvrtko Ursulin wrote:
        <br>
        <blockquote type="cite">On 25/05/2021 18:52, Matthew Brost
          wrote:
          <br>
          <blockquote type="cite">On Tue, May 25, 2021 at 11:16:12AM
            +0100, Tvrtko Ursulin wrote:
            <br>
            <blockquote type="cite">
              <br>
              On 06/05/2021 20:14, Matthew Brost wrote:
              <br>
              <blockquote type="cite">From: John Harrison
                <a class="moz-txt-link-rfc2396E" href="mailto:John.C.Harrison@Intel.com">&lt;John.C.Harrison@Intel.com&gt;</a>
                <br>
                <br>
                The serial number tracking of engines happens at the
                backend of
                <br>
                request submission and was expecting to only be given
                physical
                <br>
                engines. However, in GuC submission mode, the
                decomposition of virtual
                <br>
                to physical engines does not happen in i915. Instead,
                requests are
                <br>
                submitted to their virtual engine mask all the way
                through to the
                <br>
                hardware (i.e. to GuC). This would mean that the heart
                beat code
                <br>
                thinks the physical engines are idle due to the serial
                number not
                <br>
                incrementing.
                <br>
                <br>
                This patch updates the tracking to decompose virtual
                engines into
                <br>
                their physical constituents and track the request
                against each. This
                <br>
                is not entirely accurate as the GuC will only be issuing
                the request
                <br>
                to one physical engine. However, it is the best that
                i915 can do given
                <br>
                that it has no knowledge of the GuC's scheduling
                decisions.
                <br>
              </blockquote>
              <br>
              Commit text sounds a bit defeatist. I think instead of
              making up the serial
              <br>
              counts, which has downsides (could you please document in
              the commit what
              <br>
              they are), we should think how to design things properly.
              <br>
              <br>
            </blockquote>
            <br>
            IMO, fixing serial counts is outside the scope of this
            series. We
            <br>
            should focus on getting GuC submission in, not on cleaning up
            all the crap
            <br>
            that is in the i915. Let's make a note of this though so we
            can revisit
            <br>
            later.
            <br>
          </blockquote>
          <br>
          I will say again - commit message implies it is introducing an
          unspecified downside by not fully fixing an also unspecified
          issue. It is completely reasonable, and customary even, to ask
          for both to be documented in the commit message.
          <br>
        </blockquote>
        Not sure what exactly is 'unspecified'. I thought the commit
        message described both the problem (heartbeat not running when
        using virtual engines) and the result (heartbeat running on more
        engines than strictly necessary). But in greater detail...
        <br>
        <br>
        The serial number tracking is a hack for the heartbeat code to
        know whether an engine is busy or idle, and therefore whether it
        should be pinged for aliveness. Whenever a submission is made to
        an engine, the serial number is incremented. The heartbeat code
        keeps a copy of the value. If the value has changed, the engine
        is busy and needs to be pinged.
        <br>
        <br>
        This works fine for execlist mode where virtual engine
        decomposition is done inside i915. It fails miserably for GuC
        mode where the decomposition is done by the hardware. The reason
        being that the heartbeat code only looks at physical engines but
        the serial count is only incremented on the virtual engine.
        Thus, the heartbeat sees everything as idle and does not ping.
        <br>
      </blockquote>
      <br>
      So hangcheck does not work. Or it works because GuC does it
      anyway. Either way, that's one thing to explicitly state in the
      commit message.
      <br>
      <br>
      <blockquote type="cite">This patch decomposes the virtual engines
        for the sake of incrementing the serial count on each sub-engine
        in order to keep the heartbeat code happy. The downside is that
        now the heartbeat sees all sub-engines as busy rather than only
        the one the submission actually ends up on. There really isn't
        much that can be done about that. The heartbeat code is in i915
        not GuC, the scheduler is in GuC not i915. The only way to
        improve it is to either move the heartbeat code into GuC as well
        and completely disable the i915 side, or add some way for i915
        to interrogate GuC as to which engines are or are not active.
        Technically, we do have both. For the former, GuC has (or at
        least had) an option to force a context switch on every
        execution quantum pre-emption. However, that is much, much more
        heavyweight than the heartbeat. For the latter, we do (almost)
        have the engine usage statistics for PMU and suchlike. I'm not
        sure how much effort it would be to wire that up to the
        heartbeat code instead of using the serial count.
        <br>
        <br>
        In short, the serial count is ever so slightly inefficient in
        that it causes heartbeat pings on engines which are idle. On the
        other hand, it is way more efficient and simpler than the
        current alternatives.
        <br>
      </blockquote>
      <br>
      And the hack to make hangcheck work creates this inefficiency
      where heartbeats are sent to idle engines. Which is probably fine;
      it just needs to be explained.
      <br>
      <br>
      <blockquote type="cite">Does that answer the questions?
        <br>
      </blockquote>
      <br>
      With the two points I re-raise clearly explained, and possibly
      even the patch title changed, yeah. I just want it to be more
      obvious to the patch reader what it is functionally about - not
      just which implementation details have been changed but why as well.
      <br>
      <br>
    </blockquote>
    My understanding is that we don't explain every piece of code in
    minute detail in every commit message that touches it. I thought my
    description was already pretty verbose. I've certainly seen far less
    informative commit messages that apparently made it through review
    without issue.<br>
    <br>
    Regarding the problem statement, I thought this made it fairly clear
    that the heartbeat was broken for virtual engines:<br>
    <blockquote>This would mean that the heart beat code
      <br>
      thinks the physical engines are idle due to the serial number not
      <br>
      incrementing.
      <br>
    </blockquote>
    <br>
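    As a rough illustration of that failure, here is a toy model. It is
    not the actual i915 code: struct toy_engine, seen_serial and
    toy_heartbeat_needs_ping() are names made up for this sketch, and it
    only assumes the behaviour described above (the heartbeat keeps a
    copy of the serial and pings when it has changed).<br>
    <br>
```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an engine; NOT the real struct intel_engine_cs. */
struct toy_engine {
	unsigned long serial;      /* bumped on every submission */
	unsigned long seen_serial; /* heartbeat's copy from its last check */
};

/*
 * Returns true if the engine looks busy (serial moved since the last
 * check) and should therefore be pinged. Hypothetical helper.
 */
static bool toy_heartbeat_needs_ping(struct toy_engine *e)
{
	assert(e);

	if (e->seen_serial == e->serial)
		return false; /* nothing submitted: treated as idle */

	e->seen_serial = e->serial;
	return true;
}
```
    <br>
    If a submission bumps only the virtual engine's serial, the
    physical engine's serial still matches the heartbeat's copy, so the
    check returns false and no ping is sent.<br>
    <br>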
    As for the inefficiency of heartbeating all physical engines
    in a virtual engine, again, this seems clear to me:<br>
    <blockquote>decompose virtual engines into
      <br>
      their physical constituents and tracks the request against each.
      This
      <br>
      is not entirely accurate as the GuC will only be issuing the
      request
      <br>
      to one physical engine.<br>
    </blockquote>
    <br>
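    To show the trade-off in isolation, here is a standalone sketch of
    the decomposition. Again, this is a toy and not the patch itself:
    struct toy_gt, TOY_MAX_ENGINES and toy_virtual_bump_serial() are
    hypothetical names, and a plain bit mask stands in for the virtual
    engine's mask of physical siblings.<br>
    <br>
```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_ENGINES 8 /* arbitrary size for the sketch */

/* Toy stand-in for the GT: one serial per physical engine. */
struct toy_gt {
	unsigned long serial[TOY_MAX_ENGINES];
};

/*
 * Bump the serial of every physical engine whose bit is set in mask.
 * Only one of them will actually run the request, so the others are
 * marked busy needlessly; that is the inaccuracy being discussed.
 */
static void toy_virtual_bump_serial(struct toy_gt *gt, uint32_t mask)
{
	assert((mask >> TOY_MAX_ENGINES) == 0);

	for (unsigned int id = 0; mask; id++, mask >>= 1) {
		if (mask & 1)
			gt->serial[id]++;
	}
}
```
    <br>
    With a mask covering two siblings, both serials move on a single
    submission, so the heartbeat will ping both even though only one of
    them received the request.<br>
    <br>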
    For the subject, I guess you could say "Track 'heartbeat serial'
    counts for virtual engines". However, the serial count is not
    explicitly named for heartbeats in the code, so it seems inaccurate
    to rename it just for a commit subject.<br>
    <br>
    If you have a suggestion for better wording then feel free to
    propose something.<br>
    <br>
    John.<br>
    <br>
    <br>
    <blockquote type="cite"
      cite="mid:018044c0-d238-2060-99a4-a477d225781e@linux.intel.com">Regards,
      <br>
      <br>
      Tvrtko
      <br>
      <br>
      <blockquote type="cite">John.
        <br>
        <br>
        <br>
        <blockquote type="cite">
          <br>
          If we are abandoning the normal review process someone please
          say so I don't waste my time reading it.
          <br>
          <br>
          Regards,
          <br>
          <br>
          Tvrtko
          <br>
          <br>
          <blockquote type="cite">Matt
            <br>
            <br>
            <blockquote type="cite">Regards,
              <br>
              <br>
              Tvrtko
              <br>
              <br>
              <blockquote type="cite">Signed-off-by: John Harrison
                <a class="moz-txt-link-rfc2396E" href="mailto:John.C.Harrison@Intel.com">&lt;John.C.Harrison@Intel.com&gt;</a>
                <br>
                Signed-off-by: Matthew Brost
                <a class="moz-txt-link-rfc2396E" href="mailto:matthew.brost@intel.com">&lt;matthew.brost@intel.com&gt;</a>
                <br>
                ---
                <br>
                   drivers/gpu/drm/i915/gt/intel_engine_types.h     |  2
                ++
                <br>
                   .../gpu/drm/i915/gt/intel_execlists_submission.c |  6
                ++++++
                <br>
                   drivers/gpu/drm/i915/gt/intel_ring_submission.c  |  6
                ++++++
                <br>
                   drivers/gpu/drm/i915/gt/mock_engine.c            |  6
                ++++++
                <br>
                   .../gpu/drm/i915/gt/uc/intel_guc_submission.c    | 16
                ++++++++++++++++
                <br>
                   drivers/gpu/drm/i915/i915_request.c              |  4
                +++-
                <br>
                   6 files changed, 39 insertions(+), 1 deletion(-)
                <br>
                <br>
                diff --git
                a/drivers/gpu/drm/i915/gt/intel_engine_types.h
                b/drivers/gpu/drm/i915/gt/intel_engine_types.h
                <br>
                index 86302e6d86b2..e2b5cda6dbc4 100644
                <br>
                --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
                <br>
                +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
                <br>
                @@ -389,6 +389,8 @@ struct intel_engine_cs {
                <br>
                       void        (*park)(struct intel_engine_cs
                *engine);
                <br>
                       void        (*unpark)(struct intel_engine_cs
                *engine);
                <br>
                +    void        (*bump_serial)(struct intel_engine_cs
                *engine);
                <br>
                +
                <br>
                       void        (*set_default_submission)(struct
                intel_engine_cs *engine);
                <br>
                       const struct intel_context_ops *cops;
                <br>
                diff --git
                a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
                b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
                <br>
                index ae12d7f19ecd..02880ea5d693 100644
                <br>
                ---
                a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
                <br>
                +++
                b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
                <br>
                @@ -3199,6 +3199,11 @@ static void
                execlists_release(struct intel_engine_cs *engine)
                <br>
                       lrc_fini_wa_ctx(engine);
                <br>
                   }
                <br>
                +static void execlist_bump_serial(struct intel_engine_cs
                *engine)
                <br>
                +{
                <br>
                +    engine->serial++;
                <br>
                +}
                <br>
                +
                <br>
                   static void
                <br>
                   logical_ring_default_vfuncs(struct intel_engine_cs
                *engine)
                <br>
                   {
                <br>
                @@ -3208,6 +3213,7 @@ logical_ring_default_vfuncs(struct
                intel_engine_cs *engine)
                <br>
                       engine->cops = &execlists_context_ops;
                <br>
                       engine->request_alloc =
                execlists_request_alloc;
                <br>
                +    engine->bump_serial = execlist_bump_serial;
                <br>
                       engine->reset.prepare =
                execlists_reset_prepare;
                <br>
                       engine->reset.rewind = execlists_reset_rewind;
                <br>
                diff --git
                a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
                b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
                <br>
                index 14aa31879a37..39dd7c4ed0a9 100644
                <br>
                --- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
                <br>
                +++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
                <br>
                @@ -1045,6 +1045,11 @@ static void setup_irq(struct
                intel_engine_cs *engine)
                <br>
                       }
                <br>
                   }
                <br>
                +static void ring_bump_serial(struct intel_engine_cs
                *engine)
                <br>
                +{
                <br>
                +    engine->serial++;
                <br>
                +}
                <br>
                +
                <br>
                   static void setup_common(struct intel_engine_cs
                *engine)
                <br>
                   {
                <br>
                       struct drm_i915_private *i915 = engine->i915;
                <br>
                @@ -1064,6 +1069,7 @@ static void setup_common(struct
                intel_engine_cs *engine)
                <br>
                       engine->cops = &ring_context_ops;
                <br>
                       engine->request_alloc = ring_request_alloc;
                <br>
                +    engine->bump_serial = ring_bump_serial;
                <br>
                       /*
                <br>
                        * Using a global execution timeline; the
                previous final breadcrumb is
                <br>
                diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c
                b/drivers/gpu/drm/i915/gt/mock_engine.c
                <br>
                index bd005c1b6fd5..97b10fd60b55 100644
                <br>
                --- a/drivers/gpu/drm/i915/gt/mock_engine.c
                <br>
                +++ b/drivers/gpu/drm/i915/gt/mock_engine.c
                <br>
                @@ -292,6 +292,11 @@ static void
                mock_engine_release(struct intel_engine_cs *engine)
                <br>
                       intel_engine_fini_retire(engine);
                <br>
                   }
                <br>
                +static void mock_bump_serial(struct intel_engine_cs
                *engine)
                <br>
                +{
                <br>
                +    engine->serial++;
                <br>
                +}
                <br>
                +
                <br>
                   struct intel_engine_cs *mock_engine(struct
                drm_i915_private *i915,
                <br>
                                       const char *name,
                <br>
                                       int id)
                <br>
                @@ -318,6 +323,7 @@ struct intel_engine_cs
                *mock_engine(struct drm_i915_private *i915,
                <br>
                       engine->base.cops = &mock_context_ops;
                <br>
                       engine->base.request_alloc =
                mock_request_alloc;
                <br>
                +    engine->base.bump_serial = mock_bump_serial;
                <br>
                       engine->base.emit_flush = mock_emit_flush;
                <br>
                       engine->base.emit_fini_breadcrumb =
                mock_emit_breadcrumb;
                <br>
                       engine->base.submit_request =
                mock_submit_request;
                <br>
                diff --git
                a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
                b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
                <br>
                index dc79d287c50a..f0e5731bcef6 100644
                <br>
                --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
                <br>
                +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
                <br>
                @@ -1500,6 +1500,20 @@ static void guc_release(struct
                intel_engine_cs *engine)
                <br>
                       lrc_fini_wa_ctx(engine);
                <br>
                   }
                <br>
                +static void guc_bump_serial(struct intel_engine_cs
                *engine)
                <br>
                +{
                <br>
                +    engine->serial++;
                <br>
                +}
                <br>
                +
                <br>
                +static void virtual_guc_bump_serial(struct
                intel_engine_cs *engine)
                <br>
                +{
                <br>
                +    struct intel_engine_cs *e;
                <br>
                +    intel_engine_mask_t tmp, mask = engine->mask;
                <br>
                +
                <br>
                +    for_each_engine_masked(e, engine->gt, mask, tmp)
                <br>
                +        e->serial++;
                <br>
                +}
                <br>
                +
                <br>
                   static void guc_default_vfuncs(struct intel_engine_cs
                *engine)
                <br>
                   {
                <br>
                       /* Default vfuncs which can be overridden by each
                engine. */
                <br>
                @@ -1508,6 +1522,7 @@ static void
                guc_default_vfuncs(struct intel_engine_cs *engine)
                <br>
                       engine->cops = &guc_context_ops;
                <br>
                       engine->request_alloc = guc_request_alloc;
                <br>
                +    engine->bump_serial = guc_bump_serial;
                <br>
                       engine->sched_engine->schedule =
                i915_schedule;
                <br>
                @@ -1843,6 +1858,7 @@ guc_create_virtual(struct
                intel_engine_cs **siblings, unsigned int count)
                <br>
                       ve->base.cops = &virtual_guc_context_ops;
                <br>
                       ve->base.request_alloc = guc_request_alloc;
                <br>
                +    ve->base.bump_serial = virtual_guc_bump_serial;
                <br>
                       ve->base.submit_request = guc_submit_request;
                <br>
                diff --git a/drivers/gpu/drm/i915/i915_request.c
                b/drivers/gpu/drm/i915/i915_request.c
                <br>
                index 9542a5baa45a..127d60b36422 100644
                <br>
                --- a/drivers/gpu/drm/i915/i915_request.c
                <br>
                +++ b/drivers/gpu/drm/i915/i915_request.c
                <br>
                @@ -692,7 +692,9 @@ bool __i915_request_submit(struct
                i915_request *request)
                <br>
                                        request->ring->vaddr +
                request->postfix);
                <br>
                       trace_i915_request_execute(request);
                <br>
                -    engine->serial++;
                <br>
                +    if (engine->bump_serial)
                <br>
                +        engine->bump_serial(engine);
                <br>
                +
                <br>
                       result = true;
                <br>
                       GEM_BUG_ON(test_bit(I915_FENCE_FLAG_ACTIVE,
                &request->fence.flags));
                <br>
                <br>
              </blockquote>
            </blockquote>
          </blockquote>
        </blockquote>
        <br>
      </blockquote>
    </blockquote>
    <br>
  </body>
</html>