<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;
      charset=windows-1252">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>BTW, this also seems to be what breaks suspend/resume.<br>
    </p>
    <p><br>
    </p>
    <p>Andrey<br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 09/21/2018 01:56 PM, Andrey
      Grodzovsky wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:681ddd4e-6bd2-db28-4286-2cc577d0f00a@amd.com">
      <p>No worries, I will just revert locally until then to clear the
        extra errors during my investigation of current GPU reset status
        and issues.</p>
      <p><br>
      </p>
      <p>Andrey<br>
      </p>
      <br>
      <div class="moz-cite-prefix">On 09/21/2018 01:53 PM, Christian
        König wrote:<br>
      </div>
      <blockquote type="cite"
        cite="mid:04944e7b-044b-4b16-3d2f-e760eedcee9a@gmail.com">
        <div class="moz-cite-prefix">I unfortunately don't have a
          Polaris to test this myself.<br>
          <br>
          But please give me time till Monday so that I can at least try
          one more thing to fix it.<br>
          <br>
          Christian.<br>
          <br>
          On 21.09.2018 at 19:11, Andrey Grodzovsky wrote:<br>
        </div>
        <blockquote type="cite"
          cite="mid:c81338de-5fc7-3be3-961a-bba0eba05351@amd.com">
          <p>Ping...</p>
          <p><br>
          </p>
          <p>Andrey<br>
          </p>
          <br>
          <div class="moz-cite-prefix">On 09/20/2018 04:35 PM, Andrey
            Grodzovsky wrote:<br>
          </div>
          <blockquote type="cite"
            cite="mid:4afeb01c-37e9-ca76-8055-5dd15fca98d3@amd.com">
            <p>What's the status of this error and the suggested patch
              to fix it? It impacts GPU reset on Polaris11.</p>
            <p>Do we want to investigate why the original patch breaks
              it, or just disable the KIQ IB test with the proposed patch?</p>
            <p><br>
            </p>
            <p>P.S. Suspend/resume also stopped working on the latest
              branch; I will bisect it later today or tomorrow.</p>
            <p><br>
            </p>
            <p>Andrey<br>
            </p>
            <br>
            <div class="moz-cite-prefix">On 09/18/2018 11:00 AM,
              Christian König wrote:<br>
            </div>
            <blockquote type="cite"
              cite="mid:edd44be9-2ef3-3c39-3342-5d3b4bbfa40a@amd.com">
              <div class="moz-cite-prefix">Tom,<br>
                <br>
                can you check whether the following makes it work again?<br>
                <br>
                diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
                b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c<br>
                index b6160de70d12..d65f5ba92fc5 100644<br>
                --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c<br>
                +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c<br>
                @@ -937,6 +937,10 @@ static int
                gfx_v8_0_ring_test_ib(struct amdgpu_ring *ring, long
                timeout)<br>
                        return r;<br>
                 }<br>
                 <br>
                +static int gfx_v8_0_kiq_ring_test_ib(struct amdgpu_ring
                *ring, long timeout)<br>
                +{<br>
                +       return 0;<br>
                +}<br>
                 <br>
                 static void gfx_v8_0_free_microcode(struct
                amdgpu_device *adev)<br>
                 {<br>
                @@ -7174,7 +7178,7 @@ static const struct
                amdgpu_ring_funcs gfx_v8_0_ring_funcs_kiq = {<br>
                        .emit_ib = gfx_v8_0_ring_emit_ib_compute,<br>
                        .emit_fence = gfx_v8_0_ring_emit_fence_kiq,<br>
                        .test_ring = gfx_v8_0_ring_test_ring,<br>
                -       .test_ib = gfx_v8_0_ring_test_ib,<br>
                +       .test_ib = gfx_v8_0_kiq_ring_test_ib,<br>
                        .insert_nop = amdgpu_ring_insert_nop,<br>
                        .pad_ib = amdgpu_ring_generic_pad_ib,<br>
                        .emit_rreg = gfx_v8_0_ring_emit_rreg,<br>
                <br>
                <br>
                Thanks,<br>
                Christian.<br>
                <br>
                On 18.09.2018 at 16:41, Christian König wrote:<br>
              </div>
              <blockquote type="cite"
                cite="mid:4a250398-d2ac-1650-739d-e4a6598f1c48@gmail.com">
                <div class="moz-cite-prefix">CRTC and GFX interrupts
                  seem to be working perfectly fine.<br>
                  <br>
                  The problem here seems to be that only EOP interrupts
                  from the compute queue are not handled correctly.<br>
                  <br>
                  Most likely a bug somewhere in gfx_v8_0_eop_irq().<br>
                  <br>
                  Christian.<br>
                  <br>
                  On 18.09.2018 at 16:36, Deucher, Alexander wrote:<br>
                </div>
                <blockquote type="cite"
cite="mid:BN6PR12MB1809B0E02DDA1E8AACFFD1DAF71D0@BN6PR12MB1809.namprd12.prod.outlook.com">
                  <div id="divtagdefaultwrapper"
style="font-size:12pt;color:#000000;font-family:Calibri,Helvetica,sans-serif;"
                    dir="ltr">
                    <p style="margin-top:0;margin-bottom:0">FWIW, a
                      number of consumer Raven boards have bad IVRS
                      tables (Windows doesn't use interrupt remapping, so
                      they are sometimes wrong and probably not
                      validated).  There are a number of workarounds to
                      manually override the IVRS tables to make
                      interrupts work.  I think specifying pci=noacpi is
                      also a possible workaround.</p>
                    <p style="margin-top:0;margin-bottom:0"><br>
                    </p>
                    <p style="margin-top:0;margin-bottom:0">Alex<br>
                    </p>
                  </div>
                  <hr style="display:inline-block;width:98%"
                    tabindex="-1">
                  <div id="divRplyFwdMsg" dir="ltr"><font
                      style="font-size:11pt" face="Calibri, sans-serif"
                      color="#000000"><b>From:</b> amd-gfx <a
                        class="moz-txt-link-rfc2396E"
                        href="mailto:amd-gfx-bounces@lists.freedesktop.org"
                        moz-do-not-send="true">&lt;amd-gfx-bounces@lists.freedesktop.org&gt;</a>
                      on behalf of Christian König <a
                        class="moz-txt-link-rfc2396E"
                        href="mailto:christian.koenig@amd.com"
                        moz-do-not-send="true">&lt;christian.koenig@amd.com&gt;</a><br>
                      <b>Sent:</b> Tuesday, September 18, 2018 10:31:16
                      AM<br>
                      <b>To:</b> StDenis, Tom; amd-gfx mailing list;
                      Zhou, David(ChunMing)<br>
                      <b>Subject:</b> Re: Regression on gfx8 with ring
                      init</font>
                    <div> </div>
                  </div>
                  <div class="BodyFragment"><font size="2"><span
                        style="font-size:11pt;">
                        <div class="PlainText">Well, it looks like interrupt
                          processing is working perfectly fine.<br>
                          <br>
                          But looking at the error message once more I
                          see that this actually <br>
                          affects ring number 9 and not the GFX ring.<br>
                          <br>
                          Can you fix amdgpu_ib_ring_tests() to print
                          ring->name instead of the <br>
                          number?<br>
                          <br>
                          That must be one of the compute rings.<br>
                          <br>
                          Thanks,<br>
                          Christian.<br>
                          <br>
                          On 18.09.2018 at 16:20, Tom St Denis wrote:<br>
                          > On 2018-09-18 10:13 a.m., Christian König
                          wrote:<br>
                          >> Mhm, there is no failed IB test in there
                          any more, is there?<br>
                          ><br>
                          > oh sorry I thought you wanted to test
                          HEAD~ ... Attached is a log from <br>
                          > the tip of drm-next<br>
                          ><br>
                          > Tom<br>
                          ><br>
                          >><br>
                          >> Christian.<br>
                          >><br>
                          >> On 18.09.2018 at 16:09, Tom St
                          Denis wrote:<br>
                          >>> Disabling IOMMU in the BIOS
                          resulted in a correct boot up...<br>
                          >>><br>
                          >>> Here's the log.<br>
                          >>><br>
                          >>> Tom<br>
                          >>><br>
                          >>> On 2018-09-18 9:58 a.m., Tom St
                          Denis wrote:<br>
                          >>>> Odd I couldn't even boot my
                          system with the dGPU as primary after <br>
                          >>>> rebuilding the kernel.  It
                          got hung up in the IOMMU driver (loads <br>
                          >>>> of AMD-Vi IOMMU errors) which
                          I wasn't able to capture because it <br>
                          >>>> panic'ed before loading the
                          network stack.<br>
                          >>>><br>
                          >>>> Bizarre.<br>
                          >>>><br>
                          >>>> I'll keep trying.<br>
                          >>>><br>
                          >>>> Tom<br>
                          >>>><br>
                          >>>> On 2018-09-18 9:35 a.m.,
                          Christian König wrote:<br>
                          >>>>> On 18.09.2018 at 15:32,
                          Tom St Denis wrote:<br>
                          >>>>>> On 2018-09-18 9:30
                          a.m., Christian König wrote:<br>
                          >>>>>>> Great, not sure
                          if that is good or bad news.<br>
                          >>>>>>><br>
                          >>>>>>> Anyway going to
                          revert the change for now. Does anybody <br>
                          >>>>>>> volunteer to
                          figure out why interrupts sometimes don't
                          work <br>
                          >>>>>>> correctly on
                          Raven?<br>
                          >>>>>><br>
                          >>>>>> What does "doesn't
                          work correctly" mean?  My workstation is a Raven1 <br>
                          >>>>>> (Ryzen 2400G) and
                          other than the TTM bulk move issue has been <br>
                          >>>>>> perfectly stable
                          (through suspend/resumes too I might add).<br>
                          >>>>>><br>
                          >>>>>> Anything I could test
                          with my devel raven?<br>
                          >>>>><br>
                          >>>>> The problem seems to be
                          that on some boards IH handling doesn't <br>
                          >>>>> work as it should.<br>
                          >>>>><br>
                          >>>>> Can you try to disable
                          the onboard graphics and try again?<br>
                          >>>>><br>
                          >>>>> If that still doesn't
                          work there is a DRM_DEBUG in <br>
                          >>>>> amdgpu_ih_process(), make
                          that a DRM_ERROR and send me the <br>
                          >>>>> resulting dmesg of
                          loading amdgpu (but don't start any UMD).<br>
                          >>>>><br>
                          >>>>> Thanks,<br>
                          >>>>> Christian.<br>
                          >>>>><br>
                          >>>>>><br>
                          >>>>>><br>
                          >>>>>> Tom<br>
                          >>>>>><br>
                          >>>>>>><br>
                          >>>>>>> Christian.<br>
                          >>>>>>><br>
                          >>>>>>> On 18.09.2018 at
                          15:27, Tom St Denis wrote:<br>
                          >>>>>>>> This commit:<br>
                          >>>>>>>><br>
                          >>>>>>>> [root@raven
                          linux]# git bisect good<br>
                          >>>>>>>>
                          9b0df0937a852d299fbe42a5939c9a8a4cc83c55 is
                          the first bad commit<br>
                          >>>>>>>> commit
                          9b0df0937a852d299fbe42a5939c9a8a4cc83c55<br>
                          >>>>>>>> Author:
                          Christian König <a
                            class="moz-txt-link-rfc2396E"
                            href="mailto:christian.koenig@amd.com"
                            moz-do-not-send="true">&lt;christian.koenig@amd.com&gt;</a><br>
                          >>>>>>>> Date:   Tue
                          Sep 18 10:38:09 2018 +0200<br>
                          >>>>>>>><br>
                          >>>>>>>>    
                          drm/amdgpu: remove fence fallback<br>
                          >>>>>>>><br>
                          >>>>>>>>     DC
                          doesn't seem to have a fallback path either.<br>
                          >>>>>>>><br>
                          >>>>>>>>     So when
                          interrupts doesn't work any more we are pretty
                          much <br>
                          >>>>>>>> busted no<br>
                          >>>>>>>>     matter
                          what.<br>
                          >>>>>>>><br>
                          >>>>>>>>    
                          Signed-off-by: Christian König <a
                            class="moz-txt-link-rfc2396E"
                            href="mailto:christian.koenig@amd.com"
                            moz-do-not-send="true">&lt;christian.koenig@amd.com&gt;</a><br>
                          >>>>>>>>    
                          Reviewed-by: Chunming Zhou <a
                            class="moz-txt-link-rfc2396E"
                            href="mailto:david1.zhou@amd.com"
                            moz-do-not-send="true">&lt;david1.zhou@amd.com&gt;</a><br>
                          >>>>>>>><br>
                          >>>>>>>> Results in
                          this:<br>
                          >>>>>>>><br>
                          >>>>>>>> [  
                          24.334025] [drm] Initialized amdgpu 3.27.0
                          20150101 for <br>
                          >>>>>>>> 0000:07:00.0
                          on minor 1<br>
                          >>>>>>>> [  
                          24.335674] modprobe (3895) used greatest stack
                          depth: 12600 <br>
                          >>>>>>>> bytes left<br>
                          >>>>>>>> [  
                          26.272358] [drm:gfx_v8_0_ring_test_ib
                          [amdgpu]] *ERROR* <br>
                          >>>>>>>> amdgpu: IB
                          test timed out.<br>
                          >>>>>>>> [  
                          26.272460] [drm:amdgpu_ib_ring_tests [amdgpu]]
                          *ERROR* <br>
                          >>>>>>>> amdgpu:
                          failed testing IB on ring 9 (-110).<br>
                          >>>>>>>> [  
                          26.407885] [drm:process_one_work] *ERROR* ib
                          ring test <br>
                          >>>>>>>> failed
                          (-110).<br>
                          >>>>>>>> [  
                          28.506708] fuse init (API version 7.27)<br>
                          >>>>>>>><br>
                          >>>>>>>> On init with
                          my polaris/raven1 system.<br>
                          >>>>>>>><br>
                          >>>>>>>> Cheers,<br>
                          >>>>>>>> Tom<br>
                          >>>>>>>>
                          _______________________________________________<br>
                          >>>>>>>> amd-gfx
                          mailing list<br>
                          >>>>>>>> <a
                            class="moz-txt-link-abbreviated"
                            href="mailto:amd-gfx@lists.freedesktop.org"
                            moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a><br>
                          >>>>>>>> <a
                            href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx"
                            moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a><br>
                          >>>>>>><br>
                          >>>>>><br>
                          >>>>><br>
                          >>>><br>
                          >>><br>
                          >><br>
                          ><br>
                          <br>
                        </div>
                      </span></font></div>
                </blockquote>
                <br>
              </blockquote>
              <br>
              <br>
            </blockquote>
            <br>
            <br>
          </blockquote>
          <br>
          <br>
        </blockquote>
        <br>
      </blockquote>
      <br>
    </blockquote>
    <br>
  </body>
</html>