<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Hi Luis,<br>
      <br>
      What "drm/amdgpu: defer test IBs on the rings at boot (V3)"
      does is delay the IB test a bit and run it asynchronously to the
      rest of the bootup.<br>
      <br>
      So what most likely happens is that some hardware feature (such as
      power or clock gating) which doesn't work correctly on your
      system kicks in and causes the IB test to fail.<br>
      <br>
      It's rather likely that this problem is also responsible for the
      crashes you experience later on, so I think we should concentrate
      on fixing that.<br>
      <br>
      Regards,<br>
      Christian.<br>
      <br>
      On 11.07.2018 at 23:27, Luís Mendes wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAEzXK1q5OdOaYzLRw9YWmAeOUQxGjb2R8jTEjQX1SPNhTGsS_w@mail.gmail.com">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <div dir="ltr">
        <div>Hi Jim,</div>
        <div><br>
        </div>
        <div>I followed your suggestion and was able to bisect the
          kernel patches.</div>
        <div>The offending patch is: drm/amdgpu: defer test IBs on the
          rings at boot (V3)<br>
        </div>
        <div>commit:
          <table summary="commit info" class="gmail-commit-info">
            <tbody>
              <tr>
                <th><br>
                </th>
                <td colspan="2" class="gmail-sha1"><a
href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v4.18-rc4&id=2c773de2ecb8c327f2448bd1eecad224e9227087"
                    moz-do-not-send="true">2c773de2ecb8c327f2448bd1eecad224e9227087</a></td>
              </tr>
            </tbody>
          </table>
        </div>
        <div><br>
        </div>
        <div>After reverting this patch the IB test succeeded with
          kernel v4.18-rc4 on both systems and the amdgpu driver was
          correctly loaded both on SAPPHIRE RX550 4GB and on SAPPHIRE
          RX460 2GB.</div>
        <div><br>
        </div>
        <div>The GPU hang remains, however.<br>
        </div>
        <div>I will try to configure a remote IPMI connection, or set
          up a serial console for the kernel, to see what is happening
          during the kernel boot.</div>
        <div><br>
        </div>
        <div>Thanks &amp; Regards,</div>
        <div>Luís<br>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Wed, Jul 11, 2018 at 10:56 AM, jimqu
          <span dir="ltr">&lt;<a href="mailto:jimqu@amd.com"
              target="_blank" moz-do-not-send="true">jimqu@amd.com</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div text="#000000" bgcolor="#FFFFFF">
              <p>Hi Luis,</p>
              <p><br>
              </p>
              <p>Let us trace the issues one by one.</p>
              <p><br>
              </p>
              <p>IB test failure:</p>
              <p>This should be a regression in 4.18; you can bisect
                the kernel patches.</p>
              <p>GPU hang:</p>
              <p>Fix the IB test failure first.</p>
              <p><br>
              </p>
              <p>Thanks</p>
              <span class="HOEnZb"><font color="#888888">
                  <p>JimQu<br>
                  </p>
                </font></span>
              <div>
                <div class="h5">
                  <p><br>
                  </p>
                  <br>
                  <div class="m_-5542977703135971300moz-cite-prefix">On
                    2018-07-11 17:34, Luís Mendes wrote:<br>
                  </div>
                  <blockquote type="cite">
                    <div dir="ltr">
                      <div>Hi Jim,</div>
                      <div><br>
                      </div>
                      <div>Thanks for your interest in this issue.
                        Actually, there are multiple issues: not only is
                        the IB ring test failing, but I am also having
                        quite some trouble getting the SAPPHIRE RX
                        550 4GB on a TYAN S7025 and the SAPPHIRE RX 460 2GB
                        on a TYAN S7002 to work, both systems using the same
                        Ubuntu 18.04 with a vanilla kernel.<br>
                      </div>
                      <div><br>
                      </div>
                      <div><b>1. Could you also test an earlier kernel,
                          v4.17 or v4.16?</b><br>
                      </div>
                      <div>I've tested kernels v4.17.5 and v4.16.6 on the
                        same system; both pass the IB ring test and the
                        system boots into X using the NVIDIA card
                        as the display-connected card.</div>
                      <div>dmesg log attached for kernel 4.17.5, file
                        TYAN_S7025_kernelv4.17.5_<wbr>amdgpu_IB_ring_test_OK.txt.<br>
                      </div>
                      <div><br>
                      </div>
                      <div><b>2. Could you test the issue with only
                          amdgpu?</b></div>
                      <div>
                        <div>- I've tested on a TYAN S7002 system with a
                          single SAPPHIRE RX 460 2GB, on-board VGA
                          enabled and used as primary display.</div>
                        <div>Kernel v4.18-rc4 fails the IB ring test, but
                          the system is able to enter X through the on-board
                          VGA.<br>
                        </div>
                        <div>dmesg log attached for kernel 4.18-rc4,
                          file TYAN_S7002_kernel_v4.18-rc4_<wbr>IB_ring_test_fail.txt.</div>
                        <div><br>
                        </div>
                        <div>- Same TYAN S7002 system, but now with
                          on-board VGA disabled and using the RX 460 as
                          the display-connected card.<br>
                        </div>
                        <div>
                          <div>Kernels v4.17.5 and v4.16.6 are able to
                            pass the IB ring test, but the GPU hangs before
                            entering X. I don't have logs for these yet.<br>
                          </div>
                          <br>
                          <div>Regards,</div>
                          <div>Luís Mendes</div>
                          <div>Aparapi contributor and MSc Researcher<br>
                          </div>
                          <div><br>
                          </div>
                          <div><br>
                          </div>
                          <br>
                        </div>
                        <br>
                      </div>
                    </div>
                    <div class="gmail_extra"><br>
                      <div class="gmail_quote">On Wed, Jul 11, 2018 at
                        3:49 AM, Qu, Jim <span dir="ltr">&lt;<a
                            href="mailto:Jim.Qu@amd.com" target="_blank"
                            moz-do-not-send="true">Jim.Qu@amd.com</a>&gt;</span>
                        wrote:<br>
                        <blockquote class="gmail_quote" style="margin:0
                          0 0 .8ex;border-left:1px #ccc
                          solid;padding-left:1ex">Hi Luis,<br>
                          <br>
                          1. Could you also test an earlier kernel, v4.17 or
                          v4.16?<br>
                          2. Could you test the issue with only amdgpu?<br>
                          <br>
                          Thanks<br>
                          JimQu<br>
                          <br>
                          ______________________________<wbr>__________<br>
                          From: amd-gfx &lt;<a
                            href="mailto:amd-gfx-bounces@lists.freedesktop.org"
                            target="_blank" moz-do-not-send="true">amd-gfx-bounces@lists.freedes<wbr>ktop.org</a>&gt;
                          on behalf of Luís Mendes &lt;<a
                            href="mailto:luis.p.mendes@gmail.com"
                            target="_blank" moz-do-not-send="true">luis.p.mendes@gmail.com</a>&gt;<br>
                          Sent: July 11, 2018 6:04:00<br>
                          To: Michel Dänzer; Koenig, Christian; amd-gfx
                          list<br>
                          Subject: Re: Regression with kernel 4.18 - AMD RX
                          550 fails IB ring test on power-up<br>
                          <span class="m_-5542977703135971300im
                            m_-5542977703135971300HOEnZb"><br>
                            Hi,<br>
                            <br>
                            Issue remains in kernel 4.18-rc4 using
                            SAPPHIRE RX 550 4GB.<br>
                            <br>
                            Logs are attached.<br>
                            <br>
                            Regards,<br>
                            Luis<br>
                            <br>
                          </span>
                          <div class="m_-5542977703135971300HOEnZb">
                            <div class="m_-5542977703135971300h5">On
                              Tue, Jun 26, 2018 at 10:08 AM, Luís Mendes
                              &lt;<a
                                href="mailto:luis.p.mendes@gmail.com"
                                target="_blank" moz-do-not-send="true">luis.p.mendes@gmail.com</a>&gt;
                              wrote:<br>
                              Hi,<br>
                              <br>
                              I've tried kernel 4.18-rc2 on a system
                              with an NVIDIA GTX 1050 Ti and an AMD RX
                              550 4GB, and the RX 550 card is failing the
                              IB ring test.<br>
                              <br>
                              [    5.033217] [drm:gfx_v8_0_ring_test_ib
                              [amdgpu]] *ERROR* amdgpu: ib test failed
                              (scratch(0xC040)=0xFFFFFFFF)<br>
                              [    5.033264] [drm:amdgpu_ib_ring_tests
                              [amdgpu]] *ERROR* amdgpu: failed testing
                              IB on ring 6 (-22).<br>
                              <br>
                              Please see the attached log.<br>
                              <br>
                              Regards,<br>
                              Luís<br>
                              <br>
                            </div>
                          </div>
                        </blockquote>
                      </div>
                      <br>
                    </div>
                  </blockquote>
                  <br>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </body>
</html>