<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;
      charset=windows-1252">
  </head>
  <body>
    Hmm, I fear we at least need to add comments documenting the binary;
    otherwise we have a source code license violation here.<br>
    <br>
    The only alternative is to ship it externally as a firmware binary.<br>
    <br>
    Christian.<br>
    <br>
    <div class="moz-cite-prefix">On 27.04.21 at 22:21, Deucher,
      Alexander wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:MN2PR12MB4488EB63FD9E79897959201FF7419@MN2PR12MB4488.namprd12.prod.outlook.com">
      <meta http-equiv="Content-Type" content="text/html;
        charset=windows-1252">
      <style type="text/css" style="display:none;">P {margin-top:0;margin-bottom:0;}</style>
      <p
        style="font-family:Arial;font-size:11pt;color:#0078D7;margin:5pt;"
        align="Left">
        [AMD Official Use Only - Internal Distribution Only]<br>
      </p>
      <br>
      <div>
        <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
          font-size: 12pt; color: rgb(0, 0, 0);">
          I mean, we wrote them in binary since they were so small.  I
          don't remember how the newer ones for vega20 and Arcturus were
          generated.<br>
        </div>
        <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
          font-size: 12pt; color: rgb(0, 0, 0);">
          <br>
        </div>
        <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
          font-size: 12pt; color: rgb(0, 0, 0);">
          Alex</div>
        <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
          font-size: 12pt; color: rgb(0, 0, 0);">
          <br>
        </div>
        <hr style="display:inline-block;width:98%" tabindex="-1">
        <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt"
            face="Calibri, sans-serif" color="#000000"><b>From:</b>
            Zeng, Oak <a class="moz-txt-link-rfc2396E" href="mailto:Oak.Zeng@amd.com"><Oak.Zeng@amd.com></a><br>
            <b>Sent:</b> Tuesday, April 27, 2021 4:08 PM<br>
            <b>To:</b> Deucher, Alexander
            <a class="moz-txt-link-rfc2396E" href="mailto:Alexander.Deucher@amd.com"><Alexander.Deucher@amd.com></a>; Koenig, Christian
            <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com"><Christian.Koenig@amd.com></a>; Zhang, Hawking
            <a class="moz-txt-link-rfc2396E" href="mailto:Hawking.Zhang@amd.com"><Hawking.Zhang@amd.com></a>; Christian König
            <a class="moz-txt-link-rfc2396E" href="mailto:ckoenig.leichtzumerken@gmail.com"><ckoenig.leichtzumerken@gmail.com></a>; Li, Dennis
            <a class="moz-txt-link-rfc2396E" href="mailto:Dennis.Li@amd.com"><Dennis.Li@amd.com></a>; <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>
            <a class="moz-txt-link-rfc2396E" href="mailto:amd-gfx@lists.freedesktop.org"><amd-gfx@lists.freedesktop.org></a>; Kuehling, Felix
            <a class="moz-txt-link-rfc2396E" href="mailto:Felix.Kuehling@amd.com"><Felix.Kuehling@amd.com></a><br>
            <b>Subject:</b> Re: [PATCH] drm/amdgpu: fix no full coverage
            issue for gprs initialization</font>
          <div> </div>
        </div>
        <style>@font-face
        {font-family:"Cambria Math"}@font-face
        {font-family:DengXian}@font-face
        {font-family:Calibri}@font-face
        {}@font-face
        {}p.x_MsoNormal, li.x_MsoNormal, div.x_MsoNormal
        {margin:0cm;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif}a:link, span.x_MsoHyperlink
        {color:blue;
        text-decoration:underline}span.x_EmailStyle19
        {font-family:"Calibri",sans-serif;
        color:windowtext}.x_MsoChpDefault
        {font-size:10.0pt}div.x_WordSection1
        {}</style>
        <div link="blue" vlink="purple" style="word-wrap:break-word"
          lang="EN-CA">
          <div class="x_WordSection1">
            <p class="x_MsoNormal">Yes, in that case we can check in the
              hand-written assembly code.</p>
            <p class="x_MsoNormal"> </p>
            <div>
              <div>
                <div>
                  <p class="x_MsoNormal">Regards,</p>
                  <p class="x_MsoNormal">Oak </p>
                </div>
              </div>
            </div>
            <p class="x_MsoNormal"> </p>
            <p class="x_MsoNormal"> </p>
            <div style="border:none; border-top:solid #B5C4DF 1.0pt;
              padding:3.0pt 0cm 0cm 0cm">
              <p class="x_MsoNormal" style="margin-left:36.0pt"><b><span
                    style="font-size:12.0pt; color:black">From:
                  </span></b><span style="font-size:12.0pt; color:black">"Deucher,
                  Alexander" <a class="moz-txt-link-rfc2396E" href="mailto:Alexander.Deucher@amd.com"><Alexander.Deucher@amd.com></a><br>
                  <b>Date: </b>Tuesday, April 27, 2021 at 4:06 PM<br>
                  <b>To: </b>Oak Zeng <a class="moz-txt-link-rfc2396E" href="mailto:Oak.Zeng@amd.com"><Oak.Zeng@amd.com></a>,
                  "Koenig, Christian" <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com"><Christian.Koenig@amd.com></a>,
                  "Zhang, Hawking" <a class="moz-txt-link-rfc2396E" href="mailto:Hawking.Zhang@amd.com"><Hawking.Zhang@amd.com></a>,
                  Christian König
                  <a class="moz-txt-link-rfc2396E" href="mailto:ckoenig.leichtzumerken@gmail.com"><ckoenig.leichtzumerken@gmail.com></a>, "Li, Dennis"
                  <a class="moz-txt-link-rfc2396E" href="mailto:Dennis.Li@amd.com"><Dennis.Li@amd.com></a>,
                  <a class="moz-txt-link-rfc2396E" href="mailto:amd-gfx@lists.freedesktop.org">"amd-gfx@lists.freedesktop.org"</a>
                  <a class="moz-txt-link-rfc2396E" href="mailto:amd-gfx@lists.freedesktop.org"><amd-gfx@lists.freedesktop.org></a>, "Kuehling,
                  Felix" <a class="moz-txt-link-rfc2396E" href="mailto:Felix.Kuehling@amd.com"><Felix.Kuehling@amd.com></a><br>
                  <b>Subject: </b>Re: [PATCH] drm/amdgpu: fix no full
                  coverage issue for gprs initialization</span></p>
            </div>
            <div>
              <p class="x_MsoNormal" style="margin-left:36.0pt"> </p>
            </div>
            <p style="margin-right:5.0pt; margin-bottom:5.0pt;
              margin-left:41.0pt"><span
                style="font-family:'Arial',sans-serif;
                color:#0078D7">[AMD Official Use Only - Internal
                Distribution Only]</span></p>
            <p class="x_MsoNormal" style="margin-left:36.0pt"> </p>
            <div>
              <div>
                <p class="x_MsoNormal" style="margin-left:36.0pt"><span
                    style="font-size:12.0pt; color:black">That would
                    probably be helpful.  TBH, I think we hand-wrote the
                    original one for CZ, so there was no original
                    higher-level source code.</span></p>
              </div>
              <div>
                <p class="x_MsoNormal" style="margin-left:36.0pt"><span
                    style="font-size:12.0pt; color:black"> </span></p>
              </div>
              <div>
                <p class="x_MsoNormal" style="margin-left:36.0pt"><span
                    style="font-size:12.0pt; color:black">Alex</span></p>
              </div>
              <div>
                <p class="x_MsoNormal" style="margin-left:36.0pt"><span
                    style="font-size:12.0pt; color:black"> </span></p>
              </div>
              <div class="x_MsoNormal" style="margin-left:36.0pt;
                text-align:center" align="center">
                <hr width="100%" size="0" align="center">
              </div>
              <div id="x_divRplyFwdMsg">
                <p class="x_MsoNormal" style="margin-left:36.0pt"><b><span
                      style="color:black">From:</span></b><span
                    style="color:black"> Zeng, Oak
                    <a class="moz-txt-link-rfc2396E" href="mailto:Oak.Zeng@amd.com"><Oak.Zeng@amd.com></a><br>
                    <b>Sent:</b> Tuesday, April 27, 2021 3:34 PM<br>
                    <b>To:</b> Koenig, Christian
                    <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com"><Christian.Koenig@amd.com></a>; Zhang, Hawking
                    <a class="moz-txt-link-rfc2396E" href="mailto:Hawking.Zhang@amd.com"><Hawking.Zhang@amd.com></a>; Christian König
                    <a class="moz-txt-link-rfc2396E" href="mailto:ckoenig.leichtzumerken@gmail.com"><ckoenig.leichtzumerken@gmail.com></a>; Li, Dennis
                    <a class="moz-txt-link-rfc2396E" href="mailto:Dennis.Li@amd.com"><Dennis.Li@amd.com></a>;
                    <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>
                    <a class="moz-txt-link-rfc2396E" href="mailto:amd-gfx@lists.freedesktop.org"><amd-gfx@lists.freedesktop.org></a>; Deucher,
                    Alexander <a class="moz-txt-link-rfc2396E" href="mailto:Alexander.Deucher@amd.com"><Alexander.Deucher@amd.com></a>;
                    Kuehling, Felix <a class="moz-txt-link-rfc2396E" href="mailto:Felix.Kuehling@amd.com"><Felix.Kuehling@amd.com></a><br>
                    <b>Subject:</b> Re: [PATCH] drm/amdgpu: fix no full
                    coverage issue for gprs initialization</span>
                </p>
                <div>
                  <p class="x_MsoNormal" style="margin-left:36.0pt"> </p>
                </div>
              </div>
              <div>
                <div>
                  <p class="x_MsoNormal" style="margin-right:0cm;
                    margin-bottom:12.0pt; margin-left:36.0pt">
                    Hi Dennis,<br>
                    <br>
                    Should we check in the compute shader source code?
                    I only saw the shader binaries. This would be helpful
                    if people want to modify those shaders or fix issues.
                    The source code can be in a comment section above
                    the binary.<br>
                    <br>
                    Regards,<br>
                    Oak <br>
                    <br>
                     <br>
                    <br>
                    On 2021-04-27, 11:31 AM, "amd-gfx on behalf of
                    Christian König"
                    &lt;amd-gfx-bounces@lists.freedesktop.org on behalf
                    of christian.koenig@amd.com&gt; wrote:<br>
                    <br>
                        OK, in this case it looks good to me.<br>
                    <br>
                        Christian.<br>
                    <br>
                        On 27.04.21 at 17:26, Zhang, Hawking wrote:<br>
                        > [AMD Public Use]<br>
                        ><br>
                        > This needs to be done during reset as well.<br>
                        ><br>
                        > Regards,<br>
                        > Hawking<br>
                        ><br>
                        > -----Original Message-----<br>
                        > From: Christian König
                    <a class="moz-txt-link-rfc2396E" href="mailto:ckoenig.leichtzumerken@gmail.com"><ckoenig.leichtzumerken@gmail.com></a><br>
                        > Sent: Tuesday, April 27, 2021 23:17<br>
                        > To: Zhang, Hawking
                    <a class="moz-txt-link-rfc2396E" href="mailto:Hawking.Zhang@amd.com"><Hawking.Zhang@amd.com></a>; Li, Dennis
                    <a class="moz-txt-link-rfc2396E" href="mailto:Dennis.Li@amd.com"><Dennis.Li@amd.com></a>;
                    <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>; Deucher, Alexander
                    <a class="moz-txt-link-rfc2396E" href="mailto:Alexander.Deucher@amd.com"><Alexander.Deucher@amd.com></a>; Kuehling, Felix
                    <a class="moz-txt-link-rfc2396E" href="mailto:Felix.Kuehling@amd.com"><Felix.Kuehling@amd.com></a>; Koenig, Christian
                    <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com"><Christian.Koenig@amd.com></a><br>
                        > Subject: Re: [PATCH] drm/amdgpu: fix no
                    full coverage issue for gprs initialization<br>
                        ><br>
                        > This is only done during bootup, isn't it?<br>
                        ><br>
                        > Wouldn't it be better to use the normal IB
                    pool instead of the direct one? Or do we also need
                    to do this during GPU reset?<br>
                        ><br>
                        > Regards,<br>
                        > Christian.<br>
                        ><br>
                        > On 27.04.21 at 16:55, Zhang, Hawking
                    wrote:<br>
                        >> [AMD Public Use]<br>
                        >><br>
                        >> Please split the following into a separate
                    patch when you commit this one.<br>
                        >> Other than that, the patch is<br>
                        >><br>
                        >> Reviewed-by: Hawking Zhang
                    <a class="moz-txt-link-rfc2396E" href="mailto:Hawking.Zhang@amd.com"><Hawking.Zhang@amd.com></a><br>
                        >><br>
                        >> Regards,<br>
                        >> Hawking<br>
                        >><br>
                        >> @@ -479,8 +710,6 @@ void
                    gfx_v9_4_2_init_golden_registers(struct
                    amdgpu_device *adev,<br>
                        >>                            die_id);<br>
                        >>                   break;<br>
                        >>           }<br>
                        >> -<br>
                        >> -        return;<br>
                        >>    }<br>
                        >><br>
                        >> -----Original Message-----<br>
                        >> From: Dennis Li
                    <a class="moz-txt-link-rfc2396E" href="mailto:Dennis.Li@amd.com"><Dennis.Li@amd.com></a><br>
                        >> Sent: Tuesday, April 27, 2021 22:38<br>
                        >> To: <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>;
                    Deucher, Alexander<br>
                        >> <a class="moz-txt-link-rfc2396E" href="mailto:Alexander.Deucher@amd.com"><Alexander.Deucher@amd.com></a>;
                    Kuehling, Felix <a class="moz-txt-link-rfc2396E" href="mailto:Felix.Kuehling@amd.com"><Felix.Kuehling@amd.com></a>;<br>
                        >> Zhang, Hawking
                    <a class="moz-txt-link-rfc2396E" href="mailto:Hawking.Zhang@amd.com"><Hawking.Zhang@amd.com></a>; Koenig, Christian<br>
                        >> <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com"><Christian.Koenig@amd.com></a><br>
                        >> Cc: Li, Dennis
                    <a class="moz-txt-link-rfc2396E" href="mailto:Dennis.Li@amd.com"><Dennis.Li@amd.com></a><br>
                        >> Subject: [PATCH] drm/amdgpu: fix no
                    full coverage issue for gprs<br>
                        >> initialization<br>
                        >><br>
                        >> The number of waves is changed to 8, so
                    it is impossible to use old solution to cover all
                    sgprs.<br>
                        >><br>
                        >> Signed-off-by: Dennis Li
                    <a class="moz-txt-link-rfc2396E" href="mailto:Dennis.Li@amd.com"><Dennis.Li@amd.com></a><br>
                        >><br>
                        >> diff --git
                    a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c<br>
                        >>
                    b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c<br>
                        >> index a2fe2dac32c1..2e6789a7dc46 100644<br>
                        >> ---
                    a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c<br>
                        >> +++
                    b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c<br>
                        >> @@ -328,7 +328,7 @@ int
                    amdgpu_ib_pool_init(struct amdgpu_device<br>
                        >> *adev)<br>
                        >>    <br>
                        >>           for (i = 0; i <
                    AMDGPU_IB_POOL_MAX; i++) {<br>
                        >>                   if (i ==
                    AMDGPU_IB_POOL_DIRECT)<br>
                        >> -                        size =
                    PAGE_SIZE * 2;<br>
                        >> +                        size =
                    PAGE_SIZE * 6;<br>
                        >>                   else<br>
                        >>                           size =
                    AMDGPU_IB_POOL_SIZE;<br>
                        >>    <br>
                        >> diff --git
                    a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c<br>
                        >>
                    b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c<br>
                        >> index d17e57dea178..77948c033c45 100644<br>
                        >> ---
                    a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c<br>
                        >> +++
                    b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c<br>
                        >> @@ -32,6 +32,11 @@<br>
                        >>    #include "amdgpu_ras.h"<br>
                        >>    #include "amdgpu_gfx.h"<br>
                        >>    <br>
                        >> +#define SE_ID_MAX 8<br>
                        >> +#define CU_ID_MAX 16<br>
                        >> +#define SIMD_ID_MAX 4<br>
                        >> +#define WAVE_ID_MAX 10<br>
                        >> +<br>
                        >>    enum gfx_v9_4_2_utc_type {<br>
                        >>           VML2_MEM,<br>
                        >>           VML2_WALKER_MEM,<br>
                        >> @@ -81,100 +86,100 @@ static const
                    struct soc15_reg_golden<br>
                        >> golden_settings_gc_9_4_2_alde[] = {  };<br>
                        >>    <br>
                        >>    static const u32
                    vgpr_init_compute_shader_aldebaran[] = {<br>
                        >> -        0xb8840904, 0xb8851a04,
                    0xb8861344, 0x9207c006, 0x92088405, 0x81070807,<br>
                        >> -        0x81070407, 0x8e078207,
                    0xbe88008f, 0xc0410200, 0x00000007, 0xd3d94000,<br>
                        >> -        0x18000080, 0xd3d94001,
                    0x18000080, 0xd3d94002, 0x18000080, 0xd3d94003,<br>
                        >> -        0x18000080, 0xd3d94004,
                    0x18000080, 0xd3d94005, 0x18000080, 0xd3d94006,<br>
                        >> -        0x18000080, 0xd3d94007,
                    0x18000080, 0xd3d94008, 0x18000080, 0xd3d94009,<br>
                        >> -        0x18000080, 0xd3d9400a,
                    0x18000080, 0xd3d9400b, 0x18000080, 0xd3d9400c,<br>
                        >> -        0x18000080, 0xd3d9400d,
                    0x18000080, 0xd3d9400e, 0x18000080, 0xd3d9400f,<br>
                        >> -        0x18000080, 0xd3d94010,
                    0x18000080, 0xd3d94011, 0x18000080, 0xd3d94012,<br>
                        >> -        0x18000080, 0xd3d94013,
                    0x18000080, 0xd3d94014, 0x18000080, 0xd3d94015,<br>
                        >> -        0x18000080, 0xd3d94016,
                    0x18000080, 0xd3d94017, 0x18000080, 0xd3d94018,<br>
                        >> -        0x18000080, 0xd3d94019,
                    0x18000080, 0xd3d9401a, 0x18000080, 0xd3d9401b,<br>
                        >> -        0x18000080, 0xd3d9401c,
                    0x18000080, 0xd3d9401d, 0x18000080, 0xd3d9401e,<br>
                        >> -        0x18000080, 0xd3d9401f,
                    0x18000080, 0xd3d94020, 0x18000080, 0xd3d94021,<br>
                        >> -        0x18000080, 0xd3d94022,
                    0x18000080, 0xd3d94023, 0x18000080, 0xd3d94024,<br>
                        >> -        0x18000080, 0xd3d94025,
                    0x18000080, 0xd3d94026, 0x18000080, 0xd3d94027,<br>
                        >> -        0x18000080, 0xd3d94028,
                    0x18000080, 0xd3d94029, 0x18000080, 0xd3d9402a,<br>
                        >> -        0x18000080, 0xd3d9402b,
                    0x18000080, 0xd3d9402c, 0x18000080, 0xd3d9402d,<br>
                        >> -        0x18000080, 0xd3d9402e,
                    0x18000080, 0xd3d9402f, 0x18000080, 0xd3d94030,<br>
                        >> -        0x18000080, 0xd3d94031,
                    0x18000080, 0xd3d94032, 0x18000080, 0xd3d94033,<br>
                        >> -        0x18000080, 0xd3d94034,
                    0x18000080, 0xd3d94035, 0x18000080, 0xd3d94036,<br>
                        >> -        0x18000080, 0xd3d94037,
                    0x18000080, 0xd3d94038, 0x18000080, 0xd3d94039,<br>
                        >> -        0x18000080, 0xd3d9403a,
                    0x18000080, 0xd3d9403b, 0x18000080, 0xd3d9403c,<br>
                        >> -        0x18000080, 0xd3d9403d,
                    0x18000080, 0xd3d9403e, 0x18000080, 0xd3d9403f,<br>
                        >> -        0x18000080, 0xd3d94040,
                    0x18000080, 0xd3d94041, 0x18000080, 0xd3d94042,<br>
                        >> -        0x18000080, 0xd3d94043,
                    0x18000080, 0xd3d94044, 0x18000080, 0xd3d94045,<br>
                        >> -        0x18000080, 0xd3d94046,
                    0x18000080, 0xd3d94047, 0x18000080, 0xd3d94048,<br>
                        >> -        0x18000080, 0xd3d94049,
                    0x18000080, 0xd3d9404a, 0x18000080, 0xd3d9404b,<br>
                        >> -        0x18000080, 0xd3d9404c,
                    0x18000080, 0xd3d9404d, 0x18000080, 0xd3d9404e,<br>
                        >> -        0x18000080, 0xd3d9404f,
                    0x18000080, 0xd3d94050, 0x18000080, 0xd3d94051,<br>
                        >> -        0x18000080, 0xd3d94052,
                    0x18000080, 0xd3d94053, 0x18000080, 0xd3d94054,<br>
                        >> -        0x18000080, 0xd3d94055,
                    0x18000080, 0xd3d94056, 0x18000080, 0xd3d94057,<br>
                        >> -        0x18000080, 0xd3d94058,
                    0x18000080, 0xd3d94059, 0x18000080, 0xd3d9405a,<br>
                        >> -        0x18000080, 0xd3d9405b,
                    0x18000080, 0xd3d9405c, 0x18000080, 0xd3d9405d,<br>
                        >> -        0x18000080, 0xd3d9405e,
                    0x18000080, 0xd3d9405f, 0x18000080, 0xd3d94060,<br>
                        >> -        0x18000080, 0xd3d94061,
                    0x18000080, 0xd3d94062, 0x18000080, 0xd3d94063,<br>
                        >> -        0x18000080, 0xd3d94064,
                    0x18000080, 0xd3d94065, 0x18000080, 0xd3d94066,<br>
                        >> -        0x18000080, 0xd3d94067,
                    0x18000080, 0xd3d94068, 0x18000080, 0xd3d94069,<br>
                        >> -        0x18000080, 0xd3d9406a,
                    0x18000080, 0xd3d9406b, 0x18000080, 0xd3d9406c,<br>
                        >> -        0x18000080, 0xd3d9406d,
                    0x18000080, 0xd3d9406e, 0x18000080, 0xd3d9406f,<br>
                        >> -        0x18000080, 0xd3d94070,
                    0x18000080, 0xd3d94071, 0x18000080, 0xd3d94072,<br>
                        >> -        0x18000080, 0xd3d94073,
                    0x18000080, 0xd3d94074, 0x18000080, 0xd3d94075,<br>
                        >> -        0x18000080, 0xd3d94076,
                    0x18000080, 0xd3d94077, 0x18000080, 0xd3d94078,<br>
                        >> -        0x18000080, 0xd3d94079,
                    0x18000080, 0xd3d9407a, 0x18000080, 0xd3d9407b,<br>
                        >> -        0x18000080, 0xd3d9407c,
                    0x18000080, 0xd3d9407d, 0x18000080, 0xd3d9407e,<br>
                        >> -        0x18000080, 0xd3d9407f,
                    0x18000080, 0xd3d94080, 0x18000080, 0xd3d94081,<br>
                        >> -        0x18000080, 0xd3d94082,
                    0x18000080, 0xd3d94083, 0x18000080, 0xd3d94084,<br>
                        >> -        0x18000080, 0xd3d94085,
                    0x18000080, 0xd3d94086, 0x18000080, 0xd3d94087,<br>
                        >> -        0x18000080, 0xd3d94088,
                    0x18000080, 0xd3d94089, 0x18000080, 0xd3d9408a,<br>
                        >> -        0x18000080, 0xd3d9408b,
                    0x18000080, 0xd3d9408c, 0x18000080, 0xd3d9408d,<br>
                        >> -        0x18000080, 0xd3d9408e,
                    0x18000080, 0xd3d9408f, 0x18000080, 0xd3d94090,<br>
                        >> -        0x18000080, 0xd3d94091,
                    0x18000080, 0xd3d94092, 0x18000080, 0xd3d94093,<br>
                        >> -        0x18000080, 0xd3d94094,
                    0x18000080, 0xd3d94095, 0x18000080, 0xd3d94096,<br>
                        >> -        0x18000080, 0xd3d94097,
                    0x18000080, 0xd3d94098, 0x18000080, 0xd3d94099,<br>
                        >> -        0x18000080, 0xd3d9409a,
                    0x18000080, 0xd3d9409b, 0x18000080, 0xd3d9409c,<br>
                        >> -        0x18000080, 0xd3d9409d,
                    0x18000080, 0xd3d9409e, 0x18000080, 0xd3d9409f,<br>
                        >> -        0x18000080, 0xd3d940a0,
                    0x18000080, 0xd3d940a1, 0x18000080, 0xd3d940a2,<br>
                        >> -        0x18000080, 0xd3d940a3,
                    0x18000080, 0xd3d940a4, 0x18000080, 0xd3d940a5,<br>
                        >> -        0x18000080, 0xd3d940a6,
                    0x18000080, 0xd3d940a7, 0x18000080, 0xd3d940a8,<br>
                        >> -        0x18000080, 0xd3d940a9,
                    0x18000080, 0xd3d940aa, 0x18000080, 0xd3d940ab,<br>
                        >> -        0x18000080, 0xd3d940ac,
                    0x18000080, 0xd3d940ad, 0x18000080, 0xd3d940ae,<br>
                        >> -        0x18000080, 0xd3d940af,
                    0x18000080, 0xd3d940b0, 0x18000080, 0xd3d940b1,<br>
                        >> -        0x18000080, 0xd3d940b2,
                    0x18000080, 0xd3d940b3, 0x18000080, 0xd3d940b4,<br>
                        >> -        0x18000080, 0xd3d940b5,
                    0x18000080, 0xd3d940b6, 0x18000080, 0xd3d940b7,<br>
                        >> -        0x18000080, 0xd3d940b8,
                    0x18000080, 0xd3d940b9, 0x18000080, 0xd3d940ba,<br>
                        >> -        0x18000080, 0xd3d940bb,
                    0x18000080, 0xd3d940bc, 0x18000080, 0xd3d940bd,<br>
                        >> -        0x18000080, 0xd3d940be,
                    0x18000080, 0xd3d940bf, 0x18000080, 0xd3d940c0,<br>
                        >> -        0x18000080, 0xd3d940c1,
                    0x18000080, 0xd3d940c2, 0x18000080, 0xd3d940c3,<br>
                        >> -        0x18000080, 0xd3d940c4,
                    0x18000080, 0xd3d940c5, 0x18000080, 0xd3d940c6,<br>
                        >> -        0x18000080, 0xd3d940c7,
                    0x18000080, 0xd3d940c8, 0x18000080, 0xd3d940c9,<br>
                        >> -        0x18000080, 0xd3d940ca,
                    0x18000080, 0xd3d940cb, 0x18000080, 0xd3d940cc,<br>
                        >> -        0x18000080, 0xd3d940cd,
                    0x18000080, 0xd3d940ce, 0x18000080, 0xd3d940cf,<br>
                        >> -        0x18000080, 0xd3d940d0,
                    0x18000080, 0xd3d940d1, 0x18000080, 0xd3d940d2,<br>
                        >> -        0x18000080, 0xd3d940d3,
                    0x18000080, 0xd3d940d4, 0x18000080, 0xd3d940d5,<br>
                        >> -        0x18000080, 0xd3d940d6,
                    0x18000080, 0xd3d940d7, 0x18000080, 0xd3d940d8,<br>
                        >> -        0x18000080, 0xd3d940d9,
                    0x18000080, 0xd3d940da, 0x18000080, 0xd3d940db,<br>
                        >> -        0x18000080, 0xd3d940dc,
                    0x18000080, 0xd3d940dd, 0x18000080, 0xd3d940de,<br>
                        >> -        0x18000080, 0xd3d940df,
                    0x18000080, 0xd3d940e0, 0x18000080, 0xd3d940e1,<br>
                        >> -        0x18000080, 0xd3d940e2,
                    0x18000080, 0xd3d940e3, 0x18000080, 0xd3d940e4,<br>
                        >> -        0x18000080, 0xd3d940e5,
                    0x18000080, 0xd3d940e6, 0x18000080, 0xd3d940e7,<br>
                        >> -        0x18000080, 0xd3d940e8,
                    0x18000080, 0xd3d940e9, 0x18000080, 0xd3d940ea,<br>
                        >> -        0x18000080, 0xd3d940eb,
                    0x18000080, 0xd3d940ec, 0x18000080, 0xd3d940ed,<br>
                        >> -        0x18000080, 0xd3d940ee,
                    0x18000080, 0xd3d940ef, 0x18000080, 0xd3d940f0,<br>
                        >> -        0x18000080, 0xd3d940f1,
                    0x18000080, 0xd3d940f2, 0x18000080, 0xd3d940f3,<br>
                        >> -        0x18000080, 0xd3d940f4,
                    0x18000080, 0xd3d940f5, 0x18000080, 0xd3d940f6,<br>
                        >> -        0x18000080, 0xd3d940f7,
                    0x18000080, 0xd3d940f8, 0x18000080, 0xd3d940f9,<br>
                        >> -        0x18000080, 0xd3d940fa,
                    0x18000080, 0xd3d940fb, 0x18000080, 0xd3d940fc,<br>
                        >> -        0x18000080, 0xd3d940fd,
                    0x18000080, 0xd3d940fe, 0x18000080, 0xd3d940ff,<br>
                        >> -        0x18000080, 0xb07c0000,
                    0xbe8a00ff, 0x000000f8, 0xbf11080a, 0x7e000280,<br>
                        >> -        0x7e020280, 0x7e040280,
                    0x7e060280, 0x7e080280, 0x7e0a0280, 0x7e0c0280,<br>
                        >> -        0x7e0e0280, 0x808a880a,
                    0xbe80320a, 0xbf84fff5, 0xbf9c0000, 0xd28c0001,<br>
                        >> -        0x0001007f, 0xd28d0001,
                    0x0002027e, 0x10020288, 0xb88b0904, 0xb78b4000,<br>
                        >> -        0xd1196a01, 0x00001701,
                    0xbe8a0087, 0xbefc00c1, 0xd89c4000, 0x00020201,<br>
                        >> -        0xd89cc080, 0x00040401,
                    0x320202ff, 0x00000800, 0x808a810a, 0xbf84fff8,<br>
                        >> -        0xbf810000,<br>
                        >> +        0xb8840904, 0xb8851a04,
                    0xb8861344, 0xb8831804, 0x9208ff06, 0x00000280,<br>
                        >> +        0x9209a805, 0x920a8a04,
                    0x81080908, 0x81080a08, 0x81080308, 0x8e078208,<br>
                        >> +        0x81078407, 0xc0410080,
                    0x00000007, 0xbf8c0000, 0xd3d94000, 0x18000080,<br>
                        >> +        0xd3d94001, 0x18000080,
                    0xd3d94002, 0x18000080, 0xd3d94003, 0x18000080,<br>
                        >> +        0xd3d94004, 0x18000080,
                    0xd3d94005, 0x18000080, 0xd3d94006, 0x18000080,<br>
                        >> +        0xd3d94007, 0x18000080,
                    0xd3d94008, 0x18000080, 0xd3d94009, 0x18000080,<br>
                        >> +        0xd3d9400a, 0x18000080,
                    0xd3d9400b, 0x18000080, 0xd3d9400c, 0x18000080,<br>
                        >> +        0xd3d9400d, 0x18000080,
                    0xd3d9400e, 0x18000080, 0xd3d9400f, 0x18000080,<br>
                        >> +        0xd3d94010, 0x18000080,
                    0xd3d94011, 0x18000080, 0xd3d94012, 0x18000080,<br>
                        >> +        0xd3d94013, 0x18000080,
                    0xd3d94014, 0x18000080, 0xd3d94015, 0x18000080,<br>
                        >> +        0xd3d94016, 0x18000080,
                    0xd3d94017, 0x18000080, 0xd3d94018, 0x18000080,<br>
                        >> +        0xd3d94019, 0x18000080,
                    0xd3d9401a, 0x18000080, 0xd3d9401b, 0x18000080,<br>
                        >> +        0xd3d9401c, 0x18000080,
                    0xd3d9401d, 0x18000080, 0xd3d9401e, 0x18000080,<br>
                        >> +        0xd3d9401f, 0x18000080,
                    0xd3d94020, 0x18000080, 0xd3d94021, 0x18000080,<br>
                        >> +        0xd3d94022, 0x18000080,
                    0xd3d94023, 0x18000080, 0xd3d94024, 0x18000080,<br>
                        >> +        0xd3d94025, 0x18000080,
                    0xd3d94026, 0x18000080, 0xd3d94027, 0x18000080,<br>
                        >> +        0xd3d94028, 0x18000080,
                    0xd3d94029, 0x18000080, 0xd3d9402a, 0x18000080,<br>
                        >> +        0xd3d9402b, 0x18000080,
                    0xd3d9402c, 0x18000080, 0xd3d9402d, 0x18000080,<br>
                        >> +        0xd3d9402e, 0x18000080,
                    0xd3d9402f, 0x18000080, 0xd3d94030, 0x18000080,<br>
                        >> +        0xd3d94031, 0x18000080,
                    0xd3d94032, 0x18000080, 0xd3d94033, 0x18000080,<br>
                        >> +        0xd3d94034, 0x18000080,
                    0xd3d94035, 0x18000080, 0xd3d94036, 0x18000080,<br>
                        >> +        0xd3d94037, 0x18000080,
                    0xd3d94038, 0x18000080, 0xd3d94039, 0x18000080,<br>
                        >> +        0xd3d9403a, 0x18000080,
                    0xd3d9403b, 0x18000080, 0xd3d9403c, 0x18000080,<br>
                        >> +        0xd3d9403d, 0x18000080,
                    0xd3d9403e, 0x18000080, 0xd3d9403f, 0x18000080,<br>
                        >> +        0xd3d94040, 0x18000080,
                    0xd3d94041, 0x18000080, 0xd3d94042, 0x18000080,<br>
                        >> +        0xd3d94043, 0x18000080,
                    0xd3d94044, 0x18000080, 0xd3d94045, 0x18000080,<br>
                        >> +        0xd3d94046, 0x18000080,
                    0xd3d94047, 0x18000080, 0xd3d94048, 0x18000080,<br>
                        >> +        0xd3d94049, 0x18000080,
                    0xd3d9404a, 0x18000080, 0xd3d9404b, 0x18000080,<br>
                        >> +        0xd3d9404c, 0x18000080,
                    0xd3d9404d, 0x18000080, 0xd3d9404e, 0x18000080,<br>
                        >> +        0xd3d9404f, 0x18000080,
                    0xd3d94050, 0x18000080, 0xd3d94051, 0x18000080,<br>
                        >> +        0xd3d94052, 0x18000080,
                    0xd3d94053, 0x18000080, 0xd3d94054, 0x18000080,<br>
                        >> +        0xd3d94055, 0x18000080,
                    0xd3d94056, 0x18000080, 0xd3d94057, 0x18000080,<br>
                        >> +        0xd3d94058, 0x18000080,
                    0xd3d94059, 0x18000080, 0xd3d9405a, 0x18000080,<br>
                        >> +        0xd3d9405b, 0x18000080,
                    0xd3d9405c, 0x18000080, 0xd3d9405d, 0x18000080,<br>
                        >> +        0xd3d9405e, 0x18000080,
                    0xd3d9405f, 0x18000080, 0xd3d94060, 0x18000080,<br>
                        >> +        0xd3d94061, 0x18000080,
                    0xd3d94062, 0x18000080, 0xd3d94063, 0x18000080,<br>
                        >> +        0xd3d94064, 0x18000080,
                    0xd3d94065, 0x18000080, 0xd3d94066, 0x18000080,<br>
                        >> +        0xd3d94067, 0x18000080,
                    0xd3d94068, 0x18000080, 0xd3d94069, 0x18000080,<br>
                        >> +        0xd3d9406a, 0x18000080,
                    0xd3d9406b, 0x18000080, 0xd3d9406c, 0x18000080,<br>
                        >> +        0xd3d9406d, 0x18000080,
                    0xd3d9406e, 0x18000080, 0xd3d9406f, 0x18000080,<br>
                        >> +        0xd3d94070, 0x18000080,
                    0xd3d94071, 0x18000080, 0xd3d94072, 0x18000080,<br>
                        >> +        0xd3d94073, 0x18000080,
                    0xd3d94074, 0x18000080, 0xd3d94075, 0x18000080,<br>
                        >> +        0xd3d94076, 0x18000080,
                    0xd3d94077, 0x18000080, 0xd3d94078, 0x18000080,<br>
                        >> +        0xd3d94079, 0x18000080,
                    0xd3d9407a, 0x18000080, 0xd3d9407b, 0x18000080,<br>
                        >> +        0xd3d9407c, 0x18000080,
                    0xd3d9407d, 0x18000080, 0xd3d9407e, 0x18000080,<br>
                        >> +        0xd3d9407f, 0x18000080,
                    0xd3d94080, 0x18000080, 0xd3d94081, 0x18000080,<br>
                        >> +        0xd3d94082, 0x18000080,
                    0xd3d94083, 0x18000080, 0xd3d94084, 0x18000080,<br>
                        >> +        0xd3d94085, 0x18000080,
                    0xd3d94086, 0x18000080, 0xd3d94087, 0x18000080,<br>
                        >> +        0xd3d94088, 0x18000080,
                    0xd3d94089, 0x18000080, 0xd3d9408a, 0x18000080,<br>
                        >> +        0xd3d9408b, 0x18000080,
                    0xd3d9408c, 0x18000080, 0xd3d9408d, 0x18000080,<br>
                        >> +        0xd3d9408e, 0x18000080,
                    0xd3d9408f, 0x18000080, 0xd3d94090, 0x18000080,<br>
                        >> +        0xd3d94091, 0x18000080,
                    0xd3d94092, 0x18000080, 0xd3d94093, 0x18000080,<br>
                        >> +        0xd3d94094, 0x18000080,
                    0xd3d94095, 0x18000080, 0xd3d94096, 0x18000080,<br>
                        >> +        0xd3d94097, 0x18000080,
                    0xd3d94098, 0x18000080, 0xd3d94099, 0x18000080,<br>
                        >> +        0xd3d9409a, 0x18000080,
                    0xd3d9409b, 0x18000080, 0xd3d9409c, 0x18000080,<br>
                        >> +        0xd3d9409d, 0x18000080,
                    0xd3d9409e, 0x18000080, 0xd3d9409f, 0x18000080,<br>
                        >> +        0xd3d940a0, 0x18000080,
                    0xd3d940a1, 0x18000080, 0xd3d940a2, 0x18000080,<br>
                        >> +        0xd3d940a3, 0x18000080,
                    0xd3d940a4, 0x18000080, 0xd3d940a5, 0x18000080,<br>
                        >> +        0xd3d940a6, 0x18000080,
                    0xd3d940a7, 0x18000080, 0xd3d940a8, 0x18000080,<br>
                        >> +        0xd3d940a9, 0x18000080,
                    0xd3d940aa, 0x18000080, 0xd3d940ab, 0x18000080,<br>
                        >> +        0xd3d940ac, 0x18000080,
                    0xd3d940ad, 0x18000080, 0xd3d940ae, 0x18000080,<br>
                        >> +        0xd3d940af, 0x18000080,
                    0xd3d940b0, 0x18000080, 0xd3d940b1, 0x18000080,<br>
                        >> +        0xd3d940b2, 0x18000080,
                    0xd3d940b3, 0x18000080, 0xd3d940b4, 0x18000080,<br>
                        >> +        0xd3d940b5, 0x18000080,
                    0xd3d940b6, 0x18000080, 0xd3d940b7, 0x18000080,<br>
                        >> +        0xd3d940b8, 0x18000080,
                    0xd3d940b9, 0x18000080, 0xd3d940ba, 0x18000080,<br>
                        >> +        0xd3d940bb, 0x18000080,
                    0xd3d940bc, 0x18000080, 0xd3d940bd, 0x18000080,<br>
                        >> +        0xd3d940be, 0x18000080,
                    0xd3d940bf, 0x18000080, 0xd3d940c0, 0x18000080,<br>
                        >> +        0xd3d940c1, 0x18000080,
                    0xd3d940c2, 0x18000080, 0xd3d940c3, 0x18000080,<br>
                        >> +        0xd3d940c4, 0x18000080,
                    0xd3d940c5, 0x18000080, 0xd3d940c6, 0x18000080,<br>
                        >> +        0xd3d940c7, 0x18000080,
                    0xd3d940c8, 0x18000080, 0xd3d940c9, 0x18000080,<br>
                        >> +        0xd3d940ca, 0x18000080,
                    0xd3d940cb, 0x18000080, 0xd3d940cc, 0x18000080,<br>
                        >> +        0xd3d940cd, 0x18000080,
                    0xd3d940ce, 0x18000080, 0xd3d940cf, 0x18000080,<br>
                        >> +        0xd3d940d0, 0x18000080,
                    0xd3d940d1, 0x18000080, 0xd3d940d2, 0x18000080,<br>
                        >> +        0xd3d940d3, 0x18000080,
                    0xd3d940d4, 0x18000080, 0xd3d940d5, 0x18000080,<br>
                        >> +        0xd3d940d6, 0x18000080,
                    0xd3d940d7, 0x18000080, 0xd3d940d8, 0x18000080,<br>
                        >> +        0xd3d940d9, 0x18000080,
                    0xd3d940da, 0x18000080, 0xd3d940db, 0x18000080,<br>
                        >> +        0xd3d940dc, 0x18000080,
                    0xd3d940dd, 0x18000080, 0xd3d940de, 0x18000080,<br>
                        >> +        0xd3d940df, 0x18000080,
                    0xd3d940e0, 0x18000080, 0xd3d940e1, 0x18000080,<br>
                        >> +        0xd3d940e2, 0x18000080,
                    0xd3d940e3, 0x18000080, 0xd3d940e4, 0x18000080,<br>
                        >> +        0xd3d940e5, 0x18000080,
                    0xd3d940e6, 0x18000080, 0xd3d940e7, 0x18000080,<br>
                        >> +        0xd3d940e8, 0x18000080,
                    0xd3d940e9, 0x18000080, 0xd3d940ea, 0x18000080,<br>
                        >> +        0xd3d940eb, 0x18000080,
                    0xd3d940ec, 0x18000080, 0xd3d940ed, 0x18000080,<br>
                        >> +        0xd3d940ee, 0x18000080,
                    0xd3d940ef, 0x18000080, 0xd3d940f0, 0x18000080,<br>
                        >> +        0xd3d940f1, 0x18000080,
                    0xd3d940f2, 0x18000080, 0xd3d940f3, 0x18000080,<br>
                        >> +        0xd3d940f4, 0x18000080,
                    0xd3d940f5, 0x18000080, 0xd3d940f6, 0x18000080,<br>
                        >> +        0xd3d940f7, 0x18000080,
                    0xd3d940f8, 0x18000080, 0xd3d940f9, 0x18000080,<br>
                        >> +        0xd3d940fa, 0x18000080,
                    0xd3d940fb, 0x18000080, 0xd3d940fc, 0x18000080,<br>
                        >> +        0xd3d940fd, 0x18000080,
                    0xd3d940fe, 0x18000080, 0xd3d940ff, 0x18000080,<br>
                        >> +        0xb07c0000, 0xbe8a00ff,
                    0x000000f8, 0xbf11080a, 0x7e000280, 0x7e020280,<br>
                        >> +        0x7e040280, 0x7e060280,
                    0x7e080280, 0x7e0a0280, 0x7e0c0280, 0x7e0e0280,<br>
                        >> +        0x808a880a, 0xbe80320a,
                    0xbf84fff5, 0xbf9c0000, 0xd28c0001, 0x0001007f,<br>
                        >> +        0xd28d0001, 0x0002027e,
                    0x10020288, 0xbe8b0004, 0xb78b4000, 0xd1196a01,<br>
                        >> +        0x00001701, 0xbe8a0087,
                    0xbefc00c1, 0xd89c4000, 0x00020201, 0xd89cc080,<br>
                        >> +        0x00040401, 0x320202ff,
                    0x00000800, 0x808a810a, 0xbf84fff8, 0xbf810000,<br>
                        >>    };<br>
                        >>    <br>
                        >>    const struct soc15_reg_entry
                    vgpr_init_regs_aldebaran[] = {<br>
                        >> @@ -183,7 +188,7 @@
                    const struct soc15_reg_entry
                    vgpr_init_regs_aldebaran[] = {<br>
                        >>           { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_NUM_THREAD_Y), 4 },<br>
                        >>           { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_NUM_THREAD_Z), 1 },<br>
                        >>           { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC1), 0xbf },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC2), 0x400004 },  /* 64KB LDS */<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC2), 0x400006 },  /* 64KB LDS */<br>
                        >>           { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC3), 0x3F }, /*  63 - accum-offset
                    = 256 */<br>
                        >>           { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE0), 0xffffffff },<br>
                        >>           { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE1), 0xffffffff },<br>
                        >> @@ -195,262 +200,488 @@
                    const struct soc15_reg_entry
                    vgpr_init_regs_aldebaran[] = {<br>
                        >>           { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE7), 0xffffffff },<br>
                        >>    };<br>
                        >>    <br>
                        >> -static const u32
                    sgpr_init_compute_shader_aldebaran[] = {<br>
                        >> -        0xb8840904, 0xb8851a04,
                    0xb8861344, 0x9207c006, 0x92088405, 0x81070807,<br>
                        >> -        0x81070407, 0x8e078207,
                    0xbefc0006, 0xbf800000, 0xbf900001, 0xbe88008f,<br>
                        >> -        0xc0410200, 0x00000007,
                    0xb07c0000, 0xbe8000ff, 0x0000005f, 0xbee50080,<br>
                        >> -        0xbe812c65, 0xbe822c65,
                    0xbe832c65, 0xbe842c65, 0xbe852c65, 0xb77c0005,<br>
                        >> -        0x80808500, 0xbf84fff8,
                    0xbe800080, 0xbf810000,<br>
                        >> +static const u32
                    sgpr112_init_compute_shader_aldebaran[] = {<br>
                        >> +        0xb8840904, 0xb8851a04,
                    0xb8861344, 0xb8831804, 0x9208ff06, 0x00000280,<br>
                        >> +        0x9209a805, 0x920a8a04,
                    0x81080908, 0x81080a08, 0x81080308, 0x8e078208,<br>
                        >> +        0x81078407, 0xc0410080,
                    0x00000007, 0xbf8c0000, 0xbf8e003f, 0xc0030200,<br>
                        >> +        0x00000000, 0xbf8c0000,
                    0xbf06ff08, 0xdeadbeaf, 0xbf84fff9, 0x81028102,<br>
                        >> +        0xc0410080, 0x00000007,
                    0xbf8c0000, 0xbefc0080, 0xbe880080, 0xbe890080,<br>
                        >> +        0xbe8a0080, 0xbe8b0080,
                    0xbe8c0080, 0xbe8d0080, 0xbe8e0080, 0xbe8f0080,<br>
                        >> +        0xbe900080, 0xbe910080,
                    0xbe920080, 0xbe930080, 0xbe940080, 0xbe950080,<br>
                        >> +        0xbe960080, 0xbe970080,
                    0xbe980080, 0xbe990080, 0xbe9a0080, 0xbe9b0080,<br>
                        >> +        0xbe9c0080, 0xbe9d0080,
                    0xbe9e0080, 0xbe9f0080, 0xbea00080, 0xbea10080,<br>
                        >> +        0xbea20080, 0xbea30080,
                    0xbea40080, 0xbea50080, 0xbea60080, 0xbea70080,<br>
                        >> +        0xbea80080, 0xbea90080,
                    0xbeaa0080, 0xbeab0080, 0xbeac0080, 0xbead0080,<br>
                        >> +        0xbeae0080, 0xbeaf0080,
                    0xbeb00080, 0xbeb10080, 0xbeb20080, 0xbeb30080,<br>
                        >> +        0xbeb40080, 0xbeb50080,
                    0xbeb60080, 0xbeb70080, 0xbeb80080, 0xbeb90080,<br>
                        >> +        0xbeba0080, 0xbebb0080,
                    0xbebc0080, 0xbebd0080, 0xbebe0080, 0xbebf0080,<br>
                        >> +        0xbec00080, 0xbec10080,
                    0xbec20080, 0xbec30080, 0xbec40080, 0xbec50080,<br>
                        >> +        0xbec60080, 0xbec70080,
                    0xbec80080, 0xbec90080, 0xbeca0080, 0xbecb0080,<br>
                        >> +        0xbecc0080, 0xbecd0080,
                    0xbece0080, 0xbecf0080, 0xbed00080, 0xbed10080,<br>
                        >> +        0xbed20080, 0xbed30080,
                    0xbed40080, 0xbed50080, 0xbed60080, 0xbed70080,<br>
                        >> +        0xbed80080, 0xbed90080,
                    0xbeda0080, 0xbedb0080, 0xbedc0080, 0xbedd0080,<br>
                        >> +        0xbede0080, 0xbedf0080,
                    0xbee00080, 0xbee10080, 0xbee20080, 0xbee30080,<br>
                        >> +        0xbee40080, 0xbee50080,
                    0xbf810000<br>
                        >>    };<br>
                        >>    <br>
                        >> -static const struct soc15_reg_entry
                    sgpr1_init_regs_aldebaran[] = {<br>
                        >> +const struct soc15_reg_entry
                    sgpr112_init_regs_aldebaran[] = {<br>
                        >>           { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_RESOURCE_LIMITS), 0x0000000 },<br>
                        >>           { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_NUM_THREAD_X), 0x40 },<br>
                        >>           { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_NUM_THREAD_Y), 8 },<br>
                        >>           { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_NUM_THREAD_Z), 1 },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC1), 0x240 }, /* (80 GPRS):
                    SGPRS[9:6] VGPRS[5:0] */<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC2), 0x4 }, /* USER_SGPR[5:1]*/<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC3), 0x3F }, /*  63 - accum-offset
                    = 256 */<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE0), 0x000000ff },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE1), 0x000000ff },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE2), 0x000000ff },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE3), 0x000000ff },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE4), 0x000000ff },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE5), 0x000000ff },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE6), 0x000000ff },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE7), 0x000000ff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC1), 0x2c0 },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC2), 0x6 },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC3), 0x0 },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE0), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE1), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE2), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE3), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE4), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE5), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE6), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE7), 0xffffffff },<br>
                        >> +};<br>
                        >> +<br>
                        >> +static const u32
                    sgpr96_init_compute_shader_aldebaran[] = {<br>
                        >> +        0xb8840904, 0xb8851a04,
                    0xb8861344, 0xb8831804, 0x9208ff06, 0x00000280,<br>
                        >> +        0x9209a805, 0x920a8a04,
                    0x81080908, 0x81080a08, 0x81080308, 0x8e078208,<br>
                        >> +        0x81078407, 0xc0410080,
                    0x00000007, 0xbf8c0000, 0xbf8e003f, 0xc0030200,<br>
                        >> +        0x00000000, 0xbf8c0000,
                    0xbf06ff08, 0xdeadbeaf, 0xbf84fff9, 0x81028102,<br>
                        >> +        0xc0410080, 0x00000007,
                    0xbf8c0000, 0xbefc0080, 0xbe880080, 0xbe890080,<br>
                        >> +        0xbe8a0080, 0xbe8b0080,
                    0xbe8c0080, 0xbe8d0080, 0xbe8e0080, 0xbe8f0080,<br>
                        >> +        0xbe900080, 0xbe910080,
                    0xbe920080, 0xbe930080, 0xbe940080, 0xbe950080,<br>
                        >> +        0xbe960080, 0xbe970080,
                    0xbe980080, 0xbe990080, 0xbe9a0080, 0xbe9b0080,<br>
                        >> +        0xbe9c0080, 0xbe9d0080,
                    0xbe9e0080, 0xbe9f0080, 0xbea00080, 0xbea10080,<br>
                        >> +        0xbea20080, 0xbea30080,
                    0xbea40080, 0xbea50080, 0xbea60080, 0xbea70080,<br>
                        >> +        0xbea80080, 0xbea90080,
                    0xbeaa0080, 0xbeab0080, 0xbeac0080, 0xbead0080,<br>
                        >> +        0xbeae0080, 0xbeaf0080,
                    0xbeb00080, 0xbeb10080, 0xbeb20080, 0xbeb30080,<br>
                        >> +        0xbeb40080, 0xbeb50080,
                    0xbeb60080, 0xbeb70080, 0xbeb80080, 0xbeb90080,<br>
                        >> +        0xbeba0080, 0xbebb0080,
                    0xbebc0080, 0xbebd0080, 0xbebe0080, 0xbebf0080,<br>
                        >> +        0xbec00080, 0xbec10080,
                    0xbec20080, 0xbec30080, 0xbec40080, 0xbec50080,<br>
                        >> +        0xbec60080, 0xbec70080,
                    0xbec80080, 0xbec90080, 0xbeca0080, 0xbecb0080,<br>
                        >> +        0xbecc0080, 0xbecd0080,
                    0xbece0080, 0xbecf0080, 0xbed00080, 0xbed10080,<br>
                        >> +        0xbed20080, 0xbed30080,
                    0xbed40080, 0xbed50080, 0xbed60080, 0xbed70080,<br>
                        >> +        0xbed80080, 0xbed90080,
                    0xbf810000,<br>
                        >>    };<br>
                        >>    <br>
                        >> -static const struct soc15_reg_entry
                    sgpr2_init_regs_aldebaran[] = {<br>
                        >> +const struct soc15_reg_entry
                    sgpr96_init_regs_aldebaran[] = {<br>
                        >>           { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_RESOURCE_LIMITS), 0x0000000 },<br>
                        >>           { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_NUM_THREAD_X), 0x40 },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_NUM_THREAD_Y), 8 },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_NUM_THREAD_Y), 0xc },<br>
                        >>           { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_NUM_THREAD_Z), 1 },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC1), 0x240 }, /* (80 GPRS) */<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC2), 0x4 }, /* USER_SGPR[5:1]*/<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC3), 0x3F }, /*  63 - accum-offset
                    = 256 */<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE0), 0x0000ff00 },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE1), 0x0000ff00 },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE2), 0x0000ff00 },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE3), 0x0000ff00 },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE4), 0x0000ff00 },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE5), 0x0000ff00 },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE6), 0x0000ff00 },<br>
                        >> -        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE7), 0x0000ff00 },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC1), 0x240 },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC2), 0x6 },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC3), 0x0 },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE0), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE1), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE2), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE3), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE4), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE5), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE6), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE7), 0xffffffff },<br>
                        >>    };<br>
                        >>    <br>
                        >> -static int
                    gfx_v9_4_2_check_gprs_init_coverage(struct
                    amdgpu_device *adev,<br>
                        >>
                    -                                              
                    uint32_t *wb)<br>
                        >> -{<br>
                        >> -        uint32_t se_id, cu_id,
                    simd_id;<br>
                        >> -        uint32_t simd_cnt = 0;<br>
                        >> -        uint32_t se_offset, cu_offset,
                    data;<br>
                        >> -<br>
                        >> -        for (se_id = 0; se_id <
                    adev->gfx.config.max_shader_engines; se_id++) {<br>
                        >> -                se_offset = se_id * 16
                    * 4;<br>
                        >> -                for (cu_id = 0; cu_id
                    < 16; cu_id++) {<br>
                        >> -                        cu_offset =
                    cu_id * 4;<br>
                        >> -                        for (simd_id =
                    0; simd_id < 4; simd_id++) {<br>
                        >> -                                data =
                    wb[se_offset + cu_offset + simd_id];<br>
                        >> -                                if
                    (data == 0xF)<br>
                        >>
                    -                                        simd_cnt++;<br>
                        >> -                        }<br>
                        >> -                }<br>
                        >> -        }<br>
                        >> -<br>
                        >> -        if
                    (adev->gfx.cu_info.number * 4 == simd_cnt)<br>
                        >> -                return 0;<br>
                        >> -<br>
                        >> -        dev_warn(adev->dev, "SIMD
                    Count: %d, %d\n",<br>
                        >> -                
                    adev->gfx.cu_info.number * 4, simd_cnt);<br>
                        >> -<br>
                        >> -        for (se_id = 0; se_id <
                    adev->gfx.config.max_shader_engines; se_id++) {<br>
                        >> -                se_offset = se_id * 16
                    * 4;<br>
                        >> -                for (cu_id = 0; cu_id
                    < 16; cu_id++) {<br>
                        >> -                        cu_offset =
                    cu_id * 4;<br>
                        >> -                        for (simd_id =
                    0; simd_id < 4; simd_id++) {<br>
                        >> -                                data =
                    wb[se_offset + cu_offset + simd_id];<br>
                        >> -                                if
                    (data != 0xF)<br>
                        >>
                    -                                       
                    dev_warn(adev->dev, "SE[%d]CU[%d]SIMD[%d]: isn't
                    inited\n",<br>
                        >>
                    -                                               
                    se_id, cu_id, simd_id);<br>
                        >> -                        }<br>
                        >> -                }<br>
                        >> -        }<br>
                        >> +static const u32
                    sgpr64_init_compute_shader_aldebaran[] = {<br>
                        >> +        0xb8840904, 0xb8851a04,
                    0xb8861344, 0xb8831804, 0x9208ff06, 0x00000280,<br>
                        >> +        0x9209a805, 0x920a8a04,
                    0x81080908, 0x81080a08, 0x81080308, 0x8e078208,<br>
                        >> +        0x81078407, 0xc0410080,
                    0x00000007, 0xbf8c0000, 0xbefc0080, 0xbe880080,<br>
                        >> +        0xbe890080, 0xbe8a0080,
                    0xbe8b0080, 0xbe8c0080, 0xbe8d0080, 0xbe8e0080,<br>
                        >> +        0xbe8f0080, 0xbe900080,
                    0xbe910080, 0xbe920080, 0xbe930080, 0xbe940080,<br>
                        >> +        0xbe950080, 0xbe960080,
                    0xbe970080, 0xbe980080, 0xbe990080, 0xbe9a0080,<br>
                        >> +        0xbe9b0080, 0xbe9c0080,
                    0xbe9d0080, 0xbe9e0080, 0xbe9f0080, 0xbea00080,<br>
                        >> +        0xbea10080, 0xbea20080,
                    0xbea30080, 0xbea40080, 0xbea50080, 0xbea60080,<br>
                        >> +        0xbea70080, 0xbea80080,
                    0xbea90080, 0xbeaa0080, 0xbeab0080, 0xbeac0080,<br>
                        >> +        0xbead0080, 0xbeae0080,
                    0xbeaf0080, 0xbeb00080, 0xbeb10080, 0xbeb20080,<br>
                        >> +        0xbeb30080, 0xbeb40080,
                    0xbeb50080, 0xbeb60080, 0xbeb70080, 0xbeb80080,<br>
                        >> +        0xbeb90080, 0xbf810000,<br>
                        >> +};<br>
                        >>    <br>
                        >> -        return -EFAULT;<br>
                        >> -}<br>
                        >> +const struct soc15_reg_entry
                    sgpr64_init_regs_aldebaran[] = {<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_RESOURCE_LIMITS), 0x0000000 },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_NUM_THREAD_X), 0x40 },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_NUM_THREAD_Y), 0x10 },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_NUM_THREAD_Z), 1 },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC1), 0x1c0 },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC2), 0x6 },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_PGM_RSRC3), 0x0 },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE0), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE1), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE2), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE3), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE4), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE5), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE6), 0xffffffff },<br>
                        >> +        { SOC15_REG_ENTRY(GC, 0,
                    regCOMPUTE_STATIC_THREAD_MGMT_SE7),<br>
                        >> +0xffffffff }, };<br>
                        >>    <br>
                        >>    static int
                    gfx_v9_4_2_run_shader(struct amdgpu_device *adev,<br>
                        >> -                                 const
                    uint32_t *shader_ptr, uint32_t shader_size,<br>
                        >> -                                 const
                    struct soc15_reg_entry *init_regs, uint32_t
                    regs_size,<br>
                        >> -                                
                    uint32_t compute_dim_x, u64 wb_gpu_addr)<br>
                        >> +                                
                    struct amdgpu_ring *ring,<br>
                        >> +                                
                    struct amdgpu_ib *ib,<br>
                        >> +                                 const
                    u32 *shader_ptr, u32 shader_size,<br>
                        >> +                                 const
                    struct soc15_reg_entry *init_regs, u32 regs_size,<br>
                        >> +                                 u32
                    compute_dim_x, u64 wb_gpu_addr, u32 pattern,<br>
                        >> +                                
                    struct dma_fence **fence_ptr)<br>
                        >>    {<br>
                        >> -        struct amdgpu_ring *ring =
                    &adev->gfx.compute_ring[0];<br>
                        >> -        struct amdgpu_ib ib;<br>
                        >> -        struct dma_fence *f = NULL;<br>
                        >>           int r, i;<br>
                        >>           uint32_t total_size,
                    shader_offset;<br>
                        >>           u64 gpu_addr;<br>
                        >>    <br>
                        >> -        total_size = (regs_size * 3 +
                    4 + 4 + 5 + 2) * 4;<br>
                        >> +        total_size = (regs_size * 3 +
                    4 + 5 + 5) * 4;<br>
                        >>           total_size =
                    ALIGN(total_size, 256);<br>
                        >>           shader_offset = total_size;<br>
                        >>           total_size +=
                    ALIGN(shader_size, 256);<br>
                        >>    <br>
                        >>           /* allocate an indirect
                    buffer to put the commands in */<br>
                        >> -        memset(&ib, 0,
                    sizeof(ib));<br>
                        >> +        memset(ib, 0, sizeof(*ib));<br>
                        >>           r = amdgpu_ib_get(adev, NULL,
                    total_size,<br>
                        >>
                    -                                       
                    AMDGPU_IB_POOL_DIRECT, &ib);<br>
                        >>
                    +                                       
                    AMDGPU_IB_POOL_DIRECT, ib);<br>
                        >>           if (r) {<br>
                        >> -                DRM_ERROR("amdgpu:
                    failed to get ib (%d).\n", r);<br>
                        >> +                dev_err(adev->dev,
                    "failed to get ib (%d).\n", r);<br>
                        >>                   return r;<br>
                        >>           }<br>
                        >>    <br>
                        >>           /* load the compute shaders
                    */<br>
                        >>           for (i = 0; i <
                    shader_size/sizeof(u32); i++)<br>
                        >> -                ib.ptr[i +
                    (shader_offset / 4)] = shader_ptr[i];<br>
                        >> +                ib->ptr[i +
                    (shader_offset / 4)] = shader_ptr[i];<br>
                        >>    <br>
                        >>           /* init the ib length to 0 */<br>
                        >> -        ib.length_dw = 0;<br>
                        >> +        ib->length_dw = 0;<br>
                        >>    <br>
                        >>           /* write the register state
                    for the compute dispatch */<br>
                        >>           for (i = 0; i < regs_size;
                    i++) {<br>
                        >> -                ib.ptr[ib.length_dw++]
                    = PACKET3(PACKET3_SET_SH_REG, 1);<br>
                        >> -                ib.ptr[ib.length_dw++]
                    = SOC15_REG_ENTRY_OFFSET(init_regs[i])<br>
                        >> +               
                    ib->ptr[ib->length_dw++] =
                    PACKET3(PACKET3_SET_SH_REG, 1);<br>
                        >> +               
                    ib->ptr[ib->length_dw++] =
                    SOC15_REG_ENTRY_OFFSET(init_regs[i])<br>
                       
>>                                                                  
                    - PACKET3_SET_SH_REG_START;<br>
                        >> -                ib.ptr[ib.length_dw++]
                    = init_regs[i].reg_value;<br>
                        >> +               
                    ib->ptr[ib->length_dw++] =
                    init_regs[i].reg_value;<br>
                        >>           }<br>
                        >>    <br>
                        >>           /* write the shader start
                    address: mmCOMPUTE_PGM_LO, mmCOMPUTE_PGM_HI */<br>
                        >> -        gpu_addr = (ib.gpu_addr +
                    (u64)shader_offset) >> 8;<br>
                        >> -        ib.ptr[ib.length_dw++] =
                    PACKET3(PACKET3_SET_SH_REG, 2);<br>
                        >> -        ib.ptr[ib.length_dw++] =
                    SOC15_REG_OFFSET(GC, 0, regCOMPUTE_PGM_LO)<br>
                        >> +        gpu_addr = (ib->gpu_addr +
                    (u64)shader_offset) >> 8;<br>
                        >> +        ib->ptr[ib->length_dw++]
                    = PACKET3(PACKET3_SET_SH_REG, 2);<br>
                        >> +        ib->ptr[ib->length_dw++]
                    = SOC15_REG_OFFSET(GC, 0,<br>
                        >> +regCOMPUTE_PGM_LO)<br>
                       
                    >>                                                          
                    - PACKET3_SET_SH_REG_START;<br>
                        >> -        ib.ptr[ib.length_dw++] =
                    lower_32_bits(gpu_addr);<br>
                        >> -        ib.ptr[ib.length_dw++] =
                    upper_32_bits(gpu_addr);<br>
                        >> +        ib->ptr[ib->length_dw++]
                    = lower_32_bits(gpu_addr);<br>
                        >> +        ib->ptr[ib->length_dw++]
                    = upper_32_bits(gpu_addr);<br>
                        >>    <br>
                        >>           /* write the wb buffer
                    address */<br>
                        >> -        ib.ptr[ib.length_dw++] =
                    PACKET3(PACKET3_SET_SH_REG, 2);<br>
                        >> -        ib.ptr[ib.length_dw++] =
                    SOC15_REG_OFFSET(GC, 0, regCOMPUTE_USER_DATA_0)<br>
                        >> +        ib->ptr[ib->length_dw++]
                    = PACKET3(PACKET3_SET_SH_REG, 3);<br>
                        >> +        ib->ptr[ib->length_dw++]
                    = SOC15_REG_OFFSET(GC, 0,<br>
                        >> +regCOMPUTE_USER_DATA_0)<br>
                       
                    >>                                                          
                    - PACKET3_SET_SH_REG_START;<br>
                        >> -        ib.ptr[ib.length_dw++] =
                    lower_32_bits(wb_gpu_addr);<br>
                        >> -        ib.ptr[ib.length_dw++] =
                    upper_32_bits(wb_gpu_addr);<br>
                        >> +        ib->ptr[ib->length_dw++]
                    = lower_32_bits(wb_gpu_addr);<br>
                        >> +        ib->ptr[ib->length_dw++]
                    = upper_32_bits(wb_gpu_addr);<br>
                        >> +        ib->ptr[ib->length_dw++]
                    = pattern;<br>
                        >>    <br>
                        >>           /* write dispatch packet */<br>
                        >> -        ib.ptr[ib.length_dw++] =
                    PACKET3(PACKET3_DISPATCH_DIRECT, 3);<br>
                        >> -        ib.ptr[ib.length_dw++] =
                    compute_dim_x; /* x */<br>
                        >> -        ib.ptr[ib.length_dw++] = 1; /*
                    y */<br>
                        >> -        ib.ptr[ib.length_dw++] = 1; /*
                    z */<br>
                        >> -        ib.ptr[ib.length_dw++] =<br>
                        >> +        ib->ptr[ib->length_dw++]
                    = PACKET3(PACKET3_DISPATCH_DIRECT, 3);<br>
                        >> +        ib->ptr[ib->length_dw++]
                    = compute_dim_x; /* x */<br>
                        >> +        ib->ptr[ib->length_dw++]
                    = 1; /* y */<br>
                        >> +        ib->ptr[ib->length_dw++]
                    = 1; /* z */<br>
                        >> +        ib->ptr[ib->length_dw++]
                    =<br>
                        >>                   REG_SET_FIELD(0,
                    COMPUTE_DISPATCH_INITIATOR, COMPUTE_SHADER_EN,<br>
                        >> 1);<br>
                        >>    <br>
                        >> -        /* write CS partial flush
                    packet */<br>
                        >> -        ib.ptr[ib.length_dw++] =
                    PACKET3(PACKET3_EVENT_WRITE, 0);<br>
                        >> -        ib.ptr[ib.length_dw++] =
                    EVENT_TYPE(7) | EVENT_INDEX(4);<br>
                        >> -<br>
                        >>           /* schedule the ib on the ring
                    */<br>
                        >> -        r = amdgpu_ib_schedule(ring,
                    1, &ib, NULL, &f);<br>
                        >> +        r = amdgpu_ib_schedule(ring,
                    1, ib, NULL, fence_ptr);<br>
                        >>           if (r) {<br>
                        >> -                DRM_ERROR("amdgpu: ib
                    submit failed (%d).\n", r);<br>
                        >> -                goto fail;<br>
                        >> +                dev_err(adev->dev,
                    "ib submit failed (%d).\n", r);<br>
                        >> +                amdgpu_ib_free(adev,
                    ib, NULL);<br>
                        >>           }<br>
                        >> +        return r;<br>
                        >> +}<br>
                        >>    <br>
                        >> -        /* wait for the GPU to finish
                    processing the IB */<br>
                        >> -        r = dma_fence_wait(f, false);<br>
                        >> -        if (r) {<br>
                        >> -                DRM_ERROR("amdgpu:
                    fence wait failed (%d).\n", r);<br>
                        >> -                goto fail;<br>
                        >> +static void
                    gfx_v9_4_2_log_wave_assignment(struct amdgpu_device<br>
                        >> +*adev, uint32_t *wb_ptr) {<br>
                        >> +        uint32_t se, cu, simd, wave;<br>
                        >> +        uint32_t offset = 0;<br>
                        >> +        char *str;<br>
                        >> +        int size;<br>
                        >> +<br>
                        >> +        str = kmalloc(256,
                    GFP_KERNEL);<br>
                        >> +        if (!str)<br>
                        >> +                return;<br>
                        >> +<br>
                        >> +        dev_dbg(adev->dev, "wave
                    assignment:\n");<br>
                        >> +<br>
                        >> +        for (se = 0; se <
                    adev->gfx.config.max_shader_engines; se++) {<br>
                        >> +                for (cu = 0; cu <
                    CU_ID_MAX; cu++) {<br>
                        >> +                        memset(str, 0,
                    256);<br>
                        >> +                        size =
                    sprintf(str, "SE[%02d]CU[%02d]: ", se, cu);<br>
                        >> +                        for (simd = 0;
                    simd < SIMD_ID_MAX; simd++) {<br>
                        >> +                                size
                    += sprintf(str + size, "[");<br>
                        >> +                                for
                    (wave = 0; wave < WAVE_ID_MAX; wave++) {<br>
                        >>
                    +                                        size +=
                    sprintf(str + size, "%x", wb_ptr[offset]);<br>
                        >>
                    +                                        offset++;<br>
                        >> +                                }<br>
                        >> +                                size
                    += sprintf(str + size, "]  ");<br>
                        >> +                        }<br>
                        >> +                       
                    dev_dbg(adev->dev, "%s\n", str);<br>
                        >> +                }<br>
                        >>           }<br>
                        >> -fail:<br>
                        >> -        amdgpu_ib_free(adev, &ib,
                    NULL);<br>
                        >> -        dma_fence_put(f);<br>
                        >>    <br>
                        >> -        return r;<br>
                        >> +        kfree(str);<br>
                        >>    }<br>
                        >>    <br>
                        >> -int
                    gfx_v9_4_2_do_edc_gpr_workarounds(struct
                    amdgpu_device *adev)<br>
                        >> +static int
                    gfx_v9_4_2_wait_for_waves_assigned(struct
                    amdgpu_device *adev,<br>
                        >>
                    +                                             
                    uint32_t *wb_ptr, uint32_t mask,<br>
                        >>
                    +                                             
                    uint32_t pattern, uint32_t num_wave, bool wait)<br>
                        >>    {<br>
                        >> -        struct amdgpu_ring *ring =
                    &adev->gfx.compute_ring[0];<br>
                        >> -        int r;<br>
                        >> -        int compute_dim_x =
                    adev->gfx.config.max_shader_engines *<br>
                        >> -                           
                    adev->gfx.config.max_cu_per_sh *<br>
                        >> -                           
                    adev->gfx.config.max_sh_per_se;<br>
                        >> -        int sgpr_work_group_size = 5;<br>
                        >> -        /* CU_ID: 0~15, SIMD_ID: 0~3
                    */<br>
                        >> -        int wb_size =
                    adev->gfx.config.max_shader_engines * 16 * 4;<br>
                        >> -        struct amdgpu_ib ib;<br>
                        >> +        uint32_t se, cu, simd, wave;<br>
                        >> +        uint32_t loop = 0;<br>
                        >> +        uint32_t wave_cnt;<br>
                        >> +        uint32_t offset;<br>
                        >>    <br>
                        >> -        /* only support when RAS is
                    enabled */<br>
                        >> -        if
                    (!amdgpu_ras_is_supported(adev,
                    AMDGPU_RAS_BLOCK__GFX))<br>
                        >> -                return 0;<br>
                        >> +        do {<br>
                        >> +                wave_cnt = 0;<br>
                        >> +                offset = 0;<br>
                        >> +<br>
                        >> +                for (se = 0; se <
                    adev->gfx.config.max_shader_engines; se++)<br>
                        >> +                        for (cu = 0;
                    cu < CU_ID_MAX; cu++)<br>
                        >> +                                for
                    (simd = 0; simd < SIMD_ID_MAX; simd++)<br>
                        >>
                    +                                        for (wave =
                    0; wave < WAVE_ID_MAX; wave++) {<br>
                        >>
                    +                                                if
                    (((1 << wave) & mask) &&<br>
                        >>
                    +                                                   
                    (wb_ptr[offset] == pattern))<br>
                        >>
                    +                                                       
                    wave_cnt++;<br>
                        >> +<br>
                        >>
                    +                                               
                    offset++;<br>
                        >>
                    +                                        }<br>
                        >> +<br>
                        >> +                if (wave_cnt ==
                    num_wave)<br>
                        >> +                        return 0;<br>
                        >> +<br>
                        >> +                mdelay(1);<br>
                        >> +        } while (++loop < 2000
                    && wait);<br>
                        >> +<br>
                        >> +        dev_err(adev->dev, "actual
                    wave num: %d, expected wave num: %d\n",<br>
                        >> +                wave_cnt, num_wave);<br>
                        >> +<br>
                        >> +       
                    gfx_v9_4_2_log_wave_assignment(adev, wb_ptr);<br>
                        >> +<br>
                        >> +        return -EBADSLT;<br>
                        >> +}<br>
                        >> +<br>
                        >> +static int
                    gfx_v9_4_2_do_sgprs_init(struct amdgpu_device *adev)
                    {<br>
                        >> +        int r;<br>
                        >> +        int wb_size =
                    adev->gfx.config.max_shader_engines *<br>
                        >> +                         CU_ID_MAX *
                    SIMD_ID_MAX * WAVE_ID_MAX;<br>
                        >> +        struct amdgpu_ib wb_ib;<br>
                        >> +        struct amdgpu_ib disp_ibs[3];<br>
                        >> +        struct dma_fence *fences[3];<br>
                        >> +        u32 pattern[3] = { 0x1, 0x5,
                    0xa };<br>
                        >>    <br>
                        >>           /* bail if the compute ring
                    is not ready */<br>
                        >> -        if (!ring->sched.ready)<br>
                        >> +        if
                    (!adev->gfx.compute_ring[0].sched.ready ||<br>
                        >> +                
                    !adev->gfx.compute_ring[1].sched.ready)<br>
                        >>                   return 0;<br>
                        >>    <br>
                        >> -        /* allocate an indirect buffer
                    to put the commands in */<br>
                        >> -        memset(&ib, 0,
                    sizeof(ib));<br>
                        >> -        r = amdgpu_ib_get(adev, NULL,
                    wb_size * sizeof(uint32_t),<br>
                        >> -                         
                    AMDGPU_IB_POOL_DIRECT, &ib);<br>
                        >> +        /* allocate the write-back
                    buffer from IB */<br>
                        >> +        memset(&wb_ib, 0,
                    sizeof(wb_ib));<br>
                        >> +        r = amdgpu_ib_get(adev, NULL,
                    (1 + wb_size) * sizeof(uint32_t),<br>
                        >> +                         
                    AMDGPU_IB_POOL_DIRECT, &wb_ib);<br>
                        >>           if (r) {<br>
                        >> -                DRM_ERROR("amdgpu:
                    failed to get ib (%d).\n", r);<br>
                        >> +                dev_err(adev->dev,
                    "failed to get ib (%d) for wb\n", r);<br>
                        >>                   return r;<br>
                        >>           }<br>
                        >> +        memset(wb_ib.ptr, 0, (1 +
                    wb_size) * sizeof(uint32_t));<br>
                        >> +<br>
                        >> +        r =
                    gfx_v9_4_2_run_shader(adev,<br>
                        >> +                       
                    &adev->gfx.compute_ring[0],<br>
                        >> +                       
                    &disp_ibs[0],<br>
                        >> +                       
                    sgpr112_init_compute_shader_aldebaran,<br>
                        >> +                       
                    sizeof(sgpr112_init_compute_shader_aldebaran),<br>
                        >> +                       
                    sgpr112_init_regs_aldebaran,<br>
                        >> +                       
                    ARRAY_SIZE(sgpr112_init_regs_aldebaran),<br>
                        >> +                       
                    adev->gfx.cu_info.number,<br>
                        >> +                       
                    wb_ib.gpu_addr, pattern[0], &fences[0]);<br>
                        >> +        if (r) {<br>
                        >> +                dev_err(adev->dev,
                    "failed to clear first 224 sgprs\n");<br>
                        >> +                goto pro_end;<br>
                        >> +        }<br>
                        >>    <br>
                        >> -        memset(ib.ptr, 0, wb_size *
                    sizeof(uint32_t));<br>
                        >> -        r =
                    gfx_v9_4_2_run_shader(adev,
                    vgpr_init_compute_shader_aldebaran,<br>
                        >> -                                 
                    sizeof(vgpr_init_compute_shader_aldebaran),<br>
                        >> -                                 
                    vgpr_init_regs_aldebaran,<br>
                        >> -                                 
                    ARRAY_SIZE(vgpr_init_regs_aldebaran),<br>
                        >> -                                 
                    compute_dim_x * 2, ib.gpu_addr);<br>
                        >> +        r =
                    gfx_v9_4_2_wait_for_waves_assigned(adev,<br>
                        >> +                       
                    &wb_ib.ptr[1], 0b11,<br>
                        >> +                        pattern[0],<br>
                        >> +                       
                    adev->gfx.cu_info.number * SIMD_ID_MAX * 2,<br>
                        >> +                        true);<br>
                        >>           if (r) {<br>
                        >> -                dev_err(adev->dev,
                    "Init VGPRS: failed to run shader\n");<br>
                        >> -                goto failed;<br>
                        >> +                dev_err(adev->dev,
                    "wave coverage failed when clearing first 224
                    sgprs\n");<br>
                        >> +                wb_ib.ptr[0] =
                    0xdeadbeaf; /* stop waves */<br>
                        >> +                goto disp0_failed;<br>
                        >>           }<br>
                        >>    <br>
                        >> -        r =
                    gfx_v9_4_2_check_gprs_init_coverage(adev, ib.ptr);<br>
                        >> +        r =
                    gfx_v9_4_2_run_shader(adev,<br>
                        >> +                       
                    &adev->gfx.compute_ring[1],<br>
                        >> +                       
                    &disp_ibs[1],<br>
                        >> +                       
                    sgpr96_init_compute_shader_aldebaran,<br>
                        >> +                       
                    sizeof(sgpr96_init_compute_shader_aldebaran),<br>
                        >> +                       
                    sgpr96_init_regs_aldebaran,<br>
                        >> +                       
                    ARRAY_SIZE(sgpr96_init_regs_aldebaran),<br>
                        >> +                       
                    adev->gfx.cu_info.number * 2,<br>
                        >> +                       
                    wb_ib.gpu_addr, pattern[1], &fences[1]);<br>
                        >>           if (r) {<br>
                        >> -                dev_err(adev->dev,
                    "Init VGPRS: failed to cover all SIMDs\n");<br>
                        >> -                goto failed;<br>
                        >> -        } else {<br>
                        >> -                dev_info(adev->dev,
                    "Init VGPRS Successfully\n");<br>
                        >> +                dev_err(adev->dev,
                    "failed to clear next 576 sgprs\n");<br>
                        >> +                goto disp0_failed;<br>
                        >> +        }<br>
                        >> +<br>
                        >> +        r =
                    gfx_v9_4_2_wait_for_waves_assigned(adev,<br>
                        >> +                       
                    &wb_ib.ptr[1], 0b11111100,<br>
                        >> +                        pattern[1],
                    adev->gfx.cu_info.number * SIMD_ID_MAX * 6,<br>
                        >> +                        true);<br>
                        >> +        if (r) {<br>
                        >> +                dev_err(adev->dev,
                    "wave coverage failed when clearing next 576
                    sgprs\n");<br>
                        >> +                wb_ib.ptr[0] =
                    0xdeadbeaf; /* stop waves */<br>
                        >> +                goto disp1_failed;<br>
                        >>           }<br>
                        >>    <br>
                        >> -        memset(ib.ptr, 0, wb_size *
                    sizeof(uint32_t));<br>
                        >> -        r =
                    gfx_v9_4_2_run_shader(adev,
                    sgpr_init_compute_shader_aldebaran,<br>
                        >> -                                 
                    sizeof(sgpr_init_compute_shader_aldebaran),<br>
                        >> -                                 
                    sgpr1_init_regs_aldebaran,<br>
                        >> -                                 
                    ARRAY_SIZE(sgpr1_init_regs_aldebaran),<br>
                        >> -                                 
                    compute_dim_x / 2 * sgpr_work_group_size,<br>
                        >> -                                 
                    ib.gpu_addr);<br>
                        >> +        wb_ib.ptr[0] = 0xdeadbeaf; /*
                    stop waves */<br>
                        >> +<br>
                        >> +        /* wait for the GPU to finish
                    processing the IB */<br>
                        >> +        r = dma_fence_wait(fences[0],
                    false);<br>
                        >>           if (r) {<br>
                        >> -                dev_err(adev->dev,
                    "Init SGPRS Part1: failed to run shader\n");<br>
                        >> -                goto failed;<br>
                        >> +                dev_err(adev->dev,
                    "timeout to clear first 224 sgprs\n");<br>
                        >> +                goto disp1_failed;<br>
                        >>           }<br>
                        >>    <br>
                        >> -        r =
                    gfx_v9_4_2_run_shader(adev,
                    sgpr_init_compute_shader_aldebaran,<br>
                        >> -                                 
                    sizeof(sgpr_init_compute_shader_aldebaran),<br>
                        >> -                                 
                    sgpr2_init_regs_aldebaran,<br>
                        >> -                                 
                    ARRAY_SIZE(sgpr2_init_regs_aldebaran),<br>
                        >> -                                 
                    compute_dim_x / 2 * sgpr_work_group_size,<br>
                        >> -                                 
                    ib.gpu_addr);<br>
                        >> +        r = dma_fence_wait(fences[1],
                    false);<br>
                        >>           if (r) {<br>
                        >> -                dev_err(adev->dev,
                    "Init SGPRS Part2: failed to run shader\n");<br>
                        >> -                goto failed;<br>
                        >> +                dev_err(adev->dev,
                    "timeout to clear next 576 sgprs\n");<br>
                        >> +                goto disp1_failed;<br>
                        >>           }<br>
                        >>    <br>
                        >> -        r =
                    gfx_v9_4_2_check_gprs_init_coverage(adev, ib.ptr);<br>
                        >> +        memset(wb_ib.ptr, 0, (1 +
                    wb_size) * sizeof(uint32_t));<br>
                        >> +        r =
                    gfx_v9_4_2_run_shader(adev,<br>
                        >> +                       
                    &adev->gfx.compute_ring[0],<br>
                        >> +                       
                    &disp_ibs[2],<br>
                        >> +                       
                    sgpr64_init_compute_shader_aldebaran,<br>
                        >> +                       
                    sizeof(sgpr64_init_compute_shader_aldebaran),<br>
                        >> +                       
                    sgpr64_init_regs_aldebaran,<br>
                        >> +                       
                    ARRAY_SIZE(sgpr64_init_regs_aldebaran),<br>
                        >> +                       
                    adev->gfx.cu_info.number,<br>
                        >> +                       
                    wb_ib.gpu_addr, pattern[2], &fences[2]);<br>
                        >> +        if (r) {<br>
                        >> +                dev_err(adev->dev,
                    "failed to clear first 256 sgprs\n");<br>
                        >> +                goto disp1_failed;<br>
                        >> +        }<br>
                        >> +<br>
                        >> +        r = dma_fence_wait(fences[2],
                    false);<br>
                        >> +        if (r) {<br>
                        >> +                dev_err(adev->dev,
                    "timeout to clear first 256 sgprs\n");<br>
                        >> +                goto disp2_failed;<br>
                        >> +        }<br>
                        >> +<br>
                        >> +        r =
                    gfx_v9_4_2_wait_for_waves_assigned(adev,<br>
                        >> +                       
                    &wb_ib.ptr[1], 0b1111,<br>
                        >> +                        pattern[2],<br>
                        >> +                       
                    adev->gfx.cu_info.number * SIMD_ID_MAX * 4,<br>
                        >> +                        false);<br>
                        >> +        if (r) {<br>
                        >> +                dev_err(adev->dev,
                    "wave coverage failed when clear first 256
                    sgprs\n");<br>
                        >> +                goto disp2_failed;<br>
                        >> +        }<br>
                        >> +<br>
                        >> +disp2_failed:<br>
                        >> +        amdgpu_ib_free(adev,
                    &disp_ibs[2], NULL);<br>
                        >> +        dma_fence_put(fences[2]);<br>
                        >> +disp1_failed:<br>
                        >> +        amdgpu_ib_free(adev,
                    &disp_ibs[1], NULL);<br>
                        >> +        dma_fence_put(fences[1]);<br>
                        >> +disp0_failed:<br>
                        >> +        amdgpu_ib_free(adev,
                    &disp_ibs[0], NULL);<br>
                        >> +        dma_fence_put(fences[0]);<br>
                        >> +pro_end:<br>
                        >> +        amdgpu_ib_free(adev,
                    &wb_ib, NULL);<br>
                        >> +<br>
                        >>           if (r)<br>
                        >> -                dev_err(adev->dev,<br>
                        >> -                        "Init SGPRS:
                    failed to cover all SIMDs\n");<br>
                        >> +                dev_info(adev->dev,
                    "Init SGPRS Failed\n");<br>
                        >>           else<br>
                        >>                  
                    dev_info(adev->dev, "Init SGPRS Successfully\n");<br>
                        >>    <br>
                        >> -failed:<br>
                        >> -        amdgpu_ib_free(adev, &ib,
                    NULL);<br>
                        >>           return r;<br>
                        >>    }<br>
                        >>    <br>
                        >> +static int
                    gfx_v9_4_2_do_vgprs_init(struct amdgpu_device *adev)<br>
                        >> +{<br>
                        >> +        int r;<br>
                        >> +        /* CU_ID: 0~15, SIMD_ID: 0~3,
                    WAVE_ID: 0 ~ 9 */<br>
                        >> +        int wb_size =
                    adev->gfx.config.max_shader_engines *<br>
                        >> +                         CU_ID_MAX *
                    SIMD_ID_MAX * WAVE_ID_MAX;<br>
                        >> +        struct amdgpu_ib wb_ib;<br>
                        >> +        struct amdgpu_ib disp_ib;<br>
                        >> +        struct dma_fence *fence;<br>
                        >> +        u32 pattern = 0xa;<br>
                        >> +<br>
                        >> +        /* bail if the compute ring is
                    not ready */<br>
                        >> +        if
                    (!adev->gfx.compute_ring[0].sched.ready)<br>
                        >> +                return 0;<br>
                        >> +<br>
                        >> +        /* allocate the write-back
                    buffer from IB */<br>
                        >> +        memset(&wb_ib, 0,
                    sizeof(wb_ib));<br>
                        >> +        r = amdgpu_ib_get(adev, NULL,
                    (1 + wb_size) * sizeof(uint32_t),<br>
                        >> +                         
                    AMDGPU_IB_POOL_DIRECT, &wb_ib);<br>
                        >> +        if (r) {<br>
                        >> +                dev_err(adev->dev,
                    "failed to get ib (%d) for wb.\n", r);<br>
                        >> +                return r;<br>
                        >> +        }<br>
                        >> +        memset(wb_ib.ptr, 0, (1 +
                    wb_size) * sizeof(uint32_t));<br>
                        >> +<br>
                        >> +        r =
                    gfx_v9_4_2_run_shader(adev,<br>
                        >> +                       
                    &adev->gfx.compute_ring[0],<br>
                        >> +                        &disp_ib,<br>
                        >> +                       
                    vgpr_init_compute_shader_aldebaran,<br>
                        >> +                       
                    sizeof(vgpr_init_compute_shader_aldebaran),<br>
                        >> +                       
                    vgpr_init_regs_aldebaran,<br>
                        >> +                       
                    ARRAY_SIZE(vgpr_init_regs_aldebaran),<br>
                        >> +                       
                    adev->gfx.cu_info.number,<br>
                        >> +                       
                    wb_ib.gpu_addr, pattern, &fence);<br>
                        >> +        if (r) {<br>
                        >> +                dev_err(adev->dev,
                    "failed to clear vgprs\n");<br>
                        >> +                goto pro_end;<br>
                        >> +        }<br>
                        >> +<br>
                        >> +        /* wait for the GPU to finish
                    processing the IB */<br>
                        >> +        r = dma_fence_wait(fence,
                    false);<br>
                        >> +        if (r) {<br>
                        >> +                dev_err(adev->dev,
                    "timeout to clear vgprs\n");<br>
                        >> +                goto disp_failed;<br>
                        >> +        }<br>
                        >> +<br>
                        >> +        r =
                    gfx_v9_4_2_wait_for_waves_assigned(adev,<br>
                        >> +                       
                    &wb_ib.ptr[1], 0b1,<br>
                        >> +                        pattern,<br>
                        >> +                       
                    adev->gfx.cu_info.number * SIMD_ID_MAX,<br>
                        >> +                        false);<br>
                        >> +        if (r) {<br>
                        >> +                dev_err(adev->dev,
                    "failed to cover all simds when clearing vgprs\n");<br>
                        >> +                goto disp_failed;<br>
                        >> +        }<br>
                        >> +<br>
                        >> +disp_failed:<br>
                        >> +        amdgpu_ib_free(adev,
                    &disp_ib, NULL);<br>
                        >> +        dma_fence_put(fence);<br>
                        >> +pro_end:<br>
                        >> +        amdgpu_ib_free(adev,
                    &wb_ib, NULL);<br>
                        >> +<br>
                        >> +        if (r)<br>
                        >> +                dev_info(adev->dev,
                    "Init VGPRS Failed\n");<br>
                        >> +        else<br>
                        >> +                dev_info(adev->dev,
                    "Init VGPRS Successfully\n");<br>
                        >> +<br>
                        >> +        return r;<br>
                        >> +}<br>
                        >> +<br>
                        >> +int
                    gfx_v9_4_2_do_edc_gpr_workarounds(struct
                    amdgpu_device *adev)<br>
                        >> +{<br>
                        >> +        /* only support when RAS is
                    enabled */<br>
                        >> +        if
                    (!amdgpu_ras_is_supported(adev,
                    AMDGPU_RAS_BLOCK__GFX))<br>
                        >> +                return 0;<br>
                        >> +<br>
                        >> +       
                    gfx_v9_4_2_do_sgprs_init(adev);<br>
                        >> +<br>
                        >> +       
                    gfx_v9_4_2_do_vgprs_init(adev);<br>
                        >> +<br>
                        >> +        return 0;<br>
                        >> +}<br>
                        >> +<br>
                        >>    static void
                    gfx_v9_4_2_query_sq_timeout_status(struct
                    amdgpu_device *adev);<br>
                        >>    static void
                    gfx_v9_4_2_reset_sq_timeout_status(struct
                    amdgpu_device *adev);<br>
                        >>    <br>
                        >> @@ -479,8 +710,6 @@ void
                    gfx_v9_4_2_init_golden_registers(struct
                    amdgpu_device *adev,<br>
                        >>                            die_id);<br>
                        >>                   break;<br>
                        >>           }<br>
                        >> -<br>
                        >> -        return;<br>
                        >>    }<br>
                        >>    <br>
                        >>    void
                    gfx_v9_4_2_debug_trap_config_init(struct
                    amdgpu_device *adev,<br>
                        >> --<br>
                        >> 2.17.1<br>
                    <br>
                        _______________________________________________<br>
                        amd-gfx mailing list<br>
                        <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a><br>
                        <a
href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx"
                      moz-do-not-send="true">
https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a></p>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>