<html> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> </head> <body> <div dir="auto"> <div>Hi Monk, <div dir="auto"><br> </div> <div dir="auto">Because work executing in parallel could load data into the GL2C in the meantime.</div> <div dir="auto"><br> </div> <div dir="auto">You need to insert cache invalidations before you start reading something that another engine has written.</div> <div dir="auto"><br> </div> <div dir="auto">And you need cache flushes to make sure that something your engine has written has reached memory before you signal that execution has finished.</div> <div dir="auto"><br> </div> <div dir="auto">What Marek is doing here is perfectly normal cache handling.</div> <div dir="auto"><br> </div> <div dir="auto">Regards,</div> <div dir="auto">Christian.</div> <div class="gmail_extra"><br> <div class="gmail_quote">On 29.04.2020 13:24, "Liu, Monk" &lt;Monk.Liu@amd.com&gt; wrote:<br type="attribution"> <blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <div> <div> <p>&gt;&gt; Well, from my understanding, I think that a GL2C invalidation is still necessary before an IB executes.</p> <p>Agreed. I think the only thing we need on the GL2C before an IB executes is the invalidation, not the flush.</p> <p><br> &gt;&gt; The problem is that the memory of the IB could also be cached because of some activity of the GFX or Compute rings.</p> <p>If we always insert a GL2C invalidate at the EOP of every IB from every engine, why do we need a GL2C invalidate before the IB executes?</p> <div> <p>_____________________________________</p> <p><span style="font-size:12pt;background:white">Monk Liu|GPU Virtualization Team |</span><span style="font-size:12pt;color:#c82613;border:none 1pt;padding:0in;background:white">AMD</span></p> <p><img src="cid:image001.png@01D61E5B.D40DA6D0" alt="sig-cloud-gpu" style="width:0.8333in;height:0.8333in" height="80" width="80"></p> </div> <p> </p> <div> <div style="border:none;border-top:solid #e1e1e1 1pt;padding:3pt 0in
0in 0in"> <p><b>From:</b> Koenig, Christian &lt;Christian.Koenig@amd.com&gt; <br> <b>Sent:</b> Wednesday, April 29, 2020 5:38 PM<br> <b>To:</b> Liu, Monk &lt;Monk.Liu@amd.com&gt;; Marek Olšák &lt;maraeo@gmail.com&gt;; amd-gfx mailing list &lt;amd-gfx@lists.freedesktop.org&gt;<br> <b>Subject:</b> Re: drm/amdgpu: invalidate L2 before SDMA IBs (on gfx10)</p> </div> </div> <p> </p> <div> <p>Well, from my understanding, I think that a GL2C invalidation is still necessary before an IB executes.<br> <br> The problem is that the memory of the IB could also be cached because of some activity of the GFX or Compute rings.<br> <br> Regards,<br> Christian.<br> <br> On 29.04.20 at 11:35, Liu, Monk wrote:</p> </div> <blockquote style="margin-top:5pt;margin-bottom:5pt"> <p>Here is the reason we should always insert a “sync mem” packet at the FENCE location of SDMA, not before the IB is emitted.</p> <p> </p> <p>By always inserting “sync mem” at the FENCE location we can make sure that:</p> <ol start="1" type="1" style="margin-top:0in"> <li style="margin-left:0in">data is really flushed to system memory before the CPU tries to read it </li><li style="margin-left:0in">the whole GL2C is invalidated by “sync mem”, so the next SDMA IB won’t get stale data from the GL2C </li></ol> <p> </p> <p>Inserting “sync mem” before the IB can only achieve one thing: avoiding stale data in the GL2C during IB execution.</p> <p> </p> <p>The GFX/COMPUTE rings have the release_mem packet, so they inherently do the GL2C flush and invalidate when a fence is signaled.</p> <p> </p> <p> </p> <div> <div style="border:none;border-top:solid #e1e1e1 1pt;padding:3pt
0in 0in 0in"> <p><b>From:</b> Liu, Monk <br> <b>Sent:</b> Wednesday, April 29, 2020 5:06 PM<br> <b>To:</b> 'Marek Olšák' <a href="mailto:maraeo@gmail.com">&lt;maraeo@gmail.com&gt;</a>; amd-gfx mailing list <a href="mailto:amd-gfx@lists.freedesktop.org">&lt;amd-gfx@lists.freedesktop.org&gt;</a>; Koenig, Christian <a href="mailto:Christian.Koenig@amd.com">&lt;Christian.Koenig@amd.com&gt;</a><br> <b>Subject:</b> RE: drm/amdgpu: invalidate L2 before SDMA IBs (on gfx10)</p> </div> </div> <p> </p> <p>Hi <a href="mailto:Christian.Koenig@amd.com"><span style="font-family:'calibri' , sans-serif;text-decoration:none">@Koenig, Christian</span></a> &amp; Marek</p> <p> </p> <p>I still have some concerns regarding Marek’s patch; correct me if I’m wrong.</p> <p> </p> <p>I see that Marek puts an SDMA_OP_GCR_REQ before emitting the IB to make sure SDMA won’t get stale cached data during IB execution.</p> <p> </p> <p>But that “SDMA_OP_GCR_REQ” only invalidates/flushes the GFXHUB’s GL2C, right? What if the memory is changed by MM or the CPU (outside of the GFXHUB)?</p> <p> </p> <p>Can this “SDMA_OP_GCR_REQ” force the MMHUB or even the CPU to flush the results of their operations from their caches to memory?</p> <p> </p> <p>Besides, as I understand it, the “EOP” of the gfx ring does the L2 cache “invalidate/flush” when a fence is signaled, so what we should do on SDMA5 is insert this “SDMA_OP_GCR_REQ” right before the “emit_fence” of SDMA (this is what the Windows KMD does).</p> <p> </p> <p>Thanks</p> <p> </p> <p><b>From:</b> amd-gfx &lt;<a href="mailto:amd-gfx-bounces@lists.freedesktop.org">amd-gfx-bounces@lists.freedesktop.org</a>&gt; <b>On
Behalf Of </b>Marek Olšák<br> <b>Sent:</b> Saturday, April 25, 2020 4:52 PM<br> <b>To:</b> amd-gfx mailing list &lt;<a href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>&gt;<br> <b>Subject:</b> drm/amdgpu: invalidate L2 before SDMA IBs (on gfx10)</p> <p> </p> <div> <div> <p>This should fix SDMA hangs on gfx10.</p> </div> <div> <p> </p> </div> <div> <p>Marek</p> </div> </div> </blockquote> <p> </p> </div> </div> </blockquote> </div> <br> </div> </div> </div> </body> </html>