<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">As far as I understand, a G2LC
invalidation is still necessary before an IB executes.<br>
<br>
The problem is that the memory backing the IB could also have been
cached by earlier activity on the GFX or Compute rings.<br>
<br>
Regards,<br>
Christian.<br>
<br>
On 29.04.20 at 11:35, Liu, Monk wrote:<br>
</div>
<blockquote type="cite" cite="mid:DM5PR12MB170863A130B7FFDBC2511CE584AD0@DM5PR12MB1708.namprd12.prod.outlook.com">
<div class="WordSection1">
<p class="MsoNormal">Here is why we should always insert a “sync
mem” packet at the SDMA fence rather than before the IB is
emitted.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">By always inserting “sync mem” at the fence,
we can make sure that:<o:p></o:p></p>
<ol style="margin-top:0in" start="1" type="1">
<li class="MsoListParagraph" style="margin-left:0in;mso-list:l0 level1 lfo1">data is
really flushed to system memory before the CPU tries to read it
<o:p></o:p></li>
<li class="MsoListParagraph" style="margin-left:0in;mso-list:l0 level1 lfo1">all of the G2LC
is invalidated by “sync mem”, so the next SDMA IB
won’t read stale data from the G2LC cache
<o:p></o:p></li>
</ol>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Inserting “sync mem” before the IB
achieves only one thing: it avoids reading stale data from the
G2LC during IB execution.
<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The GFX and COMPUTE rings have the
release_mem packet, so they inherently flush and invalidate the
G2LC when a fence signals.
<o:p></o:p></p>
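<p class="MsoNormal">The two placements debated in this thread can be compared with a toy Python model. This is only a sketch under loud assumptions: a single write-back cache stands in for the G2LC, CPU accesses bypass it, and every name here (ToyGpu, sync_mem, run, the addresses) is hypothetical and does not correspond to amdgpu code or real packet semantics.</p>
<pre>
```python
# Toy model only: one write-back cache in front of memory.
# All names are hypothetical; nothing here mirrors amdgpu or the hardware.
class ToyGpu:
    def __init__(self):
        self.memory = {}    # what the CPU reads and writes
        self.cache = {}     # stand-in for the GPU L2 (G2LC in the thread)
        self.dirty = set()  # cached addresses not yet written back

    def cpu_write(self, addr, value):
        # Assumption: CPU writes bypass the GPU cache.
        self.memory[addr] = value

    def cpu_read(self, addr):
        return self.memory.get(addr)

    def gpu_read(self, addr):
        # Any engine reading through the cache fills a line on a miss.
        if addr not in self.cache:
            self.cache[addr] = self.memory.get(addr)
        return self.cache[addr]

    def gpu_write(self, addr, value):
        # Writes land in the cache and stay there until written back.
        self.cache[addr] = value
        self.dirty.add(addr)

    def sync_mem(self):
        # Write back dirty lines, then invalidate the whole cache.
        for addr in self.dirty:
            self.memory[addr] = self.cache[addr]
        self.dirty.clear()
        self.cache.clear()


def run(sync_before_ib, sync_at_fence):
    gpu = ToyGpu()
    ib_addr, result_addr = 0x100, 0x200

    gpu.cpu_write(ib_addr, "old commands")
    gpu.gpu_read(ib_addr)                   # another ring caches the buffer
    gpu.cpu_write(ib_addr, "new commands")  # CPU then fills in the IB

    if sync_before_ib:
        gpu.sync_mem()
    fetched = gpu.gpu_read(ib_addr)         # SDMA executes the IB
    gpu.gpu_write(result_addr, "copied data")
    if sync_at_fence:
        gpu.sync_mem()

    # The fence signals here and the CPU looks at the result.
    return fetched, gpu.cpu_read(result_addr)
```
</pre>
<p class="MsoNormal">In this model, run(False, True) lets the CPU see the copied data but fetches the old commands, while run(True, False) fetches the new commands but leaves the result invisible to the CPU. Only run(True, True) gets both right, which is one way to read the two positions in the thread as complementary rather than exclusive.</p>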
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">_____________________________________<o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black;background:white">Monk
Liu|GPU Virtualization Team |</span><span style="font-size:12.0pt;color:#C82613;border:none
windowtext 1.0pt;padding:0in;background:white">AMD<o:p></o:p></span></p>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1
1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b> Liu, Monk <br>
<b>Sent:</b> Wednesday, April 29, 2020 5:06 PM<br>
<b>To:</b> 'Marek Olšák' <a class="moz-txt-link-rfc2396E" href="mailto:maraeo@gmail.com">&lt;maraeo@gmail.com&gt;</a>; amd-gfx
mailing list <a class="moz-txt-link-rfc2396E" href="mailto:amd-gfx@lists.freedesktop.org">&lt;amd-gfx@lists.freedesktop.org&gt;</a>;
Koenig, Christian <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com">&lt;Christian.Koenig@amd.com&gt;</a><br>
<b>Subject:</b> RE: drm/amdgpu: invalidate L2 before SDMA
IBs (on gfx10)<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hi <a id="OWAAM709EE8E0E6054CD48698E878A98E9795" href="mailto:Christian.Koenig@amd.com" moz-do-not-send="true">
<span style="font-family:'Calibri',sans-serif;text-decoration:none">@Koenig,
Christian</span></a> &amp; Marek<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I still have some concerns regarding
Marek’s patch, correct me if I’m wrong<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I see that Marek puts an SDMA_OP_GCR_REQ before
emitting the IB, to make sure SDMA won’t read stale cache data
during IB execution.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">But that “SDMA_OP_GCR_REQ” only
invalidates/flushes the GFXHUB’s G2LC cache, right? What if the
memory is changed by MM or the CPU (outside of GFXHUB)?<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Can this “SDMA_OP_GCR_REQ” force MMHUB or
even the CPU to flush the results of their operations from their
caches to memory?<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Besides, as I understand it, the “EOP” of
the gfx ring invalidates/flushes the L2 cache when a fence
signals, so what we should do on SDMA5 is insert this
“SDMA_OP_GCR_REQ” right before the “emit_fence” of SDMA
(this is what the Windows KMD does).<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">thanks <o:p></o:p></p>
<p class="MsoNormal">_____________________________________<o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black;background:white">Monk
Liu|GPU Virtualization Team |</span><span style="font-size:12.0pt;color:#C82613;border:none windowtext
1.0pt;padding:0in;background:white">AMD<o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><b>From:</b> amd-gfx &lt;<a href="mailto:amd-gfx-bounces@lists.freedesktop.org" moz-do-not-send="true">amd-gfx-bounces@lists.freedesktop.org</a>&gt;
<b>On Behalf Of </b>Marek Olšák<br>
<b>Sent:</b> Saturday, April 25, 2020 4:52 PM<br>
<b>To:</b> amd-gfx mailing list &lt;<a href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>&gt;<br>
<b>Subject:</b> drm/amdgpu: invalidate L2 before SDMA IBs (on
gfx10)<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal">This should fix SDMA hangs on gfx10.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">Marek<o:p></o:p></p>
</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>