<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
On 27.03.25 at 10:37, SRINIVASAN SHANMUGAM wrote:<br>
<blockquote type="cite" cite="mid:5a04ac1b-6b83-40c4-b9f1-ca42bd53763c@amd.com">
On 3/27/2025 2:54 PM, Christian König wrote:
<blockquote type="cite" cite="mid:fc461f19-44b8-4699-b3e6-c37e1b7dc76f@amd.com">
<blockquote type="cite" cite="mid:740940f4-055b-483b-88b7-072907539167@amd.com">
<blockquote type="cite">
<blockquote type="cite"> <span style="white-space: pre-wrap">Overall, this change doesn't seem to make much sense to me.</span>
<pre class="moz-quote-pre" wrap="">Why exactly is isolation->spearhead not pointing to the dummy kernel job we submit?
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">Does the owner check or gang_submit check in
amdgpu_device_enforce_isolation() fail to set up the spearhead?
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">I'm currently debugging exactly that.
Good news is that I can reproduce the problem.</pre>
</blockquote>
<br>
I have to take that back. I've tested the cleaner shader
functionality a bit this morning and as far as I can see this
works exactly as intended.<br>
<br>
Srini, what exactly is your use case which doesn't work?<br>
</blockquote>
<p>Hi Christian, Good Morning!</p>
<p>The use case is to trigger the cleaner shader through the sysfs
"run_cleaner_shader" entry, independent of "enforce_isolation", so
that the cleaner shader packet gets submitted to the COMP_1.0.0 ring
by default, without first enabling enforce_isolation via sysfs.<br>
</p>
</blockquote>
<br>
I've tested exactly that and it seems to work perfectly fine:<br>
kworker/u96:1-209 [020] ..... 86.655999: amdgpu_isolation:
prev=0000000000000000, next=ffffffffffffffff<br>
kworker/u96:1-209 [020] ..... 86.656190:
amdgpu_cleaner_shader: ring=gfx_0.0.0, seqno=2<br>
&lt;...&gt;-11 [022] ..... 150.607688:
amdgpu_isolation: prev=ffffffffffffffff, next=0000000000000000<br>
kworker/u96:0-11 [022] ..... 150.608228:
amdgpu_cleaner_shader: ring=comp_1.0.0, seqno=2<br>
kworker/u96:0-11 [022] ..... 150.620597: amdgpu_isolation:
prev=0000000000000000, next=ffffffffffffffff<br>
kworker/u96:0-11 [022] ..... 150.620624:
amdgpu_cleaner_shader: ring=gfx_0.0.0, seqno=1527<br>
<br>
<br>
The only thing that might be confusing is that if you issue the
cleaner shader multiple times while the GPU is idle, it only runs
once.<br>
<br>
But that should be easy to change if necessary.<br>
<br>
Regards,<br>
Christian.<br>
<br>
<blockquote type="cite" cite="mid:5a04ac1b-6b83-40c4-b9f1-ca42bd53763c@amd.com">
<p>AFAIK, "isolation->spearhead" is not initialized in the path
<strong>"amdgpu_gfx_run_cleaner_shader ->
amdgpu_gfx_run_cleaner_shader_job"</strong>, i.e. when the cleaner
shader is triggered through the "run_cleaner_shader" sysfs entry.
As a result, the check <strong>"&amp;job->base.s_fence->scheduled
== isolation->spearhead;"</strong> is the problem: the address of
<strong>"&amp;job->base.s_fence->scheduled"</strong> does not match
the address in <strong>"isolation->spearhead"</strong>, so the
comparison evaluates to zero and "amdgpu_vm_flush()" fails to emit
the cleaner shader.<br>
</p>
<p>Best regards,</p>
<p>Srini<br>
</p>
<blockquote type="cite" cite="mid:fc461f19-44b8-4699-b3e6-c37e1b7dc76f@amd.com"> <br>
Regards,<br>
Christian.<br>
<br>
<blockquote type="cite" cite="mid:740940f4-055b-483b-88b7-072907539167@amd.com">
<pre class="moz-quote-pre" wrap="">Regards,
Christian.
</pre>
</blockquote>
</blockquote>
</blockquote>
<br>
</body>
</html>