[PATCH] drm/amdgpu: Fix Manual Execution of Cleaner Shader in Gang Submissions
Christian König
christian.koenig at amd.com
Thu Mar 27 13:50:15 UTC 2025
On 27.03.25 at 10:37, SRINIVASAN SHANMUGAM wrote:
> On 3/27/2025 2:54 PM, Christian König wrote:
>>>>> Overall this change doesn't seem to make much sense to me.
>>>>> Why exactly is isolation->spearhead not pointing to the dummy kernel job we submit?
>>>> Does the owner check or gang_submit check in
>>>> amdgpu_device_enforce_isolation() fail to set up the spearhead?
>>> I'm currently debugging exactly that.
>>>
>>> Good news is that I can reproduce the problem.
>>
>> I have to take that back. I've tested the cleaner shader functionality a bit this morning and as far as I can see this works exactly as intended.
>>
>> Srini, what exactly is your use case which doesn't work?
>
> Hi Christian, Good Morning!
>
> The use case is to trigger the cleaner shader through the sysfs entry "run_cleaner_shader", independently of "enforce_isolation", so that the cleaner shader packet gets submitted to the COMP_1.0.0 ring by default, without first enabling enforce_isolation via sysfs.
>
I've tested exactly that and it seems to work perfectly fine:
kworker/u96:1-209 [020] ..... 86.655999: amdgpu_isolation: prev=0000000000000000, next=ffffffffffffffff
kworker/u96:1-209 [020] ..... 86.656190: amdgpu_cleaner_shader: ring=gfx_0.0.0, seqno=2
<...>-11 [022] ..... 150.607688: amdgpu_isolation: prev=ffffffffffffffff, next=0000000000000000
kworker/u96:0-11 [022] ..... 150.608228: amdgpu_cleaner_shader: ring=comp_1.0.0, seqno=2
kworker/u96:0-11 [022] ..... 150.620597: amdgpu_isolation: prev=0000000000000000, next=ffffffffffffffff
kworker/u96:0-11 [022] ..... 150.620624: amdgpu_cleaner_shader: ring=gfx_0.0.0, seqno=1527
The only thing that might be confusing is that if you trigger the cleaner shader multiple times while the GPU is idle, it only runs once.
But that should be easy to change if necessary.
Regards,
Christian.
> AFAIK, the "isolation->spearhead" initialization is not taken care of in the path "amdgpu_gfx_run_cleaner_shader -> amdgpu_gfx_run_cleaner_shader_job" (i.e., when we trigger the cleaner shader using sysfs "run_cleaner_shader"). The check "&job->base.s_fence->scheduled == isolation->spearhead;" is where the problem shows up: the "&job->base.s_fence->scheduled" address does not match the "isolation->spearhead" address, so the check evaluates to zero and the cleaner shader is not emitted in "amdgpu_vm_flush()" when it is run through the "run_cleaner_shader" sysfs entry.
>
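For readers following the thread, the check quoted above is a pointer-identity comparison: it is true only when the job's scheduled fence is the very object recorded as the spearhead, not merely an equivalent one. The following is a minimal user-space sketch of that idea. All type and function names (model_fence, model_job, model_isolation, model_job_is_spearhead) are hypothetical stand-ins and not the actual amdgpu structures or code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
struct model_fence {
    unsigned int seqno;
};

struct model_job {
    /* Stands in for job->base.s_fence->scheduled. */
    struct model_fence scheduled;
};

struct model_isolation {
    /* Stands in for isolation->spearhead; NULL means "no spearhead set". */
    struct model_fence *spearhead;
};

/*
 * Identity comparison, mirroring the shape of the quoted check:
 * true only if the spearhead pointer refers to this job's own
 * scheduled fence. Two fences with equal contents do not match.
 */
static bool model_job_is_spearhead(const struct model_job *job,
                                   const struct model_isolation *isolation)
{
    return &job->scheduled == isolation->spearhead;
}
```

In this model, a job whose fence was never recorded as the spearhead (or an unset spearhead pointer) makes the comparison evaluate to false, which is the symptom described in the quoted paragraph.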
> Best regards,
>
> Srini
>
>>
>> Regards,
>> Christian.
>>
>>> Regards,
>>> Christian.