[PATCH 2/2] drm/nouveau: Don't signal when killing the fence context

Christian König christian.koenig at amd.com
Thu May 22 13:15:52 UTC 2025


On 5/22/25 14:57, Tvrtko Ursulin wrote:
> 
> On 22/05/2025 13:34, Christian König wrote:
>> On 5/22/25 14:20, Philipp Stanner wrote:
>>> On Thu, 2025-05-22 at 14:06 +0200, Christian König wrote:
>>>> On 5/22/25 13:25, Philipp Stanner wrote:
>>>>> dma_fence_is_signaled_locked(), which is used in
>>>>> nouveau_fence_context_kill(), can signal fences below the surface
>>>>> through a callback.
>>>>>
>>>>> There is neither need for nor use in doing that when killing a fence
>>>>> context.
>>>>>
>>>>> Replace dma_fence_is_signaled_locked() with __dma_fence_is_signaled(),
>>>>> a function which only checks, never signals.
>>>>
>>>> That is not a good approach.
>>>>
>>>> Having __dma_fence_is_signaled() means that others would be
>>>> allowed to call it as well.
>>>>
>>>> But nouveau can do that here only because it knows that the fence was
>>>> issued by nouveau.
>>>>
>>>> What nouveau can do is test the signaled flag directly, but that's
>>>> what you are trying to avoid as well.
>>>
>>> There are many parties that check the bit already.
>>>
>>> And if Nouveau is allowed to do that, one can just as well provide a
>>> wrapper for it.
>>
>> No, that is exactly what is usually avoided in cases like this.
>>
>> See, all the functions inside include/linux/dma-fence.h can be used by everybody. It's basically the public interface of the dma_fence object.
>>
>> So testing whether a fence is signaled without calling the callback is only allowed for whoever implemented the fence.
>>
>> In other words, nouveau can test nouveau fences, i915 can test i915 fences, amdgpu can test amdgpu fences, etc. But having the wrapper would make it officially allowed for nouveau to start testing i915 fences, and that would be problematic.
> 
> But why? Take scheduler dependencies, for example: why couldn't the scheduler ignore them at add time, when it can before trying to install a callback on them, and why does it instead have to opportunistically signal someone else's fences?

We had cases where people tested the signaling status from time to time and were then surprised that the fence never signaled.
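
To illustrate, here is a rough userspace sketch, not kernel code. All mock_* names are made up for this example, and it assumes a fence whose completion is only noticed when somebody asks the backend through the ->signaled callback. mock_fence_is_signaled() models what dma_fence_is_signaled_locked() does; mock_fence_test_signaled_flag() models a check that only looks at the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only. Every mock_* name is hypothetical, not kernel API. */
struct mock_fence;

struct mock_fence_ops {
	/* Optional: ask the backend whether the work has completed. */
	bool (*signaled)(struct mock_fence *f);
};

struct mock_fence {
	const struct mock_fence_ops *ops;
	bool signaled_bit;            /* stands in for DMA_FENCE_FLAG_SIGNALED_BIT */
	unsigned int seqno;           /* sequence number this fence waits for */
	const unsigned int *hw_seqno; /* last sequence number the "hardware" finished */
};

/* Pure check: reads the flag and never signals. */
static inline bool mock_fence_test_signaled_flag(const struct mock_fence *f)
{
	return f->signaled_bit;
}

/* Models dma_fence_is_signaled_locked(): may signal as a side effect. */
static inline bool mock_fence_is_signaled(struct mock_fence *f)
{
	if (f->signaled_bit)
		return true;

	if (f->ops->signaled && f->ops->signaled(f)) {
		f->signaled_bit = true; /* stands in for dma_fence_signal_locked() */
		return true;
	}

	return false;
}

/* Backend callback: done once the hardware counter reached our seqno.
 * Simplified: a real driver would also have to handle seqno wraparound. */
static inline bool mock_hw_signaled(struct mock_fence *f)
{
	return *f->hw_seqno >= f->seqno;
}

const struct mock_fence_ops mock_ops = { .signaled = mock_hw_signaled };
```

With such a fence, a caller that polls only mock_fence_test_signaled_flag() keeps seeing false even after the hardware counter has advanced, until somebody else goes through mock_fence_is_signaled().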

> I don't see it. But even if there is a reason, the advantage of the helper is that it can document this in a centralised place.

Yeah, that is basically the only argument I can see which speaks in favor of that approach.

Regards,
Christian. 

> 
> Regards,
> 
> Tvrtko
> 
>>> That has the advantage of centralizing the responsibility and
>>> documenting it.
>>>
>>> P.
>>>
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>
>>>>> Signed-off-by: Philipp Stanner <phasta at kernel.org>
>>>>> ---
>>>>>   drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +-
>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
>>>>> index d5654e26d5bc..993b3dcb5db0 100644
>>>>> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
>>>>> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
>>>>> @@ -88,7 +88,7 @@ nouveau_fence_context_kill(struct nouveau_fence_chan *fctx, int error)
>>>>>  	spin_lock_irqsave(&fctx->lock, flags);
>>>>>  	list_for_each_entry_safe(fence, tmp, &fctx->pending, head) {
>>>>> -		if (error && !dma_fence_is_signaled_locked(&fence->base))
>>>>> +		if (error && !__dma_fence_is_signaled(&fence->base))
>>>>>  			dma_fence_set_error(&fence->base, error);
>>>>>  		if (nouveau_fence_signal(fence))
>>>>
>>>
>>
> 


