[Intel-gfx] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

Thomas Hellström thomas.hellstrom at linux.intel.com
Tue Nov 30 19:27:49 UTC 2021


On 11/30/21 19:12, Thomas Hellström wrote:
> On Tue, 2021-11-30 at 16:02 +0100, Christian König wrote:
>> Am 30.11.21 um 15:35 schrieb Thomas Hellström:
>>> On Tue, 2021-11-30 at 14:26 +0100, Christian König wrote:
>>>> Am 30.11.21 um 13:56 schrieb Thomas Hellström:
>>>>> On 11/30/21 13:42, Christian König wrote:
>>>>>> Am 30.11.21 um 13:31 schrieb Thomas Hellström:
>>>>>>> [SNIP]
>>>>>>>> Other than that, I didn't investigate the nesting fails
>>>>>>>> enough to
>>>>>>>> say I can accurately review this. :)
>>>>>>> Basically the problem is that within enable_signaling()
>>>>>>> which
>>>>>>> is
>>>>>>> called with the dma_fence lock held, we take the dma_fence
>>>>>>> lock
>>>>>>> of
>>>>>>> another fence. If that other fence is a dma_fence_array, or
>>>>>>> a
>>>>>>> dma_fence_chain which in turn tries to lock a
>>>>>>> dma_fence_array
>>>>>>> we hit
>>>>>>> a splat.
>>>>>> Yeah, I already thought that you constructed something like
>>>>>> that.
>>>>>>
>>>>>> You get the splat because what you do here is illegal, you
>>>>>> can't
>>>>>> mix
>>>>>> dma_fence_array and dma_fence_chain like this or you can end
>>>>>> up
>>>>>> in a
>>>>>> stack corruption.
>>>>> Hmm. Ok, so what is the stack corruption, is it that the
>>>>> enable_signaling() will end up with endless recursion? If so,
>>>>> wouldn't
>>>>> it be more usable we break that recursion chain and allow a
>>>>> more
>>>>> general use?
>>>> The problem is that this is not easily possible for
>>>> dma_fence_array
>>>> containers. Just imagine that you drop the last reference to the
>>>> containing fences during dma_fence_array destruction if any of
>>>> the
>>>> contained fences is another container you can easily run into
>>>> recursion
>>>> and with that stack corruption.
>>> Indeed, that would require some deeper surgery.
>>>
>>>> That's one of the major reasons I came up with the
>>>> dma_fence_chain
>>>> container. This one you can chain any number of elements together
>>>> without running into any recursion.
>>>>
>>>>> Also what are the mixing rules between these? Never use a
>>>>> dma-fence-chain as one of the array fences and never use a
>>>>> dma-fence-array as a dma-fence-chain fence?
>>>> You can't add any other container to a dma_fence_array, neither
>>>> other
>>>> dma_fence_array instances nor dma_fence_chain instances.
>>>>
>>>> IIRC at least technically a dma_fence_chain can contain a
>>>> dma_fence_array if you absolutely need that, but Daniel, Jason
>>>> and I
>>>> already had the same discussion a while back and came to the
>>>> conclusion
>>>> to avoid that as well if possible.
>>> Yes, this is actually the use-case. But what I can't easily
>>> guarantee
>>> is that that dma_fence_chain isn't fed into a dma_fence_array
>>> somewhere
>>> else. How do you typically avoid that?
>>>
>>> Meanwhile I guess I need to take a different approach in the driver
>>> to
>>> avoid this altogether.
>> Jason and I came up with a deep dive iterator for his use case, but I
>> think we don't want to use that any more after my dma_resv rework.
>>
>> In other words when you need to create a new dma_fence_array you
>> flatten
>> out the existing construct which is at worst case
>> dma_fence_chain->dma_fence_array->dma_fence.
> Ok, Are there any cross-driver contract here, Like every driver using a
> dma_fence_array need to check for dma_fence_chain and flatten like
> above?
>
> /Thomas

Oh, and a follow up question:

If there was a way to break the recursion on final put() (using the same 
basic approach as patch 2 in this series uses to break recursion in 
enable_signaling()), so that none of these containers did require any 
special treatment, would it be worth pursuing? I guess it might be 
possible by having the callbacks drop the references rather than the 
loop in the final put. + a couple of changes in code iterating over the 
fence pointers.

/Thomas

>
>> Regards,
>> Christian.
>>
>>> /Thomas
>>>
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> /Thomas
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>> But I'll update the commit message with a typical splat.
>>>>>>>
>>>>>>> /Thomas


More information about the Intel-gfx mailing list