[Intel-gfx] [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

Fri Dec 3 14:26:34 UTC 2021

[Adding Daniel here as well]

On 03.12.21 at 15:18, Thomas Hellström wrote:
> [SNIP]
>> Well that's ok as well. My question is why does this single dma_fence
>> then show up in the dma_fence_chain representing the whole
>> migration?
> What we'd like to happen during eviction is that we
>
> 1) await any exclusive- or moving fences, then schedule the migration
> blit. The blit manages its own GPU ptes. Results in a single fence.
> 2) Schedule unbind of any gpu vmas, resulting possibly in multiple
> fences.
> 3) Most but not all of the remaining resv shared fences will have been
> finished in 2). We can't easily tell which, so we have a couple of shared
> fences left.

Stop, wait a second here. We are going a bit in circles.

Before you migrate a buffer, you *MUST* wait for all shared fences to 
complete. This is documented mandatory DMA-buf behavior.

Daniel and I have discussed that quite extensively in the last few months.

So how come you do the blit before all shared fences have
completed?
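
To make the rule above concrete, here is a minimal userspace sketch in C. It is a model only: `struct model_fence`, `struct model_resv` and both functions are hypothetical names made up for illustration and are not the kernel's dma_resv or dma_fence API. Whether the wait is a blocking wait or a scheduler dependency of the blit is deliberately left out of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; NOT the kernel's struct dma_fence. */
struct model_fence {
	bool signaled;
};

/* Hypothetical model of a reservation object: one exclusive slot
 * plus an array of shared slots. */
struct model_resv {
	const struct model_fence *exclusive;     /* may be NULL */
	const struct model_fence *const *shared; /* shared_count entries */
	size_t shared_count;
};

/* Returns true only when every fence, exclusive and shared, has signaled. */
static bool model_resv_all_signaled(const struct model_resv *resv)
{
	if (resv->exclusive && !resv->exclusive->signaled)
		return false;

	for (size_t i = 0; i < resv->shared_count; i++)
		if (!resv->shared[i]->signaled)
			return false;

	return true;
}

/* In this model the migration blit may only start once the buffer is
 * fully idle, i.e. no shared fence is still pending. */
static bool model_may_schedule_blit(const struct model_resv *resv)
{
	return model_resv_all_signaled(resv);
}
```

With one exclusive fence signaled and one of two shared fences still pending, `model_may_schedule_blit()` returns false; it only returns true once the last shared fence has signaled as well.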

> 4) Add all fences resulting from 1) 2) and 3) into the per-memory-type
> dma-fence-chain.
> 5) hand the resulting dma-fence-chain representing the end of migration
> over to ttm's resource manager.
>
> Now this means we have a dma-fence-chain disguised as a dma-fence out
> in the wild, and it could in theory reappear as a 3) fence for another
> migration unless a very careful audit is done, or as an input to the
> dma-fence-array used for that single dependency.
>
>> That somehow doesn't seem to make sense because each individual step of
>> the migration needs to wait for those dependencies as well even when it
>> runs in parallel.
>>
>>> But that's not really the point, the point was that an (at least to
>>> me) seemingly harmless usage pattern, be it real or fictitious, ends up
>>> giving you severe internal- or cross-driver headaches.
>> Yeah, we probably should document that better. But in general I don't
>> see much reason to allow mixing containers. The dma_fence_array and
>> dma_fence_chain objects have some distinct use cases, and using them
>> to build up larger dependency structures sounds really questionable.
> Yes, I tend to agree to some extent here. Perhaps add warnings when
> adding a chain or array as an input to an array and when accidentally
> joining chains, and provide helpers for flattening if needed.

Yeah, that's probably a really good idea. Going to put it on my todo list.
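
As a rough illustration of what such a warning and a flattening helper could look like, here is a userspace sketch in C. Everything in it is an assumption for illustration: `struct model_fence`, `model_array_input_ok()` and `model_flatten()` are hypothetical and do not exist in the kernel, and a real helper would likely want to avoid the recursion used here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical fence kinds; the model only distinguishes containers
 * from plain fences. */
enum model_kind { MODEL_PLAIN, MODEL_ARRAY, MODEL_CHAIN };

/* Hypothetical userspace model; NOT the kernel's struct dma_fence. */
struct model_fence {
	enum model_kind kind;
	struct model_fence **children; /* members, for containers only */
	size_t num_children;
};

static bool model_is_container(const struct model_fence *f)
{
	return f->kind != MODEL_PLAIN;
}

/* Warn and refuse when a container is offered as an array input. */
static bool model_array_input_ok(const struct model_fence *f)
{
	if (model_is_container(f)) {
		fprintf(stderr, "warning: container fence used as array input\n");
		return false;
	}
	return true;
}

/* Append every plain fence reachable from f to out[], in order.
 * *n is the current fill level, cap the capacity of out[].
 * Returns false if out[] is too small. */
static bool model_flatten(struct model_fence *f, struct model_fence **out,
			  size_t *n, size_t cap)
{
	if (!model_is_container(f)) {
		if (*n == cap)
			return false;
		out[(*n)++] = f;
		return true;
	}

	for (size_t i = 0; i < f->num_children; i++)
		if (!model_flatten(f->children[i], out, n, cap))
			return false;

	return true;
}
```

A caller would flatten first and then build the array from the plain fences only, so that no container ever ends up nested inside another one.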

Thanks,
Christian.
