[Intel-xe] [RFC PATCH 08/10] dma-buf/dma-fence: Introduce long-running completion fences
Thomas Hellström (Intel)
thomas_os at shipmail.org
Tue Apr 4 18:14:01 UTC 2023
On 4/4/23 15:10, Christian König wrote:
> Am 04.04.23 um 14:54 schrieb Thomas Hellström:
>> Hi, Christian,
>>
>> On 4/4/23 11:09, Christian König wrote:
>>> Am 04.04.23 um 02:22 schrieb Matthew Brost:
>>>> From: Thomas Hellström <thomas.hellstrom at linux.intel.com>
>>>>
>>>> For long-running workloads, drivers either need to open-code
>>>> completion
>>>> waits, invent their own synchronization primitives, or internally use
>>>> dma-fences that do not obey the cross-driver dma-fence protocol, but
>>>> without any lockdep annotation all these approaches are error-prone.
>>>>
>>>> Since, for example, the drm scheduler uses dma-fences, it is
>>>> desirable for
>>>> a driver to be able to use it for throttling and error handling
>>>> also with
>>>> internal dma-fences that do not obey the cross-driver dma-fence
>>>> protocol.
>>>>
>>>> Introduce long-running completion fences in the form of dma-fences,
>>>> and add
>>>> lockdep annotation for them. In particular:
>>>>
>>>> * Do not allow waiting under any memory management locks.
>>>> * Do not allow to attach them to a dma-resv object.
>>>> * Introduce a new interface for adding callbacks, requiring the
>>>> helper that adds a callback to sign off on the fact that the
>>>> dma-fence may not complete anytime soon. Typically this will be
>>>> the scheduler chaining a new long-running fence on another one.
>>>
>>> Well that's pretty much what I tried before:
>>> https://lwn.net/Articles/893704/
>>>
>>> And the reasons why it was rejected haven't changed.
>>>
>>> Regards,
>>> Christian.
>>>
>> Yes, TBH this was mostly to get discussion going on how we'd best
>> tackle this problem while being able to reuse the scheduler for
>> long-running workloads.
>>
>> I couldn't see any clear decision on your series, though. One main
>> difference I see is that this is intended for driver-internal use
>> only. (I'm counting using the drm_scheduler as a helper as
>> driver-private use.) This is by no means an attempt to tackle the
>> indefinite-fence problem.
>
> Well, this was just my latest try at tackling this, but essentially the
> problems are the same as with your approach: when we express such
> operations as dma_fence there is always the chance that we leak one
> somewhere.
>
> My approach of adding a flag, noting that this operation is dangerous
> and can't be synced with something memory management depends on, tried
> to contain this as much as possible, but Daniel still pretty clearly
> rejected it (for good reasons, I think).
>
>>
>> We could of course invent a completely different data type that
>> abstracts the synchronization the scheduler needs in the long-running
>> case, or each driver could hack something up, like sleeping in the
>> prepare_job() or run_job() callback for throttling, but those waits
>> should still be annotated one way or another (and probably in a
>> similar way across drivers) to make sure we don't do anything bad.
>>
>> So any suggestions as to what would be the better solution here
>> would be appreciated.
>
> Mhm, do we really need the GPU scheduler for that?
>
> I mean, in the 1-to-1 case you basically just need a component which
> collects the dependencies as dma_fences and, once all of them are
> fulfilled, schedules a work item.
>
> As long as the work item itself doesn't produce a dma_fence, it can
> then still just wait for other non-dma_fence dependencies.
>
> Then the work function could submit the work and wait for the result.
>
> The work item would then pretty much represent what you want: you can
> wait for it to finish and pass it along as a long-running dependency.
>
> Maybe give it a funky name and wrap it up in a structure, but that's
> basically it.
>
This very much sounds like an i915_sw_fence for the dependency tracking
and a dma_fence_work for the actual work, although its completion fence
is a dma_fence.
That goes against the whole idea, though, that a condition for merging
the xe driver would be that we implement some sort of minimal
scaffolding for long-running workloads in the drm scheduler; the
thinking behind that is to avoid implementing Intel-specific solutions
like those...
Thanks,
Thomas
> Regards,
> Christian.
>
>>
>> Thanks,
>>
>> Thomas
>>