Tackling the indefinite/user DMA fence problem

Michel Dänzer michel.daenzer at mailbox.org
Wed May 25 13:28:41 UTC 2022


On 2022-05-25 15:05, Daniel Vetter wrote:
> On Tue, May 17, 2022 at 12:28:17PM +0200, Christian König wrote:
>> Am 09.05.22 um 16:10 schrieb Daniel Vetter:
>>> On Mon, May 09, 2022 at 08:56:41AM +0200, Christian König wrote:
>>>> Am 04.05.22 um 12:08 schrieb Daniel Vetter:
>>>>>
>>>>> If the goal is specifically atomic kms, then there's an entire can of
>>>>> worms there that I really don't want to think about, but it exists: We
>>>>> have dma_fence as out-fences from atomic commit, and that's already
>>>>> massively broken since most drivers allocate some memory or at least take
>>>>> locks which can allocate memory in their commit path. Like i2c. Putting a
>>>>> userspace memory fence as in-fence in there makes that problem
>>>>> substantially worse, since at least in theory you're just not allowed to
>>>>> might_fault in atomic_commit_tail.
>>>> Yes, that's unfortunately one of the goals as well and yes I completely
>>>> agree on the can of worms. But I think I've solved that.
>>>>
>>>> What I do in the patch set is to enforce that the out fence is a user fence
>>>> when the driver supports user in fences as well.
>>>>
>>>> Since user fences don't have the memory management dependency, drivers can
>>>> actually allocate memory or call I2C functions which take locks which have
>>>> memory allocation dependencies.
>>>>
>>>> Or do I miss some other reason why you can't fault or allocate memory in
>>>> atomic_commit_tail? At least lockdep seems to be happy about that now.
>>> The problem is a bit that this breaks the uapi already. At least if the
>>> goal is to have this all be perfectly transparent for userspace - as
>>> soon as you have multi-gpu setups going on at least.
>>
>> Question here is why do you think there is a UAPI break? We currently wait
>> in a work item already, so where exactly is the problem?
> 
> It's a bit washy, but dma_fence and hence implicit sync is supposed to
> finish in finite time. umf just doesn't.
> 
> Ofc in reality you can still flood your compositor and they're not very
> robust, but with umf it's trivial to just hang your compositor forever and
> nothing happens.

You can add that to the list of reasons why compositors need to stop using buffers with unsignaled fences. There are plenty of other reasons already (the big one being that otherwise slow clients can slow down the compositor, even if the compositor uses a high priority context and the HW supports preemption).
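
As a rough illustration of what "not using buffers with unsignaled fences" can look like on the compositor side, here is a minimal sketch. It assumes the compositor already holds a pollable fence file descriptor for the client buffer (for example a sync_file fd, which reports POLLIN once its fence has signaled). The helper name fence_fd_is_signaled is made up for this example and is not taken from any existing compositor.

```c
#include <errno.h>
#include <poll.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if the fence behind fence_fd has
 * signaled, false if it is still pending or if the query failed.
 * Assumes fence_fd becomes readable (POLLIN) when the fence signals.
 */
static bool fence_fd_is_signaled(int fence_fd)
{
    struct pollfd pfd = { .fd = fence_fd, .events = POLLIN };
    int ret;

    do {
        ret = poll(&pfd, 1, 0);   /* timeout 0: never block the compositor */
    } while (ret < 0 && errno == EINTR);

    return ret > 0 && (pfd.revents & POLLIN);
}
```

A compositor following this approach would keep presenting the client's previous buffer while the helper returns false, and add the fd to its event loop so that it picks up the new buffer once the fence signals, instead of submitting work that depends on a fence which may never signal.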


-- 
Earthling Michel Dänzer            |                  https://redhat.com
Libre software enthusiast          |         Mesa and Xwayland developer

