[Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)
Christian König
ckoenig.leichtzumerken at gmail.com
Mon Jun 21 10:16:55 UTC 2021
Am 18.06.21 um 20:45 schrieb Daniel Vetter:
> On Fri, Jun 18, 2021 at 8:02 PM Christian König
> <christian.koenig at amd.com> wrote:
>> Am 18.06.21 um 19:20 schrieb Daniel Vetter:
>> [SNIP]
>> The whole thing was introduced with this commit here:
>>
>> commit f2c24b83ae90292d315aa7ac029c6ce7929e01aa
>> Author: Maarten Lankhorst <maarten.lankhorst at canonical.com>
>> Date: Wed Apr 2 17:14:48 2014 +0200
>>
>> drm/ttm: flip the switch, and convert to dma_fence
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com>
>>
>> int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
>> ....
>> - bo->sync_obj = driver->sync_obj_ref(sync_obj);
>> + reservation_object_add_excl_fence(bo->resv, fence);
>> if (evict) {
>>
>> Maarten replaced the bo->sync_obj reference with the dma_resv exclusive
>> fence.
>>
>> This means that we need to apply the sync_obj semantic to all drivers
>> using a DMA-buf with its dma_resv object, otherwise you break imports
>> from TTM drivers.
>>
>> Since then and up till now the exclusive fence must be waited on and
>> never replaced with anything which signals before the old fence.
>>
>> Maarten and I think Thomas did that and I was always assuming that you
>> know about this design decision.
> Surprisingly I do actually know this.
>
> Still the commit you cite did _not_ change any of the rules around
> dma_buf: Importers have _no_ obligation to obey the exclusive fence,
> because the buffer is pinned. None of the work that Maarten has done
> has fundamentally changed this contract in any way.
Well I now agree that the rules around dma_resv are different than I
thought, but this change should have raised more eyebrows.
The problem is this completely broke interop with all drivers using TTM
and I think might even explain some bug reports.
I re-introduced the moving fence by adding bo->moving a few years after
the initial introduction of dma_resv, but that was just to work around
performance problems introduced by using the exclusive fence for both
use cases.
> If amdgpu (or any other ttm based driver) hands back and sgt without
> waiting for ttm_bo->moving or the exclusive fence first, then that's a
> bug we need to fix in these drivers. But if ttm based drivers did get
> this wrong, then they got this wrong both before and after the switch
> over to using dma_resv - this bug would go back all the way to Dave's
> introduction of drm_prime.c and support for that.
I'm not 100% sure, but I think before the switch to the dma_resv object
drivers just waited for the BOs to become idle and that should have
prevented this.
Anyway let's stop discussing history and move forward. Sending patches
for all affected TTM driver with CC: stable tags in a minute.
> The only thing which importers have to do is not wreak the DAG nature
> of the dma_resv fences and drop dependencies. Currently there's a
> handful of drivers which break this (introduced over the last few
> years), and I have it somewhere on my todo list to audit&fix them all.
Please give that some priority.
Ignoring the moving fence is a information leak, but messing up the DAG
gives you access to freed up memory.
> The goal with extracting dma_resv from ttm was to make implicit sync
> working and get rid of some terrible stalls on the userspace side.
> Eventually it was also the goal to make truly dynamic buffer
> reservation possible, but that took another 6 or so years to realize
> with your work. And we had to make dynamic dma-buf very much opt-in,
> because auditing all the users is very hard work and no one
> volunteered. And for dynamic dma-buf the rule is that the exclusive
> fence must _never_ be ignored, and the two drivers supporting it (mlx5
> and amdgpu) obey that.
>
> So yeah for ttm drivers dma_resv is primarily for memory management,
> with a side effect of also supporting implicit sync.
>
> For everyone else (and this includes a pile of render drivers, all the
> atomic kms drivers, v4l and I have no idea what else on top) dma_resv
> was only ever about implicit sync, and it can be ignored. And it (the
> implicit sync side) has to be ignored to be able to support vulkan
> winsys buffers correctly without stalling where we shouldn't. Also we
> have to ignore it on atomic kms side too (and depending upon whether
> writeback is supported atomic kms is perfectly capable of reading out
> any buffer passed to it).
Oh! That might actually explain some issues, but that just completely
breaks when TTM based drivers use atomic.
In other words for the first use is actually rather likely for TTM based
drivers to need to move the buffer around so that scanout is possible.
And that in turn means you need to wait for this move to finish even if
you have an explicit fence to wait for. IIRC amdgpu rolled its own
implementation of this and radeon doesn't have atomic, but nouveau is
most like broken.
So we do need a better solution for this sooner or later.
>> It's absolutely not that this is my invention, I'm just telling you how
>> it ever was.
>>
>> Anyway this means we have a seriously misunderstanding and yes now some
>> of our discussions about dynamic P2P suddenly make much more sense.
> Yeah I think at least we finally managed to get this across.
>
> Anyway I guess w/e for me now, otherwise we'll probably resort to
> throwing chairs :-) I'm dearly hoping the thunderstorms all around me
> actually get all the way to me, because it's way, way too hot here
> right now.
Well it's probably rather Dave or Linus who might start to throw chairs
at us to not getting this straight sooner.
At least the weather is getting more durable.
Cheers,
Christian.
>
> Cheers, Daniel
>
>> Regards,
>> Christian.
>>
>>
>>> _only_ when you have a dynamic importer/exporter can you assume that
>>> the dma_resv fences must actually be obeyed. That's one of the reasons
>>> why we had to make this a completely new mode (the other one was
>>> locking, but they really tie together).
>>>
>>> Wrt your problems:
>>> a) needs to be fixed in drivers exporting buffers and failing to make
>>> sure the memory is there by the time dma_buf_map_attachment returns.
>>> b) needs to be fixed in the importers, and there's quite a few of
>>> those. There's more than i915 here, which is why I think we should
>>> have the dma_resv_add_shared_exclusive helper extracted from amdgpu.
>>> Avoids hand-rolling this about 5 times (6 if we include the import
>>> ioctl from Jason).
>>>
>>> Also I've like been trying to explain this ever since the entire
>>> dynamic dma-buf thing started.
>>> -Daniel
>
More information about the dri-devel
mailing list