[Linaro-mm-sig] Re: [PATCH v5 1/6] dma-buf: Add dma_buf_{begin,end}_access()

Christian König christian.koenig at amd.com
Tue Jan 30 09:23:03 UTC 2024


Am 30.01.24 um 10:01 schrieb Daniel Vetter:
> On Fri, Jan 26, 2024 at 05:42:50PM +0100, Christian König wrote:
>> [SNIP]
>> Well I think we should have some solution, but I'm not sure if it should be
>> part of DMA-buf.
>>
>> Essentially a DMA-buf exports the buffers as he uses it and the importer (or
>> the DMA-buf subsystem) is then the one who says ok I can use this or I can't
>> use this or I need to call extra functions to use this or whatever.
>>
>> It's not the job of the exporter to provide the coherency for the importer,
>> cause otherwise we would have a lot of code in the exporter which can only
>> be tested when you have the right importer around. And I strongly think that
>> this is a no-go for having a reliable solution.
> The trouble is, that if you have other memory than stuff allocated by the
> dma-api or mapped using the dma-api, then by necessity the exporter has to
> deal with this.

Yes, I was thinking about that as well.

> Which is the exact same reason we also force the exporters to deal with
> the cpu cache flushing - you're argument that it's not great to replicate
> this everywhere holds there equally.

And I'm not really happy with that either.

> The other thing is that right now the exporter is the only one who
> actually knows what kind of dma coherency rules apply for a certain piece
> of memory. E.g. on i915-gem even if it's dma_map_sg mapped the underlying
> i915-gem buffer might be non-coherent, and i915-gem makes it all work by
> doing the appropriate amount of clflush.

Yeah, exactly that's the reason why I think that this stuff doesn't 
belong into exporters/drivers.

Looking at what kind of hacks and workarounds we have in both amdgpu as 
well as i915 it's pretty clear that we need to improve this design somehow.

> Similar funky things happen in other cases.
>
> So unless we add an interface which allows importers to figure out how
> much flushing is needed, currently the exporter is the only one who knows
> (because it can inspect the struct device at dma_buf_attach time).
>
> We could flip this around, but it would be a rather serious depature from
> the dma-buf design approach thus far.

Well clients already give the DMA-direction to exporters when creating 
the mapping and get an appropriate sg_table in return.

All we need to do is getting the information what flushing is needed 
into the object returned here so that the DMA API can work with it.

Christoph Hellwig pretty much nailed it when he said that the problem 
with the sg_table is that it mixes input and output parameters of the 
DMA-API.

I would extend that and say that we need a mapping object the DMA-API 
can work with so that it can know what needs to be done when devices 
request that data is made coherent between them or the CPU.

>> That's why I think the approach of having DMA-buf callbacks is most likely
>> the wrong thing to do.
>>
>> What should happen instead is that the DMA subsystem provides functionality
>> which to devices which don't support coherency through it's connection to
>> say I want to access this data, please make sure to flush the appropriate
>> catches. But that's just a very very rough design idea.
>>
>> This will become more with CXL at the horizon I think.
> Yeah CXL will make this all even more fun, but we are firmly there already
> with devices deciding per-buffer (or sometimes even per-access with
> intel's MOCS stuff) what coherency mode to use for a buffer.
>
> Also arm soc generally have both coherent and non-coherent device
> interconnects, and I think some devices can switch with runtime flags too
> which mode they use for a specific transition.
>
> CXL just extends this to pcie devices.
>
> So the mess is here, how do we deal with it?

I would say we start with the DMA-API by getting away from sg_tables to 
something cleaner and state oriented.

>
> My take is that the opt-in callback addition is far from great, but it's
> in line with how we extended dma-buf the past decade plus too. So unless
> someone's volunteering to pour some serious time into re-engineering this
> all (including testing all the different device/driver<->device/driver
> interactions) I think there's only really one other option: To not support
> these cases at all. And I don't really like that, because it means people
> will hack together something even worse in their drivers.
>
> By adding it to dma-buf it'll stare us in our faces at least :-/

Yeah, it's the way of the least resistance. But with CXL at the horizon 
and more and more drivers using it I think it's predictable that this 
will sooner or later blow up.

Cheers,
Christian.

>
> Cheers, Sima
>
>> Regards,
>> Christian.
>>
>>> Cheers, Sima
>> _______________________________________________
>> Linaro-mm-sig mailing list --linaro-mm-sig at lists.linaro.org
>> To unsubscribe send an email tolinaro-mm-sig-leave at lists.linaro.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20240130/115fa0ef/attachment.htm>


More information about the dri-devel mailing list