[PATCH] drm: gem: add an option for supporting the dma-coherent hardware.
Maxime Ripard
mripard at kernel.org
Thu Jun 8 07:39:45 UTC 2023
On Thu, Jun 08, 2023 at 01:18:38AM +0800, Sui Jingfeng wrote:
> Hi,
>
> On 2023/6/8 00:12, Paul Cercueil wrote:
> > Hi Sui,
> >
> > Le mercredi 07 juin 2023 à 22:38 +0800, Sui Jingfeng a écrit :
> > > Hi, welcome to discussion.
> > >
> > >
> > > I have limited skills in manipulating English.
> > >
> > > It may not express what I'm really means in the short time.
> > >
> > > Part of word in the sentence may not as accurate as your.
> > >
> > > Well, please don't misunderstand, I'm not doing the rude to you.
> > No problem.
> >
> > > I will explain it with more details.
> > >
> > > See below:
> > >
> > >
> > > On 2023/6/7 20:09, Paul Cercueil wrote:
> > > > Hi Sui,
> > > >
> > > > Le mercredi 07 juin 2023 à 18:30 +0800, Sui Jingfeng a écrit :
> > > > > Hi,
> > > > >
> > > > >
> > > > > On 2023/6/7 17:36, Paul Cercueil wrote:
> > > > > > Hi Sui,
> > > > > >
> > > > > > Le mercredi 07 juin 2023 à 13:30 +0800, Sui Jingfeng a écrit :
> > > > > > > The single map_noncoherent member of struct
> > > > > > > drm_gem_dma_object
> > > > > > > may
> > > > > > > not
> > > > > > > sufficient for describing the backing memory of the GEM
> > > > > > > buffer
> > > > > > > object.
> > > > > > >
> > > > > > > Especially on dma-coherent systems, the backing memory is
> > > > > > > both
> > > > > > > cached
> > > > > > > coherent for multi-core CPUs and dma-coherent for peripheral
> > > > > > > device.
> > > > > > > Say architectures like X86-64, LoongArch64, Loongson Mips64,
> > > > > > > etc.
> > > > > > >
> > > > > > > Whether a peripheral device is dma-coherent or not can be
> > > > > > > implementation-dependent. The single map_noncoherent option
> > > > > > > is
> > > > > > > not
> > > > > > > enough
> > > > > > > to reflect real hardware anymore. For example, the Loongson
> > > > > > > LS3A4000
> > > > > > > CPU
> > > > > > > and LS2K2000/LS2K1000 SoC, peripheral device of such hardware
> > > > > > > platform
> > > > > > > allways snoop CPU's cache. Doing the allocation with
> > > > > > > dma_alloc_coherent
> > > > > > > function is preferred. The return buffer is cached, it should
> > > > > > > not
> > > > > > > using
> > > > > > > the default write-combine mapping. While with the current
> > > > > > > implement,
> > > > > > > there
> > > > > > > no way to tell the drm core to reflect this.
> > > > > > >
> > > > > > > This patch adds cached and coherent members to struct
> > > > > > > drm_gem_dma_object.
> > > > > > > which allow driver implements to inform the core. Introducing
> > > > > > > new
> > > > > > > mappings
> > > > > > > while keeping the original default behavior unchanged.
> > > > > > Did you try to simply set the "dma-coherent" property to the
> > > > > > device's
> > > > > > node?
> > > > > But this approach can only be applied for the device driver with
> > > > > DT
> > > > > support.
> > > > >
> > > > > X86-64, Loongson ls3a4000 mips64, Loongson ls3a5000 CPU typically
> > > > > do
> > > > > not
> > > > > have DT support.
> > > > >
> > > > > They using ACPI to pass parameter from the firmware to Linux
> > > > > kernel.
> > > > >
> > > > > You approach will lost the effectiveness on such a case.
> > > > Well, I don't really know how ACPI handles it - but it should just
> > > > be a
> > > > matter of setting dev->dma_coherent. That's basically what the DT
> > > > code
> > > > does.
> > > >
> > > > Some MIPS boards set it in their setup code for instance.
> > > >
> > > This is a *strategy*, not a *mechanism*.
> > >
> > > In this case, DT is just used to describing the hardware.
> > >
> > > (It is actually a hardware feature describing language, the
> > > granularity
> > > is large)
> > >
> > > It does not changing the state of the hardware.
> > >
> > > It's your platform firmware or kernel setting up code who actually do
> > > such a things.
> > >
> > >
> > > It's just that it works on *one* platform, it does not guarantee it
> > > will
> > > works on others.
> > If you add the "dma-coherent" property in a device node in DT, you
> > effectively specify that the device is DMA-coherent; so you describe
> > the hardware, which is what DT is for, and you are not changing the
> > state of the hardware.
> >
> > Note that some MIPS platforms (arch/mips/alchemy/common/setup.c)
> > default to DMA-coherent mapping; I believe you could do something
> > similar with your Loongson LS3A4000 CPU and LS2K2000/LS2K1000 SoC.
> >
> The preblem is that device driver can have various demand.
>
> It probably want to create different kind of buffers for different thing
> simultaneously.
>
> Say, one allocated with dma_alloc_coherent for command buffer or dma
> descriptor
>
> another one allocated with dma_alloc_wc for uploading shader etc.
>
> also has the third one allocated with dma_alloc_noncoherent() for doing some
> else.
And it will work just fine.
struct device dma_coherent, or DT's dma-coherent property define that
the device doesn't need any kind of cache maintenance, ever. If it's
missing, we need to perform cache maintenance to keep coherency.
dma_alloc_* functions provide guarantees to the driver. With
dma_alloc_wc and dma_alloc_coherent, the buffer is coherent, and thus
you don't need to perform cache maintenance operations by hand in the
driver.
With dma_alloc_noncoherent, the buffer is non-coherent and the driver
needs to perform them when relevant.
How those buffers are created is platform specific, but the guarantees
provided *to the driver* are always there.
A buffer allocated with dma_alloc_coherent might be provided by
different means (at the hardware level with a coherency unit, by mapping
it non-cacheable), but as far as the driver is concerned it's always
going to be coherent.
Similarly, a driver using dma_alloc_noncoherent will always require
cache maintenance operations to use the API properly, even if the
hardware provides coherency (in which case, those operations will be
nop).
So, yeah, like I was saying in the other mail, it looks like you're
confusing a bunch of things. dma_alloc_* functions are about the driver
expectations and guarantees. DT's dma-coherent property is about how we
can implement them on a given platform.
They don't have to match, and that's precisely how we can have drivers
that run on any combination of platforms: the driver only cares about
the buffer guarantees, the platform description takes care of how they
are implemented.
Maxime
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 228 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20230608/8b499ac2/attachment.sig>
More information about the dri-devel
mailing list