[RFC PATCH] drm: disable WC optimization for cache coherent devices on non-x86

Ard Biesheuvel ard.biesheuvel at linaro.org
Mon Jan 21 16:14:37 UTC 2019

On Mon, 21 Jan 2019 at 16:59, Christoph Hellwig <hch at infradead.org> wrote:
> On Mon, Jan 21, 2019 at 04:33:27PM +0100, Ard Biesheuvel wrote:
> > On Mon, 21 Jan 2019 at 16:07, Christoph Hellwig <hch at infradead.org> wrote:
> > >
> > > > +#include <linux/dma-noncoherent.h>
> > >
> > > This header is not for usage in device drivers, but purely for
> > > dma-mapping implementations!
> > >
> >
> > Is that documented anywhere?
> I'll add big fat comments.  But the fact that nothing is exported
> there should be a fairly big hint.

I don't follow. How do other header files 'export' things in a way
that this header doesn't?

> > > And even if something like this was valid to do, it would have to be
> > > a core function with an arch hook, and not hidden in a random driver.
> >
> > Why would it not be valid to do? Is it wrong for a driver to be aware
> > of whether a device is cache coherent or not?
> >
> > And in case it isn't, do you have an alternative suggestion on how to
> > fix this mess?
> For the write combine mappings we need a proper core API for how instances
> can advertise the support.  One thing I want to do fairly soon is
> error checking of the attrs argument to dma_alloc_attrs - so if you
> pass in something unsupported it will give you back an error.

That sounds useful.
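
For illustration only, the kind of attrs checking described above could
look roughly like the user-space sketch below. It is not kernel code: the
attribute bits, struct dma_impl and check_attrs() are all hypothetical
names made up for this example, and it assumes each DMA implementation
can describe what it supports as a simple bitmask.

```c
#include <errno.h>

/* Hypothetical attribute bits; these are NOT the kernel's DMA_ATTR_* values. */
#define ATTR_WRITE_COMBINE  (1UL << 0)
#define ATTR_NO_KERNEL_MAP  (1UL << 1)

/* Hypothetical description of what one DMA implementation supports. */
struct dma_impl {
	unsigned long supported_attrs;
};

/*
 * Return 0 if every requested attribute bit is supported by this
 * implementation, or -EINVAL if at least one bit is not.
 */
int check_attrs(const struct dma_impl *impl, unsigned long attrs)
{
	return (attrs & ~impl->supported_attrs) ? -EINVAL : 0;
}
```

With something like this, a caller asking for write combining on an
implementation that does not advertise it would get an error back
instead of silently receiving a mapping with different attributes.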

> It seems that isn't quite enough for the drm use case, so we might
> also need a way to query these features, but that really has to go
> through the usual dma layer abstraction as well and not hacked together
> in a driver based on an educated guesstimate.

As far as I can tell, these drivers allocate DMA'able memory [in
ttm_tt_populate()] and subsequently create their own CPU mappings for
it, assuming that
a) the default is cache coherent, so vmap()ing those pages with
cacheable attributes works, and
b) telling the GPU to use NoSnoop attributes makes the accesses it
performs coherent with non-cacheable CPU mappings of those physical
pages.

Since the latter is not true for many arm64 systems, I need this patch
to get a working system.
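
As a rough model of the decision the patch title describes (disable the
WC optimization for cache coherent devices on non-x86), here is a
user-space sketch. It is not the patch itself: struct platform,
can_use_wc() and pick_mode() are hypothetical names, and it simply
assumes that assumption (b) holds on x86 and may not hold for cache
coherent devices elsewhere.

```c
#include <stdbool.h>

/* Hypothetical caching modes a driver might pick for its CPU mapping. */
enum cpu_map_mode { MAP_CACHED, MAP_WRITE_COMBINE };

/* Hypothetical description of the platform/device pair. */
struct platform {
	bool is_x86;            /* assumption (b) is taken to hold here */
	bool dev_dma_coherent;  /* device DMA is coherent with CPU caches */
};

/*
 * Decide whether a write-combined (non-cacheable) CPU mapping is
 * considered safe. On x86 the existing behaviour is kept. Elsewhere,
 * WC is refused for cache coherent devices, because NoSnoop accesses
 * by the device are not assumed to be coherent with such mappings.
 */
bool can_use_wc(const struct platform *p)
{
	if (p->is_x86)
		return true;
	return !p->dev_dma_coherent;
}

/* Fall back to a cacheable mapping whenever WC is not considered safe. */
enum cpu_map_mode pick_mode(const struct platform *p, bool want_wc)
{
	return (want_wc && can_use_wc(p)) ? MAP_WRITE_COMBINE : MAP_CACHED;
}
```

In this model, a request for WC on a non-x86 platform with a cache
coherent device is quietly downgraded to MAP_CACHED, while x86 and
non-coherent devices keep getting MAP_WRITE_COMBINE.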

So how do you propose we proceed to get this fixed? Does it have to
wait for this proper core API plus followup changes for DRM?
