[PATCH v4 3/4] drm/shmem-helpers: Allocate wc pages on x86
Daniel Vetter
daniel at ffwll.ch
Wed Jul 14 12:48:40 UTC 2021
On Wed, Jul 14, 2021 at 01:54:50PM +0200, Christian König wrote:
> Am 13.07.21 um 22:51 schrieb Daniel Vetter:
> > intel-gfx-ci realized that something is not quite coherent anymore on
> > some platforms for our i915+vgem tests, when I tried to switch vgem
> > over to shmem helpers.
> >
> > After lots of head-scratching I realized that I've removed calls to
> > drm_clflush. And we need those. To make this a bit cleaner use the
> > same page allocation tooling as ttm, which does internally clflush
> > (and more, as neeeded on any platform instead of just the intel x86
> > cpus i915 can be combined with).
> >
> > Unfortunately this doesn't exist on arm, or as a generic feature. For
> > that I think only the dma-api can get at wc memory reliably, so maybe
> > we'd need some kind of GFP_WC flag to do this properly.
>
> The problem is that this stuff is extremely architecture specific. So GFP_WC
> and GFP_UNCACHED are really what we should aim for in the long term.
>
> And as far as I know we have at least the following possibilities how it is
> implemented:
>
> * A fixed amount of registers which tells the CPU the caching behavior for a
> memory region, e.g. MTRR.
> * Some bits of the memory pointers used, e.g. you see the same memory at
> different locations with different caching attributes.
> * Some bits in the CPUs page table.
> * Some bits in a separate page table.
>
> On top of that there is the PCIe specification which defines non-cache
> snooping access as an extension.
Yeah dma-buf is extremely ill-defined even on x86 if you combine these
all. We just play a game of whack-a-mole with the cacheline dirt until
it's gone.
That's the other piece here, how do you even make sure that the page is
properly flushed and ready for wc access:
- easy case is x86 with clflush available pretty much everywhere (since
10+ years at least)
- next are cpus which have some cache flush instructions, but it's highly
cpu model specific
- next up is the same, but you absolutely have to make sure there's no
other mapping around anymore or the coherency fabric just dies
- and I'm pretty sure there's worse stuff where you defacto can only
allocate wc memory that's set aside at boot-up and that's all you ever
get.
Cheers, Daniel
> Mixing that with the CPU caching behavior gets you some really nice ways to
> break a driver. In general x86 seems to be rather graceful, but arm and
> PowerPC are easily pissed if you mess that up.
>
> > Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> > Cc: Christian König <christian.koenig at amd.com>
> > Cc: "Thomas Hellström" <thomas.hellstrom at linux.intel.com>
> > Cc: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
> > Cc: Maxime Ripard <mripard at kernel.org>
> > Cc: Thomas Zimmermann <tzimmermann at suse.de>
> > Cc: David Airlie <airlied at linux.ie>
> > Cc: Daniel Vetter <daniel at ffwll.ch>
>
> Acked-by: Christian könig <christian.koenig at amd.com>
>
> Regards,
> Christian.
>
> > ---
> > drivers/gpu/drm/drm_gem_shmem_helper.c | 14 ++++++++++++++
> > 1 file changed, 14 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > index 296ab1b7c07f..657d2490aaa5 100644
> > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > @@ -10,6 +10,10 @@
> > #include <linux/slab.h>
> > #include <linux/vmalloc.h>
> > +#ifdef CONFIG_X86
> > +#include <asm/set_memory.h>
> > +#endif
> > +
> > #include <drm/drm.h>
> > #include <drm/drm_device.h>
> > #include <drm/drm_drv.h>
> > @@ -162,6 +166,11 @@ static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object *shmem)
> > return PTR_ERR(pages);
> > }
> > +#ifdef CONFIG_X86
> > + if (shmem->map_wc)
> > + set_pages_array_wc(pages, obj->size >> PAGE_SHIFT);
> > +#endif
> > +
> > shmem->pages = pages;
> > return 0;
> > @@ -203,6 +212,11 @@ static void drm_gem_shmem_put_pages_locked(struct drm_gem_shmem_object *shmem)
> > if (--shmem->pages_use_count > 0)
> > return;
> > +#ifdef CONFIG_X86
> > + if (shmem->map_wc)
> > + set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
> > +#endif
> > +
> > drm_gem_put_pages(obj, shmem->pages,
> > shmem->pages_mark_dirty_on_put,
> > shmem->pages_mark_accessed_on_put);
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
More information about the dri-devel
mailing list