[Intel-gfx] [PATCH v2] drm/i915: Force wmb() on using GTT io_mapping_map_wc
Daniel Vetter
daniel at ffwll.ch
Tue May 26 02:56:18 PDT 2015
On Tue, May 26, 2015 at 09:49:15AM +0100, Chris Wilson wrote:
> On Tue, May 26, 2015 at 10:01:24AM +0200, Daniel Vetter wrote:
> > On Mon, May 25, 2015 at 06:48:44PM +0100, Chris Wilson wrote:
> > > Since the advent of mmap(wc), where we reused the same cache domain for
> > > WC and GTT paths (oh, how I regret that double-edged advice), we need to
> > > be extra cautious when using GTT iomap_wc internally. Since userspace maybe
> > > modifying through the mmap(wc) and we then modify modifying through an
> > > aliased WC path through the GTT, those writes may overlap and not be
> > > visible to the other path. Easiest to trigger appears to be write the
> > > batch through mmap(wc) and then attempt to perform reloc the GTT,
> > > corruption quickly ensues.
> > >
> > > v2: Be stricter and do a full mb() in case we are reading through one
> > > path and about to write through the second path. Also, apply the
> > > barriers on transitioning into the GTT domain as well to cover the white
> > > lie asychronous cases.
> > >
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > Cc: Akash Goel <akash.goel at intel.com>
> > > Cc: stable at vger.kernel.org
> >
> > There are a lot more mb() and wmb() in the gem code. I count execlist,
> > legacy rings, pwrite, set_domain and finish_gtt.
>
> You have more than I have...
Well yeah, but the patch is also cc: stable. Which means we'd need to frob
the barriers first, then refactor. And I'm still not sure whether this is
road we want to go down, or whether adding explicit begin/end coherency
ioctls wouldn't be the better long-term option. It always irked me a bit
that the kernel needs to infer the other edge of a cpu access, explicit
begin/end seemes better (and more in line with what all the gpu apis
expose). Something like this
struct i915_mmap_control {
#define BEGIN 0
#define END 1
#define READ 0
#define WRITE 2
#define UNSYNCHRONIZED 4 /* or whetever the bikeshed will be */
#define MAPPING_MASK 0xf0
#define GTT_WC (1 << 4)
#define CPU_WC (2 << 4)
#define CPU_CACHED (3 << 4)
u32 flags;
u32 bo;
u64 start;
u64 length;
};
Pretty much modelled after the begin/end_cpu_access hooks from dma-buf.
Which seems to gain some traction for a generic up/download interface ...
It just feels like incrementally fixing things might lead us to a very bad
place.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
More information about the Intel-gfx
mailing list