[Intel-gfx] [PATCH 02/18] drm/i915: Flush pages on acquisition
Chris Wilson
chris at chris-wilson.co.uk
Wed Mar 20 12:35:51 UTC 2019
Quoting Matthew Auld (2019-03-20 12:26:00)
> On Wed, 20 Mar 2019 at 11:48, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> >
> > Quoting Matthew Auld (2019-03-20 11:41:52)
> > > On Tue, 19 Mar 2019 at 11:58, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > > > @@ -2534,6 +2522,14 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
> > > >
> > > > lockdep_assert_held(&obj->mm.lock);
> > > >
> > > > + /* Make the pages coherent with the GPU (flushing any swapin). */
> > > > + if (obj->cache_dirty) {
> > > > + obj->write_domain = 0;
> > > > + if (i915_gem_object_has_struct_page(obj))
> > > > + drm_clflush_sg(pages);
> > > > + obj->cache_dirty = false;
> > > > + }
> > >
> > > Is it worth adding some special casing here for volatile objects, so
> > > that we avoid doing the clflush_sg every time we do set_pages for
> > > !llc?
> > >
> > > if (obj->cache_dirty && obj->mm.madvise == WILLNEED)
> > >
> > > Or is that meh?
> >
> > No, even for volatile objects we have to be careful with what remains in
> > the CPU cache as that may obscure updates to the underlying page. We see
> > the very same problem with speculative cacheline loading.
> >
> > A DONTNEED object should fail before it gets allocated pages :)
>
> I was talking about kernel internal objects, which are marked as
> DONTNEED just before we call set_pages(), and for that case it's
> surely up to the caller to flush things before they even think of
> doing the unpin(since it's volatile).
But those objects also become WILLNEED at that point, and may still need
to be flushed.
The cost of the extra flushes is a worry, but not enough for me to be
concerned about. I think the convention that get_pages == coherent on
gpu improves quite a bit of our internal rummaging around and prevents
the ABI nightmare of mmap_gtt/mmap_offset. Will this flush remain inside
set_pages()? No, I don't think it will as pushing it into the callers
outside of the mm.lock itself makes sense, but I didn't think that was
of paramount importance compared to the uABI and can be done later.
-Chris
More information about the Intel-gfx
mailing list