[Intel-gfx] [PATCH] drm/i915: Only insert the mb() before updating the fence parameter

Chris Wilson chris at chris-wilson.co.uk
Tue Oct 9 13:03:13 CEST 2012


On Tue, 9 Oct 2012 11:54:12 +0200, Daniel Vetter <daniel at ffwll.ch> wrote:
> On Tue, Oct 09, 2012 at 09:38:58AM +0100, Chris Wilson wrote:
> > @@ -3244,6 +3254,9 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> >  
> >  	i915_gem_object_flush_cpu_write_domain(obj);
> >  
> > +	if ((obj->base.read_domains & I915_GEM_DOMAIN_GTT) == 0)
> > +		mb();
> > +
> 
> I think a comment here like we have one for all other gtt related memory
> barries would be good. Another thing is the flush_gtt_write_domain uses a
> wmb, whereas here we don't bother with micro-optimizing things. So I think
> it'd be good to just use a mb() for that, too, if just for consistency.

In fact, not only is that the wmb() the easiest to micro-optimise, it
is the only one that can be - I think. Around manipulating the fence, we
need a read/write barrier in case we have any pending accesses through
the fenced region, since the register write may be reordered passed the
memory reads since there is no obvious dependency. That might just be
heightened paranoia and our memory controller isn't that smart. Yet. So
those two need to be mb() so that I can sleep safely at night. For the
mb() inside set-to-gtt-domain, I don't have a robust explanation other
than that empirically we need a barrier, therefore there is some
lingering incoherency when reusing a bo. (The hangs always seem to occur
when crossing a page boundary, we see stale data.) You could attempt to
insert a read/write barrier depending upon actual usage, but it hardly
seems worth the effort.
 
> Also, you know the grumpy maintainer drill: Could we exercise these
> barriers with a minimal i-g-t testcase, please? Since you've managed to
> kill your machine by removing them, they're no longer just there to keep
> us happy, hence I'd like to have them exercised ...

Still hunting.
 
> Another thing that just crossed my mind: Could we lack a set of mb()s for
> cpu access on llc platforms? For non-coherent platforms the mb() in the
> clflush paths will do that, but on llc platforms I couldn't find anything.
> And that lp bugs seems to make an excellent case for them being required
> ...

Yes, with LLC we need to treat the GPU as another core and so put
similar SMP-esque memory barriers in place.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre



More information about the Intel-gfx mailing list