[Intel-gfx] [PATCH v4 06/10] drm/i915: Implement LRI based FBC tracking

Thu Nov 21 17:39:43 CET 2013

On Thu, Nov 21, 2013 at 06:33:21PM +0200, Ville Syrjälä wrote:
> On Thu, Nov 21, 2013 at 11:49:47AM +0000, Chris Wilson wrote:
> > On Thu, Nov 21, 2013 at 01:14:10PM +0200, ville.syrjala at linux.intel.com wrote:
> > > From: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > 
> > > As per the SNB and HSW PM guides, we should enable FBC render/blitter
> > > tracking only during batches targetting the front buffer.
> > > 
> > > On SNB we must also update the FBC render tracking address whenever it
> > > changes. And since the register in question is stored in the context,
> > > we need to make sure we reload it with correct data after context
> > > switches.
> > > 
> > > On IVB/HSW we use the render nuke mechanism, so no render tracking
> > > address updates are needed. Hoever on the blitter side we need to
> > > enable the blitter tracking like on SNB, and in addition we need
> > > to issue the cache clean messages, which we already did.
> > > 
> > > v2: Introduce intel_fb_obj_has_fbc()
> > >     Fix crtc locking around crtc->fb access
> > >     Drop a hunk that was included by accident in v1
> > >     Set fbc_address_dirty=false not true after emitting the LRI
> > > v3: Now that fbc hangs on to the fb intel_fb_obj_has_fbc() doesn't
> > >     need to upset lockdep anymore
> > > v4: Use |= instead of = to update fbc_address_dirty
> > 
> > Ah, should we do the same for fbc_dirty? In the past we could dispatch a
> > batchbuffer but fail to add the request (and so fail to flush the
> > rendering/fbc). We currently preallocate the request so that failure
> > path is history, but we will more than likely be caught out again in the
> > future.
> 
> I guess we could do it for fbc_dirty as well, but I don't think that
> should actually change anything. Either we're rendering to the FBC
> scanout buffer, or we're not.

The scenario I worry about here is the missing flush after the
rendering. It has been possible for us to lose it (under memory
pressure, other constaints etc) and to issue a catch-up flush on the
next batch. Without the or, we'd lose the FBC flush. It is just a
potential issue for the future.

> I did start pondering if I should actually move the fbc_address to live
> under the context once that's where it actually belongs. If we'd track
> it per-context we might be able to avoid emitting the LRI for every
> context switch.
> 
> > 
> > Like pc8, I'd like for a device (or crtc if you must) property whether
> > or not indirect rendering is preferred.
> > 
> > Other than that and the bikeshed to kill the redundant fbc_obj local
> > variable and pack the dirty bits, it looks good to me.
> 
> Actually I just fired it up for real on SNB and it failed. The problem
> is that we end up doing the invalidate before the context switch. So
> we've not yet forced fbc_address_dirty=true due to the context switch
> when we do the invalidate. The most straightforward fix would be to
> simply move i915_gem_execbuffer_move_to_gpu() to be called after
> i915_switch_context(). I did that on my machine and now it passes my
> context related FBC tests. But I do wonder if the order of these
> operations was chose for a reason, and whether the reordering might
> cause other problems.

Ben, opinion on whether the ordering was just convenience?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre