[Intel-gfx] [PATCH v4 06/10] drm/i915: Implement LRI based FBC tracking

Thu Nov 21 17:33:21 CET 2013

On Thu, Nov 21, 2013 at 11:49:47AM +0000, Chris Wilson wrote:
> On Thu, Nov 21, 2013 at 01:14:10PM +0200, ville.syrjala at linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > 
> > As per the SNB and HSW PM guides, we should enable FBC render/blitter
> > tracking only during batches targetting the front buffer.
> > 
> > On SNB we must also update the FBC render tracking address whenever it
> > changes. And since the register in question is stored in the context,
> > we need to make sure we reload it with correct data after context
> > switches.
> > 
> > On IVB/HSW we use the render nuke mechanism, so no render tracking
> > address updates are needed. Hoever on the blitter side we need to
> > enable the blitter tracking like on SNB, and in addition we need
> > to issue the cache clean messages, which we already did.
> > 
> > v2: Introduce intel_fb_obj_has_fbc()
> >     Fix crtc locking around crtc->fb access
> >     Drop a hunk that was included by accident in v1
> >     Set fbc_address_dirty=false not true after emitting the LRI
> > v3: Now that fbc hangs on to the fb intel_fb_obj_has_fbc() doesn't
> >     need to upset lockdep anymore
> > v4: Use |= instead of = to update fbc_address_dirty
> 
> Ah, should we do the same for fbc_dirty? In the past we could dispatch a
> batchbuffer but fail to add the request (and so fail to flush the
> rendering/fbc). We currently preallocate the request so that failure
> path is history, but we will more than likely be caught out again in the
> future.

I guess we could do it for fbc_dirty as well, but I don't think that
should actually change anything. Either we're rendering to the FBC
scanout buffer, or we're not.

I did start pondering if I should actually move the fbc_address to live
under the context once that's where it actually belongs. If we'd track
it per-context we might be able to avoid emitting the LRI for every
context switch.

> 
> Like pc8, I'd like for a device (or crtc if you must) property whether
> or not indirect rendering is preferred.
> 
> Other than that and the bikeshed to kill the redundant fbc_obj local
> variable and pack the dirty bits, it looks good to me.

Actually I just fired it up for real on SNB and it failed. The problem
is that we end up doing the invalidate before the context switch. So
we've not yet forced fbc_address_dirty=true due to the context switch
when we do the invalidate. The most straightforward fix would be to
simply move i915_gem_execbuffer_move_to_gpu() to be called after
i915_switch_context(). I did that on my machine and now it passes my
context related FBC tests. But I do wonder if the order of these
operations was chose for a reason, and whether the reordering might
cause other problems.

-- 
Ville Syrjälä
Intel OTC