[Intel-gfx] [PATCH v4 06/10] drm/i915: Implement LRI based FBC tracking

Ben Widawsky benjamin.widawsky at intel.com
Fri Nov 22 05:20:29 CET 2013


On Thu, Nov 21, 2013 at 04:39:43PM +0000, Chris Wilson wrote:
> On Thu, Nov 21, 2013 at 06:33:21PM +0200, Ville Syrjälä wrote:
> > On Thu, Nov 21, 2013 at 11:49:47AM +0000, Chris Wilson wrote:
> > > On Thu, Nov 21, 2013 at 01:14:10PM +0200, ville.syrjala at linux.intel.com wrote:
> > > > From: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > > 
> > > > As per the SNB and HSW PM guides, we should enable FBC render/blitter
> > > > tracking only during batches targeting the front buffer.
> > > > 
> > > > On SNB we must also update the FBC render tracking address whenever it
> > > > changes. And since the register in question is stored in the context,
> > > > we need to make sure we reload it with correct data after context
> > > > switches.
> > > > 
> > > > On IVB/HSW we use the render nuke mechanism, so no render tracking
> > > > address updates are needed. However, on the blitter side we need to
> > > > enable the blitter tracking like on SNB, and in addition we need
> > > > to issue the cache clean messages, which we already did.
> > > > 
> > > > v2: Introduce intel_fb_obj_has_fbc()
> > > >     Fix crtc locking around crtc->fb access
> > > >     Drop a hunk that was included by accident in v1
> > > >     Set fbc_address_dirty=false not true after emitting the LRI
> > > > v3: Now that fbc hangs on to the fb intel_fb_obj_has_fbc() doesn't
> > > >     need to upset lockdep anymore
> > > > v4: Use |= instead of = to update fbc_address_dirty
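(Side note for anyone following along: the SNB side of this amounts to
something like the sketch below. ILK_FBC_RT_BASE, ILK_FBC_RT_VALID and the
intel_ring_* helpers are existing i915 identifiers, but the helper itself
and the exact emit sequence are illustrative, not copied from the patch.)

/* hypothetical helper: reload the SNB FBC render tracking address.
 * ILK_FBC_RT_BASE is saved/restored with the context image, which is
 * why it has to be re-emitted after a context switch while the
 * fbc_address_dirty flag is set. */
static int snb_emit_fbc_rt_base(struct intel_ring_buffer *ring,
                                u32 fbc_address)
{
        int ret;

        ret = intel_ring_begin(ring, 4);
        if (ret)
                return ret;

        intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
        intel_ring_emit(ring, ILK_FBC_RT_BASE);
        intel_ring_emit(ring, fbc_address | ILK_FBC_RT_VALID);
        intel_ring_emit(ring, MI_NOOP);
        intel_ring_advance(ring);

        return 0;
}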
> > > 
> > > Ah, should we do the same for fbc_dirty? In the past we could dispatch a
> > > batchbuffer but fail to add the request (and so fail to flush the
> > > rendering/fbc). We currently preallocate the request so that failure
> > > path is history, but we will more than likely be caught out again in the
> > > future.
> > 
> > I guess we could do it for fbc_dirty as well, but I don't think that
> > should actually change anything. Either we're rendering to the FBC
> > scanout buffer, or we're not.
> 
> The scenario I worry about here is the missing flush after the
> rendering. It has been possible for us to lose it (under memory
> pressure, other constraints, etc.) and to issue a catch-up flush on the
> next batch. Without the |=, we'd lose the FBC flush. It is just a
> potential issue for the future.
>  
> > I did start pondering if I should actually move the fbc_address to live
> > under the context once that's where it actually belongs. If we'd track
> > it per-context we might be able to avoid emitting the LRI for every
> > context switch.
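(That per-context variant would roughly be: cache the last emitted value in
the context and only emit the LRI when it changes. The fbc_address member
below, and the reuse of the snb_emit_fbc_rt_base() sketch above, are purely
hypothetical.)

/* hypothetical: remember the last ILK_FBC_RT_BASE value emitted for
 * this context so redundant LRIs can be skipped */
struct i915_hw_context {
        /* existing members elided */
        u32 fbc_address;
};

        /* at invalidate/context switch time (sketch): */
        if (ctx->fbc_address != fbc_address) {
                ret = snb_emit_fbc_rt_base(ring, fbc_address);
                if (ret)
                        return ret;
                ctx->fbc_address = fbc_address;
        }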
> > 
> > > 
> > > Like pc8, I'd like a device (or crtc if you must) property for whether
> > > or not indirect rendering is preferred.
> > > 
> > > Other than that and the bikeshed to kill the redundant fbc_obj local
> > > variable and pack the dirty bits, it looks good to me.
> > 
> > Actually I just fired it up for real on SNB and it failed. The problem
> > is that we end up doing the invalidate before the context switch. So
> > we've not yet forced fbc_address_dirty=true due to the context switch
> > when we do the invalidate. The most straightforward fix would be to
> > simply move i915_gem_execbuffer_move_to_gpu() to be called after
> > i915_switch_context(). I did that on my machine and now it passes my
> > context related FBC tests. But I do wonder if the order of these
> > operations was chosen for a reason, and whether the reordering might
> > cause other problems.
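(i.e. roughly this in i915_gem_do_execbuffer(); a sketch only, with error
paths and exact arguments approximated from memory:)

        /* was: move_to_gpu() (and hence the FBC invalidate) ran before
         * the context switch, so it never saw fbc_address_dirty */
        ret = i915_switch_context(ring, file, ctx_id);
        if (ret)
                goto err;

        ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
        if (ret)
                goto err;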
> 
> Ben, opinion on whether the ordering was just convenience?
> -Chris

If it was anything but convenience, I've long since forgotten. I guess
as long as we do it in two patches we can always bisect, and it's all
good.

> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre

-- 
Ben Widawsky, Intel Open Source Technology Center


