[Intel-gfx] [PATCH 2/2] drm/i915: Minimize the huge amount of unecessary fbc sw cache clean.

Paulo Zanoni przanoni at gmail.com
Wed Sep 17 22:49:35 CEST 2014


2014-09-05 17:57 GMT-03:00 Rodrigo Vivi <rodrigo.vivi at intel.com>:
> The sw cache clean on gen8 is a tempoorary workaround because we cannot
> set cache clean on blt ring with risk of hungs. So we are doing the cache clean on sw.
> However we are doing much more than needed. Not only when using blt ring.
> So, with this extra w/a we minimize the ammount of cache cleans and call it only
> on same cases that it was being called on gen7.
>
> fbc.need_sw_cache_clean works in the opposite information direction
> of ring->fbc_dirty telling software on frontbuffer tracking to perform
> the cache clean on sw side.
>
> Cc: Paulo Zanoni <paulo.r.zanoni at intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         | 8 ++++++++
>  drivers/gpu/drm/i915/intel_display.c    | 4 +++-
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 9 +++++++--
>  3 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 5706b8a..5acda40 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -657,6 +657,14 @@ struct i915_fbc {
>
>         bool false_color;
>
> +       /* On gen8 some rings cannont perform fbc clean operation so for now
> +        * we are doing this on SW with mmio.
> +        * This variable works in the opposite information direction
> +        * of ring->fbc_dirty telling software on frontbuffer tracking
> +        * to perform the cache clean on sw side.
> +        */
> +       bool need_sw_cache_clean;


Ok, this is my first time trying to look at the buffer tracking code,
so I have some noob questions.

It seems need_sw_cache_clean is set every time we call
gen6_ring_flush(), just like gen7_ring_fbc_flush(). But the thing
about gen7_ring_fbc_flush() is that it becomes a noop when fbc_dirty
is false. So maybe on this code we're doing the sw cache clean every
time we do a BLT operation, even if this operation is not on the front
buffer? That would be an unnecessary overkill, right? Continuing the
same though, why can't we just reuse the ring->fbc_dirty variable
instead of using this new need_sw_cache_clean? Maybe we should only do
the sw cache clean when both fbc_dirty *and* need_sw_cache are true?

More below.


> +
>         struct intel_fbc_work {
>                 struct delayed_work work;
>                 struct drm_crtc *crtc;
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 965eb3c..731d925 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -9137,8 +9137,10 @@ void intel_frontbuffer_flush(struct drm_device *dev,
>          * needs to be reworked into a proper frontbuffer tracking scheme like
>          * psr employs.
>          */
> -       if (IS_BROADWELL(dev))
> +       if (IS_BROADWELL(dev) && dev_priv->fbc.need_sw_cache_clean) {
> +               dev_priv->fbc.need_sw_cache_clean = false;
>                 gen8_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
> +       }


Bikeshed: since this is the only caller, I think it makes sense to put
the checks inside the function instead of the caller.


>  }
>
>  /**
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 85fc2b1..02b43cd 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2227,6 +2227,7 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,


The thing about gen6_render_ring_flush is that it is also a function
for the vebox ring. Is this a problem? What are the consequences of
doing the FBC thing on it?


>                            u32 invalidate, u32 flush)
>  {
>         struct drm_device *dev = ring->dev;
> +       struct drm_i915_private *dev_priv = dev->dev_private;
>         uint32_t cmd;
>         int ret;
>
> @@ -2257,8 +2258,12 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>         }
>         intel_ring_advance(ring);
>
> -       if (IS_GEN7(dev) && !invalidate && flush)
> -               return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
> +       if (!invalidate && flush) {
> +               if (IS_GEN7(dev))
> +                       return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
> +               else if (IS_GEN8(dev))
> +                       dev_priv->fbc.need_sw_cache_clean = true;


Bikeshed: since I like to keep the callers clean, can't we just create
a new intel_blt_fbc_flush (or intel_ring_fbc_flush(), or whatever)
that will be responsible for deciding if it needs to call
gen7_ring_fbc_flush or set need_sw_cache_clean to true? That would
make even more sense if we conclude we also need to check for
fbc_dirty before setting need_sw_cache_clean to true...


> +       }
>
>         return 0;
>  }
> --
> 1.9.3
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Paulo Zanoni



More information about the Intel-gfx mailing list