[Intel-gfx] [PATCH] drm/i915: WA: FBC Render Nuke.

Rodrigo Vivi rodrigo.vivi at gmail.com
Mon Jun 3 18:50:46 CEST 2013


On Mon, Jun 3, 2013 at 8:34 AM, Ville Syrjälä
<ville.syrjala at linux.intel.com> wrote:
> On Fri, May 31, 2013 at 05:15:41PM -0300, Rodrigo Vivi wrote:
>> Hi Ville,
>>
>> Thanks for the comments.
>>
>>
>> On Fri, May 31, 2013 at 12:59 PM, Ville Syrjälä
>> <ville.syrjala at linux.intel.com> wrote:
>> > On Tue, May 28, 2013 at 09:25:12PM -0300, Rodrigo Vivi wrote:
>> >> WaFbcNukeOn3DBlt for IVB, HSW and VLV.
>> >
>> > VLV doesn't have FBC, so this is a bit incorrect.
>>
>> I'm going to remove the vlv mention that incorrectly came from spec...
>>
>> >
>> >>
>> >> According to BSpec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
>> >> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
>> >> follows each render submission."
>> >>
>> >> v2: Chris noticed that the flush_domains check was missing here and also suggested doing
>> >>     the LRI only when FBC is enabled. To avoid doing an I915_READ on every flush, let's use
>> >>     the module parameter check.
>> >>
>> >> v3: Adding Wa name as Damien suggested.
>> >>
>> >> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> >> Signed-off-by: Rodrigo Vivi <rodrigo.vivi at gmail.com>
>> >> ---
>> >>  drivers/gpu/drm/i915/i915_reg.h         |  2 ++
>> >>  drivers/gpu/drm/i915/intel_pm.c         |  2 +-
>> >>  drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
>> >>  3 files changed, 35 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> >> index cc4c223..81ac584 100644
>> >> --- a/drivers/gpu/drm/i915/i915_reg.h
>> >> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> >> @@ -977,6 +977,8 @@
>> >>  /* Framebuffer compression for Ivybridge */
>> >>  #define IVB_FBC_RT_BASE                      0x7020
>> >>
>> >> +#define MSG_FBC_REND_STATE   0x50380
>> >> +#define   FBC_REND_NUKE              (1<<2)
>> >>
>> >>  #define _HSW_PIPE_SLICE_CHICKEN_1_A  0x420B0
>> >>  #define _HSW_PIPE_SLICE_CHICKEN_1_B  0x420B4
>> >> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> >> index 1879188..e830a9b 100644
>> >> --- a/drivers/gpu/drm/i915/intel_pm.c
>> >> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> >> @@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
>> >>       struct drm_i915_gem_object *obj = intel_fb->obj;
>> >>       struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>> >>
>> >> -     I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
>> >> +     I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
>> >>
>> >>       if (!intel_edp_is_psr_enabled(dev))
>> >>               I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
>> >> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> >> index 3d2c236..69491db 100644
>> >> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> >> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> >> @@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
>> >>       return 0;
>> >>  }
>> >>
>> >> +static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
>> >> +{
>> >> +     struct drm_device *dev = ring->dev;
>> >> +     int ret;
>> >> +
>> >> +     if (i915_enable_fbc == 0)
>> >> +             return 0;
>> >> +
>> >> +     if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
>> >> +             return 0;
>> >> +
>> >> +     ret = intel_ring_begin(ring, 4);
>> >> +     if (ret)
>> >> +             return ret;
>> >> +     intel_ring_emit(ring, MI_NOOP);
>> >> +     /* WaFbcNukeOn3DBlt:ivb/hsw/vlv */
>> >
>> > Another mention of vlv. I can see BSpec makes the same mistake in
>> > the register description though.
>>
>> ... as you noticed.
>>
>> >
>> >> +     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> >> +     intel_ring_emit(ring, MSG_FBC_REND_STATE);
>> >> +     intel_ring_emit(ring, FBC_REND_NUKE);
>> >> +     intel_ring_advance(ring);
>> >> +
>> >> +     return 0;
>> >> +}
>> >> +
>> >>  static int
>> >>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
>> >>                      u32 invalidate_domains, u32 flush_domains)
>> >> @@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
>> >>       intel_ring_emit(ring, 0);
>> >>       intel_ring_advance(ring);
>> >>
>> >> +     if (flush_domains)
>> >> +             return gen7_ring_fbc_flush(ring);
>> >> +
>> >>       return 0;
>> >>  }
>> >>
>> >> @@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
>> >>  static int blt_ring_flush(struct intel_ring_buffer *ring,
>> >>                         u32 invalidate, u32 flush)
>> >>  {
>> >> +     struct drm_device *dev = ring->dev;
>> >>       uint32_t cmd;
>> >>       int ret;
>> >>
>> >> @@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
>> >>       intel_ring_emit(ring, 0);
>> >>       intel_ring_emit(ring, MI_NOOP);
>> >>       intel_ring_advance(ring);
>> >> +
>> >> +     if (IS_GEN7(dev))
>> >> +             return gen7_ring_fbc_flush(ring);
>> >
>> > Should check flush_domains here as well?
>>
>> This is a flush domain by definition, isn't it?
>
> How so?

This function is ring->flush. How could it be called without a flush
domain? Or are the names just confusing me?

>
>>
>> >
>> > So we're now using the same nuke mechanism from the blt ring too.
>> > Should we then drop the regular blitter tracking things from fbc_enable?
>>
>> This is a good question. Since this is a critical patch and it is
>> working as it is, I prefer to leave it as it is and promise that I
>> will try to drop the old blitter tracking for ivb and hsw later. If it
>> works I'll send the drop in another patch.
>>
>> >
>> > Oh and what about vcs and vecs, should we nuke from those rings as well?
>> > I guess it would be strange to write to the primary plane's buffer via
>> > vcs, but I'm assuming vebox could write the same formats that we can
>> > scan out...
>>
>> To be truly honest with you, I have no idea about these cases. The specs
>> just say to put it after every pipe_control following flush
>> renderings... and blt.
>
> IIRC the spec doesn't say anything about blt.
>
> Ah, the PM guide tells you to do LRIs w/ blt too. But it actually says
> that you should do "cache clean" LRIs instead of "nuke" LRIs.

Only the name differs; it is the same register address (the data value in
this quote is 0x00000002, not the 0x00000004 used for the nuke).
From BSpec: "Driver must program a MI_FLUSH_DW followed by a LRI into
the BCS ring to generate a cache clean message to FBC (LRI to offset
0x50380 with data 0x00000002)."

With the blt flush as it is in this patch, I'm getting the best rendering
and power-saving performance.

>
>>
>> >
>> >> +
>> >>       return 0;
>> >>  }
>> >>
>> >> --
>> >> 1.8.1.4
>> >>
>> >> _______________________________________________
>> >> Intel-gfx mailing list
>> >> Intel-gfx at lists.freedesktop.org
>> >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>> >
>> > --
>> > Ville Syrjälä
>> > Intel OTC
>>
>>
>>
>> --
>> Rodrigo Vivi
>> Blog: http://blog.vivi.eng.br
>
> --
> Ville Syrjälä
> Intel OTC



--
Rodrigo Vivi
Blog: http://blog.vivi.eng.br


