[Mesa-stable] [PATCH] i965/gen8+: Do full stall when switching pipeline

Pohjolainen, Topi topi.pohjolainen at gmail.com
Sun Mar 19 09:01:58 UTC 2017


On Sun, Mar 19, 2017 at 10:45:59AM +0200, Pohjolainen, Topi wrote:
> On Thu, Mar 16, 2017 at 01:02:00PM -0700, Francisco Jerez wrote:
> > Jason Ekstrand <jason at jlekstrand.net> writes:
> > 
> > > Thanks for tracking this down!
> > >
> > > Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
> > >
> > > I'm not sure how we missed this when CS was brought up.  Oh well.
> > >
> > 
> > I don't think we did.  The hardware docs are somewhat contradictory
> > about whether this is required:
> > 
> > | Render CS Only: SW must always program PIPE_CONTROL with CS Stall and
> > | Render Target Cache Flush Enable set prior to programming
> > | PIPELINE_SELECT command for GPGPU workloads i.e when pipeline mode is
> > | set to GPGPU. This is required to achieve better GPGPU preemption
> > | latencies for certain programming sequences. If programming
> > | PIPE_CONTROL has performance implications then preemption latencies
> > | can be trade off against performance by not implementing this
> > | programming note.
> > 
> > This sounds like more of a suggestion (which potentially trades off
> > throughput for GPGPU latency due to the synchronous execution of the
> > compute shader) than a requirement, so the stall may not be required and
> > is likely to impact performance.
> > 
> > I wonder if the following workaround would be as effective fixing the
> > bug linked below:
> > 
> >  - Speculatively flush the render and depth caches with a pipelined
> >    PIPE_CONTROL right before PIPELINE_SELECT (note that this still
> >    allows some parallelism between 3D and GPGPU workloads).
> > 
> >  - On brw_render_cache_set_check_flush(), if the GPGPU pipeline is
> >    selected, emit an HDC flush+CS stall instead of the depth/render
> >    target flush+CS stall.
> 
> This looked to be sufficient also on my SKL at least, note that this leaves
> the CS stall bit out from compute tex cache invalidate.
> 
> While I'm comfortable with the fix lining gen8+ with earlier gens (which is
> according to the spec), this here looks more an optimization which should be
> applied to all gens. It also requires more in-depth understanding of the
> pipeline switch (which I don't think I have). 
> 
> I have pushed my patch already before I actually saw your response. This here
> in my opinion requires some sort of comment telling why we did things
> differently then described in the spec. I'd be happier if you followed up on
> this instead of me.
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c
> index 84f0c18..9704a9f 100644
> --- a/src/mesa/drivers/dri/i965/brw_misc_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
> @@ -844,6 +844,13 @@ brw_emit_select_pipeline(struct brw_context *brw, enum brw_pipeline pipeline)
>           brw->ctx.NewDriverState |= BRW_NEW_CC_STATE;
>        }
>  
> +      const unsigned dc_flush = brw->last_pipeline == BRW_COMPUTE_PIPELINE ?
> +                                PIPE_CONTROL_DATA_CACHE_FLUSH :
> +                                PIPE_CONTROL_RENDER_TARGET_FLUSH |
> +                                PIPE_CONTROL_DEPTH_CACHE_FLUSH;
> +
> +      brw_emit_pipe_control_flush(brw, dc_flush | PIPE_CONTROL_CS_STALL);
> +
>     } else if (brw->gen >= 6) {
>        /* From "BXML <C2><BB> GT <C2><BB> MI <C2><BB> vol1a GPU Overview <C2>
> <BB> [Instruction]
>         * PIPELINE_SELECT [DevBWR+]":
> diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c b/src/mesa/drivers/dri/i965/intel_fbo.c
> index de0cd6a..3126be6 100644
> --- a/src/mesa/drivers/dri/i965/intel_fbo.c
> +++ b/src/mesa/drivers/dri/i965/intel_fbo.c
> @@ -1082,10 +1082,16 @@ brw_render_cache_set_check_flush(struct brw_context *brw, drm_intel_bo *bo)
>        return;
>  
>     if (brw->gen >= 6) {
> -      brw_emit_pipe_control_flush(brw,
> -                                  PIPE_CONTROL_DEPTH_CACHE_FLUSH |
> -                                  PIPE_CONTROL_RENDER_TARGET_FLUSH |
> -                                  PIPE_CONTROL_CS_STALL);
> +      if (brw->last_pipeline == BRW_COMPUTE_PIPELINE) {
> +         brw_emit_pipe_control_flush(brw,
> +                                     PIPE_CONTROL_DATA_CACHE_FLUSH |
> +                                     PIPE_CONTROL_CS_STALL);
> +      } else {
> +         brw_emit_pipe_control_flush(brw,
> +                                     PIPE_CONTROL_DEPTH_CACHE_FLUSH |
> +                                     PIPE_CONTROL_RENDER_TARGET_FLUSH |
> +                                     PIPE_CONTROL_CS_STALL);
> +      }

Flushing depth and rt in compute is of course wrong and this bit is genuine
fix. I'll write a patch for it. I'll just need to study a little what the
render cache means in compute context - initially I feel we shouldn't actually
regard that at all in compute.

>  
>        brw_emit_pipe_control_flush(brw,
>                                    PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
> 
> > 
> > > On Thu, Mar 16, 2017 at 1:24 AM, Topi Pohjolainen <
> > > topi.pohjolainen at gmail.com> wrote:
> > >
> > >> just as earlier gens do.
> > >>
> > >> CC:  Jason Ekstrand <jason at jlekstrand.net>
> > >> Cc: "17.0 13.0" <mesa-stable at lists.freedesktop.org>
> > >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96743
> > >>
> > >> Signed-off-by: Topi Pohjolainen <topi.pohjolainen at intel.com>
> > >> ---
> > >>  src/mesa/drivers/dri/i965/brw_misc_state.c | 3 ++-
> > >>  1 file changed, 2 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c
> > >> b/src/mesa/drivers/dri/i965/brw_misc_state.c
> > >> index 84f0c18..1cf6b04 100644
> > >> --- a/src/mesa/drivers/dri/i965/brw_misc_state.c
> > >> +++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
> > >> @@ -843,8 +843,9 @@ brw_emit_select_pipeline(struct brw_context *brw,
> > >> enum brw_pipeline pipeline)
> > >>
> > >>           brw->ctx.NewDriverState |= BRW_NEW_CC_STATE;
> > >>        }
> > >> +   }
> > >>
> > >> -   } else if (brw->gen >= 6) {
> > >> +   if (brw->gen >= 6) {
> > >>        /* From "BXML » GT » MI » vol1a GPU Overview » [Instruction]
> > >>         * PIPELINE_SELECT [DevBWR+]":
> > >>         *
> > >> --
> > >> 2.9.3
> > >>
> > >>
> > > _______________________________________________
> > > mesa-stable mailing list
> > > mesa-stable at lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-stable
> 
> 
> 


More information about the mesa-stable mailing list