[Mesa-dev] [PATCH v2 08/42] i965/hsw: Enable L3 atomics.

Samuel Iglesias Gonsálvez siglesias at igalia.com
Wed Nov 25 23:37:25 PST 2015



On 18/11/15 06:54, Jordan Justen wrote:
> From: Francisco Jerez <currojerez at riseup.net>
> 
> Improves performance of the arb_shader_image_load_store-atomicity
> piglit test by over 25x (which isn't a real benchmark, it's just heavy
> on atomics; the improvement in a microbenchmark I wrote a while ago
> seemed to be even greater).  The drawback is that one needs to be
> extra-careful not to hang the GPU (in fact the whole system).  A DC
> partition must have been allocated on L3, the "convert L3 cycle for DC
> to UC" bit must not be set, the MOCS L3 cacheability bit must be set
> for all surfaces accessed using DC atomics, and the SCRATCH1 and
> ROW_CHICKEN3 bits must be kept in sync.
> 
> A fairly recent kernel is required for the command parser to allow
> writes to these registers.
> ---
>  src/mesa/drivers/dri/i965/gen7_l3_state.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> index 48bca29..c863b7f 100644
> --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> @@ -254,5 +254,19 @@ setup_l3_config(struct brw_context *brw, const struct brw_l3_config *cfg)
>                  SET_FIELD(cfg->n[L3P_T], GEN7_L3CNTLREG3_T_ALLOC));
>  
>        ADVANCE_BATCH();
> +
> +      if (brw->is_haswell && brw->intelScreen->cmd_parser_version >= 4) {
> +         /* Enable L3 atomics on HSW if we have a DC partition, otherwise keep
> +          * them disabled to avoid crashing the system hard.
> +          */
> +         BEGIN_BATCH(5);
> +         OUT_BATCH(MI_LOAD_REGISTER_IMM | (5 - 2));
> +         OUT_BATCH(HSW_SCRATCH1);
> +         OUT_BATCH(has_dc ? 0 : HSW_SCRATCH1_L3_ATOMIC_DISABLE);
> +         OUT_BATCH(HSW_ROW_CHICKEN3);
> +         OUT_BATCH(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16 |
> +                   (has_dc ? 0 : HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE));


I have not found any reference to ROW_CHICKEN3, or to a register at
address offset 0xe49c, in the HSW PRMs, so these could be stupid questions:

Why do you need to set the L3 atomic disable flag in two different places
in the ROW_CHICKEN3 register? Also, why is the first flag set
unconditionally while the second one is set only if we don't have a DC
partition? Is this what you want?

Also, if "HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16" is really needed,
it could be defined as a constant in the first patch of the series.
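
One possible explanation, which I have not been able to confirm from the
PRM: if ROW_CHICKEN3 follows the masked-register convention (bits 31:16
act as a write-enable mask for bits 15:0), then the unconditional
"<< 16" part only selects the bit to update and the conditional part is
the value written. A rough sketch of a helper under that assumption
follows. The names EXAMPLE_L3_ATOMIC_DISABLE, reg_mask(), masked_bit()
and apply_masked_write() are hypothetical, and the bit position is a
placeholder, not a value taken from the documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit position for illustration only (hypothetical). */
#define EXAMPLE_L3_ATOMIC_DISABLE (1u << 6)

/* Assumed masked-register convention: bits 31:16 select which of
 * bits 15:0 are actually written by the register write.
 */
static inline uint32_t
reg_mask(uint32_t bits)
{
   assert((bits & ~0xffffu) == 0);   /* only the low 16 bits can be masked */
   return bits << 16;
}

/* Build the dword that sets or clears a single bit, leaving every
 * other bit of the register untouched.
 */
static inline uint32_t
masked_bit(uint32_t bit, bool set)
{
   return reg_mask(bit) | (set ? bit : 0);
}

/* Software model of how such a write would update the old value. */
static inline uint32_t
apply_masked_write(uint32_t old, uint32_t dword)
{
   const uint32_t mask = dword >> 16;
   return (old & ~mask & 0xffffu) | (dword & mask);
}
```

With something like this, the last dword of the batch could be written as
masked_bit(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE, !has_dc), which would make
the intent easier to read, assuming the register really is masked.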

Sam

> +         ADVANCE_BATCH();
> +      }
>     }
>  }
> 



