[Mesa-dev] [PATCH v2 08/42] i965/hsw: Enable L3 atomics.
Samuel Iglesias Gonsálvez
siglesias at igalia.com
Wed Nov 25 23:37:25 PST 2015
On 18/11/15 06:54, Jordan Justen wrote:
> From: Francisco Jerez <currojerez at riseup.net>
>
> Improves performance of the arb_shader_image_load_store-atomicity
> piglit test by over 25x (which isn't a real benchmark it's just heavy
> on atomics -- the improvement in a microbenchmark I wrote a while ago
> seemed to be even greater). The drawback is one needs to be
> extra-careful not to hang the GPU (in fact the whole system). A DC
> partition must have been allocated on L3, the "convert L3 cycle for DC
> to UC" bit may not be set, the MOCS L3 cacheability bit must be set
> for all surfaces accessed using DC atomics, and the SCRATCH1 and
> ROW_CHICKEN3 bits must be kept in sync.
>
> A fairly recent kernel is required for the command parser to allow
> writes to these registers.
> ---
> src/mesa/drivers/dri/i965/gen7_l3_state.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> index 48bca29..c863b7f 100644
> --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> @@ -254,5 +254,19 @@ setup_l3_config(struct brw_context *brw, const struct brw_l3_config *cfg)
> SET_FIELD(cfg->n[L3P_T], GEN7_L3CNTLREG3_T_ALLOC));
>
> ADVANCE_BATCH();
> +
> + if (brw->is_haswell && brw->intelScreen->cmd_parser_version >= 4) {
> + /* Enable L3 atomics on HSW if we have a DC partition, otherwise keep
> + * them disabled to avoid crashing the system hard.
> + */
> + BEGIN_BATCH(5);
> + OUT_BATCH(MI_LOAD_REGISTER_IMM | (5 - 2));
> + OUT_BATCH(HSW_SCRATCH1);
> + OUT_BATCH(has_dc ? 0 : HSW_SCRATCH1_L3_ATOMIC_DISABLE);
> + OUT_BATCH(HSW_ROW_CHICKEN3);
> + OUT_BATCH(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16 |
> + (has_dc ? 0 : HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE));
I have not found references to ROW_CHICKEN3 nor register with 0xe49c
address offset in HSW PRMs, so these could be stupid questions:
Why you need to set the L3 atomic disable flag in two different places
in ROW_CHICKEN3 register? Also, why the first flag is set
unconditionally while the second one only if we don't have a DC
partition? This is what you want?
Also, if the "HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16" is really
needed, it could be defined as a constant in the first patch of the series.
Sam
> + ADVANCE_BATCH();
> + }
> }
> }
>
More information about the mesa-dev
mailing list