[Mesa-dev] [PATCH v2 08/42] i965/hsw: Enable L3 atomics.
Francisco Jerez
currojerez at riseup.net
Thu Nov 26 06:44:48 PST 2015
Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
> On 18/11/15 06:54, Jordan Justen wrote:
>> From: Francisco Jerez <currojerez at riseup.net>
>>
>> Improves performance of the arb_shader_image_load_store-atomicity
>> piglit test by over 25x (which isn't a real benchmark it's just heavy
>> on atomics -- the improvement in a microbenchmark I wrote a while ago
>> seemed to be even greater). The drawback is one needs to be
>> extra-careful not to hang the GPU (in fact the whole system). A DC
>> partition must have been allocated on L3, the "convert L3 cycle for DC
>> to UC" bit may not be set, the MOCS L3 cacheability bit must be set
>> for all surfaces accessed using DC atomics, and the SCRATCH1 and
>> ROW_CHICKEN3 bits must be kept in sync.
>>
>> A fairly recent kernel is required for the command parser to allow
>> writes to these registers.
>> ---
>> src/mesa/drivers/dri/i965/gen7_l3_state.c | 14 ++++++++++++++
>> 1 file changed, 14 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c b/src/mesa/drivers/dri/i965/gen7_l3_state.c
>> index 48bca29..c863b7f 100644
>> --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
>> @@ -254,5 +254,19 @@ setup_l3_config(struct brw_context *brw, const struct brw_l3_config *cfg)
>> SET_FIELD(cfg->n[L3P_T], GEN7_L3CNTLREG3_T_ALLOC));
>>
>> ADVANCE_BATCH();
>> +
>> + if (brw->is_haswell && brw->intelScreen->cmd_parser_version >= 4) {
>> + /* Enable L3 atomics on HSW if we have a DC partition, otherwise keep
>> + * them disabled to avoid crashing the system hard.
>> + */
>> + BEGIN_BATCH(5);
>> + OUT_BATCH(MI_LOAD_REGISTER_IMM | (5 - 2));
>> + OUT_BATCH(HSW_SCRATCH1);
>> + OUT_BATCH(has_dc ? 0 : HSW_SCRATCH1_L3_ATOMIC_DISABLE);
>> + OUT_BATCH(HSW_ROW_CHICKEN3);
>> + OUT_BATCH(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16 |
>> + (has_dc ? 0 : HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE));
>
>
> I have not found references to ROW_CHICKEN3 nor register with 0xe49c
> address offset in HSW PRMs, so these could be stupid questions:
>
Yeah, these chicken registers are not part of the public PRMs.
> Why you need to set the L3 atomic disable flag in two different places
> in ROW_CHICKEN3 register? Also, why the first flag is set
> unconditionally while the second one only if we don't have a DC
> partition? This is what you want?
It's a masked register as many other MMIO registers on Gen hardware: the
upper 16 bits control which of the lower 16 bits are modified in the
register.
>
>
> Also, if the "HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16" is really
> needed, it could be defined as a constant in the first patch of the series.
>
If you think that would be more readable I guess we could define a
REG_MASK() macro and use it wherever we write a masked register. See
attachment (applies on top of this patch).
> Sam
>
>> + ADVANCE_BATCH();
>> + }
>> }
>> }
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-i965-Define-and-use-REG_MASK-macro-to-make-masked-MM.patch
Type: text/x-diff
Size: 3104 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20151126/7bb4d037/attachment.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20151126/7bb4d037/attachment.sig>
More information about the mesa-dev
mailing list