[Mesa-dev] [PATCH v2] i965/gen8/cs: Gen8 requires 64 byte alignment for push constant data

Lofstedt, Marta marta.lofstedt at intel.com
Wed Dec 16 01:34:51 PST 2015


Reviewed-by: Marta Lofstedt <marta.lofstedt at intel.com>

> -----Original Message-----
> From: Iago Toral Quiroga [mailto:itoral at igalia.com]
> Sent: Wednesday, December 16, 2015 10:02 AM
> To: mesa-dev at lists.freedesktop.org
> Cc: Lofstedt, Marta; Palli, Tapani; Justen, Jordan L; Iago Toral Quiroga
> Subject: [PATCH v2] i965/gen8/cs: Gen8 requires 64 byte alignment for push
> constant data
> 
> The BDW PRM Vol2a: Command Reference: Instructions, section
> MEDIA_CURBE_LOAD, says that 'CURBE Total Data Length' and 'CURBE Data
> Start Address' are 64-byte aligned. This is different from previous gens, that
> were 32-byte aligned.
> 
> v2 (Jordan):
>   - CURBE Data Start Address is also 64-byte aligned.
>   - The call to brw_state_batch should also use 64-byte alignment.
>   - Improve PRM reference.
> 
> Fixes the following SSBO CTS tests on BDW:
> ES31-CTS.shader_storage_buffer_object.basic-atomic-case1-cs
> ES31-CTS.shader_storage_buffer_object.basic-operations-case1-cs
> ES31-CTS.shader_storage_buffer_object.basic-operations-case2-cs
> ES31-CTS.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-
> cs
> ES31-CTS.shader_storage_buffer_object.advanced-write-fragment-cs
> ES31-CTS.shader_storage_buffer_object.advanced-indirectAddressing-
> case2-cs
> ES31-CTS.shader_storage_buffer_object.advanced-matrix-cs
> 
> And many other CS CTS tests as reported by Marta Lofstedt.
> ---
>  src/mesa/drivers/dri/i965/gen7_cs_state.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> index 1fde69c..df0f301 100644
> --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> @@ -68,7 +68,7 @@ brw_upload_cs_state(struct brw_context *brw)
> 
>     uint32_t *bind = (uint32_t*) brw_state_batch(brw,
> AUB_TRACE_BINDING_TABLE,
>                                              prog_data->binding_table.size_bytes,
> -                                            32, &stage_state->bind_bo_offset);
> +                                            64,
> + &stage_state->bind_bo_offset);
> 
>     unsigned local_id_dwords = 0;
> 
> @@ -77,7 +77,8 @@ brw_upload_cs_state(struct brw_context *brw)
> 
>     unsigned push_constant_data_size =
>        (prog_data->nr_params + local_id_dwords) * sizeof(gl_constant_value);
> -   unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size,
> 32);
> +   unsigned reg_aligned_constant_size =
> +      ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
>     unsigned push_constant_regs = reg_aligned_constant_size / 32;
>     unsigned threads = get_cs_thread_count(cs_prog_data);
> 
> @@ -138,11 +139,13 @@ brw_upload_cs_state(struct brw_context *brw)
>     ADVANCE_BATCH();
> 
>     if (reg_aligned_constant_size > 0) {
> +      const unsigned aligned_push_const_offset =
> +         ALIGN(stage_state->push_const_offset, brw->gen < 8 ? 32 : 64);
>        BEGIN_BATCH(4);
>        OUT_BATCH(MEDIA_CURBE_LOAD << 16 | (4 - 2));
>        OUT_BATCH(0);
>        OUT_BATCH(reg_aligned_constant_size * threads);
> -      OUT_BATCH(stage_state->push_const_offset);
> +      OUT_BATCH(aligned_push_const_offset);
>        ADVANCE_BATCH();
>     }
> 
> @@ -241,7 +244,8 @@ brw_upload_cs_push_constants(struct brw_context
> *brw,
> 
>        const unsigned push_constant_data_size =
>           (local_id_dwords + prog_data->nr_params) *
> sizeof(gl_constant_value);
> -      const unsigned reg_aligned_constant_size =
> ALIGN(push_constant_data_size, 32);
> +      const unsigned reg_aligned_constant_size =
> +         ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
>        const unsigned param_aligned_count =
>           reg_aligned_constant_size / sizeof(*param);
> 
> --
> 1.9.1



More information about the mesa-dev mailing list