[Mesa-dev] [PATCH v2] i965/gen8/cs: Gen8 requires 64 byte alignment for push constant data
Lofstedt, Marta
marta.lofstedt at intel.com
Wed Dec 16 01:34:51 PST 2015
Reviewed-by: Marta Lofstedt <marta.lofstedt at intel.com>
> -----Original Message-----
> From: Iago Toral Quiroga [mailto:itoral at igalia.com]
> Sent: Wednesday, December 16, 2015 10:02 AM
> To: mesa-dev at lists.freedesktop.org
> Cc: Lofstedt, Marta; Palli, Tapani; Justen, Jordan L; Iago Toral Quiroga
> Subject: [PATCH v2] i965/gen8/cs: Gen8 requires 64 byte alignment for push
> constant data
>
> The BDW PRM Vol2a: Command Reference: Instructions, section
> MEDIA_CURBE_LOAD, says that 'CURBE Total Data Length' and 'CURBE Data
> Start Address' are 64-byte aligned. This differs from previous gens, which
> were 32-byte aligned.
>
> v2 (Jordan):
> - CURBE Data Start Address is also 64-byte aligned.
> - The call to brw_state_batch should also use 64-byte alignment.
> - Improve PRM reference.
>
> Fixes the following SSBO CTS tests on BDW:
> ES31-CTS.shader_storage_buffer_object.basic-atomic-case1-cs
> ES31-CTS.shader_storage_buffer_object.basic-operations-case1-cs
> ES31-CTS.shader_storage_buffer_object.basic-operations-case2-cs
> ES31-CTS.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs
> ES31-CTS.shader_storage_buffer_object.advanced-write-fragment-cs
> ES31-CTS.shader_storage_buffer_object.advanced-indirectAddressing-case2-cs
> ES31-CTS.shader_storage_buffer_object.advanced-matrix-cs
>
> And many other CS CTS tests as reported by Marta Lofstedt.
> ---
> src/mesa/drivers/dri/i965/gen7_cs_state.c | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> index 1fde69c..df0f301 100644
> --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> @@ -68,7 +68,7 @@ brw_upload_cs_state(struct brw_context *brw)
>
> uint32_t *bind = (uint32_t*) brw_state_batch(brw,
> AUB_TRACE_BINDING_TABLE,
> prog_data->binding_table.size_bytes,
> - 32, &stage_state->bind_bo_offset);
> + 64,
> + &stage_state->bind_bo_offset);
>
> unsigned local_id_dwords = 0;
>
> @@ -77,7 +77,8 @@ brw_upload_cs_state(struct brw_context *brw)
>
> unsigned push_constant_data_size =
> (prog_data->nr_params + local_id_dwords) * sizeof(gl_constant_value);
> - unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32);
> + unsigned reg_aligned_constant_size =
> + ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
> unsigned push_constant_regs = reg_aligned_constant_size / 32;
> unsigned threads = get_cs_thread_count(cs_prog_data);
>
> @@ -138,11 +139,13 @@ brw_upload_cs_state(struct brw_context *brw)
> ADVANCE_BATCH();
>
> if (reg_aligned_constant_size > 0) {
> + const unsigned aligned_push_const_offset =
> + ALIGN(stage_state->push_const_offset, brw->gen < 8 ? 32 : 64);
> BEGIN_BATCH(4);
> OUT_BATCH(MEDIA_CURBE_LOAD << 16 | (4 - 2));
> OUT_BATCH(0);
> OUT_BATCH(reg_aligned_constant_size * threads);
> - OUT_BATCH(stage_state->push_const_offset);
> + OUT_BATCH(aligned_push_const_offset);
> ADVANCE_BATCH();
> }
>
> @@ -241,7 +244,8 @@ brw_upload_cs_push_constants(struct brw_context *brw,
>
> const unsigned push_constant_data_size =
> (local_id_dwords + prog_data->nr_params) * sizeof(gl_constant_value);
> - const unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32);
> + const unsigned reg_aligned_constant_size =
> + ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
> const unsigned param_aligned_count =
> reg_aligned_constant_size / sizeof(*param);
>
> --
> 1.9.1
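
For anyone following the arithmetic in the patch, here is a minimal standalone sketch of the size rounding it performs. This is not the driver code: align_up, cs_push_const_alignment and cs_aligned_push_const_size are hypothetical helper names used only for illustration, and the sketch assumes each push constant parameter occupies one 4-byte dword.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment.
 * alignment must be a non-zero power of two. */
static inline uint32_t
align_up(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: CURBE alignment in bytes for a hardware
 * generation, mirroring the "brw->gen < 8 ? 32 : 64" test in the patch. */
static inline uint32_t
cs_push_const_alignment(int gen)
{
   return gen < 8 ? 32 : 64;
}

/* Hypothetical helper: per-thread push constant size in bytes after
 * rounding. Assumes one 4-byte dword per parameter. */
static inline uint32_t
cs_aligned_push_const_size(int gen, uint32_t nr_params,
                           uint32_t local_id_dwords)
{
   uint32_t size = (nr_params + local_id_dwords) * 4;
   return align_up(size, cs_push_const_alignment(gen));
}
```

With 5 dwords of push constant data (20 bytes), the sketch rounds to 32 bytes before Gen8 and to 64 bytes on Gen8 and later, which corresponds to one versus two 32-byte registers in the push_constant_regs computation.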