[Mesa-dev] [PATCH v2] i965/gen8/cs: Gen8 requires 64 byte alignment for push constant data

Kenneth Graunke kenneth at whitecape.org
Wed Dec 16 11:39:00 PST 2015


On Wednesday, December 16, 2015 10:02:16 AM Iago Toral Quiroga wrote:
> The BDW PRM Vol2a: Command Reference: Instructions, section MEDIA_CURBE_LOAD,
> says that 'CURBE Total Data Length' and 'CURBE Data Start Address' are
> 64-byte aligned. This is different from previous gens, that were 32-byte
> aligned.
> 
> v2 (Jordan):
>   - CURBE Data Start Address is also 64-byte aligned.
>   - The call to brw_state_batch should also use 64-byte alignment.
>   - Improve PRM reference.
> 
> Fixes the following SSBO CTS tests on BDW:
> ES31-CTS.shader_storage_buffer_object.basic-atomic-case1-cs
> ES31-CTS.shader_storage_buffer_object.basic-operations-case1-cs
> ES31-CTS.shader_storage_buffer_object.basic-operations-case2-cs
> ES31-CTS.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs
> ES31-CTS.shader_storage_buffer_object.advanced-write-fragment-cs
> ES31-CTS.shader_storage_buffer_object.advanced-indirectAddressing-case2-cs
> ES31-CTS.shader_storage_buffer_object.advanced-matrix-cs
> 
> And many other CS CTS tests as reported by Marta Lofstedt.
> ---
>  src/mesa/drivers/dri/i965/gen7_cs_state.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> index 1fde69c..df0f301 100644
> --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> @@ -68,7 +68,7 @@ brw_upload_cs_state(struct brw_context *brw)
>  
>     uint32_t *bind = (uint32_t*) brw_state_batch(brw, AUB_TRACE_BINDING_TABLE,
>                                              prog_data->binding_table.size_bytes,
> -                                            32, &stage_state->bind_bo_offset);
> +                                            64, &stage_state->bind_bo_offset);

I don't understand this hunk - binding tables don't have anything to do
with push constants.  These are for pull constants and UBOs.  At least
in the 3D pipeline, we only align these to 32B, not 64.

>     unsigned local_id_dwords = 0;
>  
> @@ -77,7 +77,8 @@ brw_upload_cs_state(struct brw_context *brw)
>  
>     unsigned push_constant_data_size =
>        (prog_data->nr_params + local_id_dwords) * sizeof(gl_constant_value);
> -   unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32);
> +   unsigned reg_aligned_constant_size =
> +      ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
>     unsigned push_constant_regs = reg_aligned_constant_size / 32;
>     unsigned threads = get_cs_thread_count(cs_prog_data);
>  
> @@ -138,11 +139,13 @@ brw_upload_cs_state(struct brw_context *brw)
>     ADVANCE_BATCH();
>  
>     if (reg_aligned_constant_size > 0) {
> +      const unsigned aligned_push_const_offset =
> +         ALIGN(stage_state->push_const_offset, brw->gen < 8 ? 32 : 64);

This is wrong.  What you want is to change:

      param = (gl_constant_value*)
         brw_state_batch(brw, type,
                         reg_aligned_constant_size * threads,
                         32, &stage_state->push_const_offset);

to use an alignment of 64 instead of 32 on Gen8+.  That way, it'll
actually upload the data to a portion of the buffer that starts on
a 64B aligned boundary.

As is, you're uploading the data to a 32B aligned section and then
just fudging the pointer to be 64B aligned, possibly skipping over
the first 32B.  Probably not what you wanted :)

Maybe you accidentally changed the wrong brw_state_batch call?

>        BEGIN_BATCH(4);
>        OUT_BATCH(MEDIA_CURBE_LOAD << 16 | (4 - 2));
>        OUT_BATCH(0);
>        OUT_BATCH(reg_aligned_constant_size * threads);
> -      OUT_BATCH(stage_state->push_const_offset);
> +      OUT_BATCH(aligned_push_const_offset);
>        ADVANCE_BATCH();
>     }
>  
> @@ -241,7 +244,8 @@ brw_upload_cs_push_constants(struct brw_context *brw,
>  
>        const unsigned push_constant_data_size =
>           (local_id_dwords + prog_data->nr_params) * sizeof(gl_constant_value);
> -      const unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32);
> +      const unsigned reg_aligned_constant_size =
> +         ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
>        const unsigned param_aligned_count =
>           reg_aligned_constant_size / sizeof(*param);
>  
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20151216/96d401ba/attachment.sig>


More information about the mesa-dev mailing list