[Mesa-dev] [PATCH v2] i965/gen8/cs: Gen8 requires 64 byte alignment for push constant data

Jordan Justen jordan.l.justen at intel.com
Wed Dec 16 14:48:30 PST 2015


On 2015-12-16 11:39:00, Kenneth Graunke wrote:
> On Wednesday, December 16, 2015 10:02:16 AM Iago Toral Quiroga wrote:
> > The BDW PRM Vol2a: Command Reference: Instructions, section MEDIA_CURBE_LOAD,
> > says that 'CURBE Total Data Length' and 'CURBE Data Start Address' are
> > 64-byte aligned. This is different from previous gens, that were 32-byte
> > aligned.
> > 
> > v2 (Jordan):
> >   - CURBE Data Start Address is also 64-byte aligned.
> >   - The call to brw_state_batch should also use 64-byte alignment.
> >   - Improve PRM reference.
> > 
> > Fixes the following SSBO CTS tests on BDW:
> > ES31-CTS.shader_storage_buffer_object.basic-atomic-case1-cs
> > ES31-CTS.shader_storage_buffer_object.basic-operations-case1-cs
> > ES31-CTS.shader_storage_buffer_object.basic-operations-case2-cs
> > ES31-CTS.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs
> > ES31-CTS.shader_storage_buffer_object.advanced-write-fragment-cs
> > ES31-CTS.shader_storage_buffer_object.advanced-indirectAddressing-case2-cs
> > ES31-CTS.shader_storage_buffer_object.advanced-matrix-cs
> > 
> > And many other CS CTS tests as reported by Marta Lofstedt.
> > ---
> >  src/mesa/drivers/dri/i965/gen7_cs_state.c | 12 ++++++++----
> >  1 file changed, 8 insertions(+), 4 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> > index 1fde69c..df0f301 100644
> > --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> > @@ -68,7 +68,7 @@ brw_upload_cs_state(struct brw_context *brw)
> >  
> >     uint32_t *bind = (uint32_t*) brw_state_batch(brw, AUB_TRACE_BINDING_TABLE,
> >                                              prog_data->binding_table.size_bytes,
> > -                                            32, &stage_state->bind_bo_offset);
> > +                                            64, &stage_state->bind_bo_offset);
> 
> I don't understand this hunk - binding tables don't have anything to do
> with push constants.  These are for pull constants and UBOs.  At least
> in the 3D pipeline, we only align these to 32B, not 64.

Yeah. I think he wants to update the call you pointed out below in
brw_upload_cs_push_constants.

Also, how about consistently applying the alignment change? Either,
just bump the base and size alignment to 64, or also check the gen to
align the base to 32 on gen7.

How about the attached patch?

-Jordan

> >     unsigned local_id_dwords = 0;
> >  
> > @@ -77,7 +77,8 @@ brw_upload_cs_state(struct brw_context *brw)
> >  
> >     unsigned push_constant_data_size =
> >        (prog_data->nr_params + local_id_dwords) * sizeof(gl_constant_value);
> > -   unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32);
> > +   unsigned reg_aligned_constant_size =
> > +      ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
> >     unsigned push_constant_regs = reg_aligned_constant_size / 32;
> >     unsigned threads = get_cs_thread_count(cs_prog_data);
> >  
> > @@ -138,11 +139,13 @@ brw_upload_cs_state(struct brw_context *brw)
> >     ADVANCE_BATCH();
> >  
> >     if (reg_aligned_constant_size > 0) {
> > +      const unsigned aligned_push_const_offset =
> > +         ALIGN(stage_state->push_const_offset, brw->gen < 8 ? 32 : 64);
> 
> This is wrong.  What you want is to change:
> 
>       param = (gl_constant_value*)
>          brw_state_batch(brw, type,
>                          reg_aligned_constant_size * threads,
>                          32, &stage_state->push_const_offset);
> 
> to use an alignment of 64 instead of 32 on Gen8+.  That way, it'll
> actually upload the data to a portion of the buffer that starts on
> a 64B aligned boundary.
> 
> As is, you're uploading the data to a 32B aligned section and then
> just fudging the pointer to be 64B aligned, possibly skipping over
> the first 32B.  Probably not what you wanted :)
> 
> Maybe you accidentally changed the wrong brw_state_batch call?
> 
> >        BEGIN_BATCH(4);
> >        OUT_BATCH(MEDIA_CURBE_LOAD << 16 | (4 - 2));
> >        OUT_BATCH(0);
> >        OUT_BATCH(reg_aligned_constant_size * threads);
> > -      OUT_BATCH(stage_state->push_const_offset);
> > +      OUT_BATCH(aligned_push_const_offset);
> >        ADVANCE_BATCH();
> >     }
> >  
> > @@ -241,7 +244,8 @@ brw_upload_cs_push_constants(struct brw_context *brw,
> >  
> >        const unsigned push_constant_data_size =
> >           (local_id_dwords + prog_data->nr_params) * sizeof(gl_constant_value);
> > -      const unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32);
> > +      const unsigned reg_aligned_constant_size =
> > +         ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
> >        const unsigned param_aligned_count =
> >           reg_aligned_constant_size / sizeof(*param);
> >  
> > 
> 
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: align-64.patch
Type: text/x-diff
Size: 1091 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20151216/37273eef/attachment.patch>


More information about the mesa-dev mailing list