[Mesa-dev] [PATCH v2] i965/gen8/cs: Gen8 requires 64 byte alignment for push constant data

Iago Toral itoral at igalia.com
Thu Dec 17 00:23:34 PST 2015


On Wed, 2015-12-16 at 14:48 -0800, Jordan Justen wrote:
> On 2015-12-16 11:39:00, Kenneth Graunke wrote:
> > On Wednesday, December 16, 2015 10:02:16 AM Iago Toral Quiroga wrote:
> > > The BDW PRM Vol2a: Command Reference: Instructions, section MEDIA_CURBE_LOAD,
> > > says that 'CURBE Total Data Length' and 'CURBE Data Start Address' are
> > > 64-byte aligned. This is different from previous gens, that were 32-byte
> > > aligned.
> > > 
> > > v2 (Jordan):
> > >   - CURBE Data Start Address is also 64-byte aligned.
> > >   - The call to brw_state_batch should also use 64-byte alignment.
> > >   - Improve PRM reference.
> > > 
> > > Fixes the following SSBO CTS tests on BDW:
> > > ES31-CTS.shader_storage_buffer_object.basic-atomic-case1-cs
> > > ES31-CTS.shader_storage_buffer_object.basic-operations-case1-cs
> > > ES31-CTS.shader_storage_buffer_object.basic-operations-case2-cs
> > > ES31-CTS.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs
> > > ES31-CTS.shader_storage_buffer_object.advanced-write-fragment-cs
> > > ES31-CTS.shader_storage_buffer_object.advanced-indirectAddressing-case2-cs
> > > ES31-CTS.shader_storage_buffer_object.advanced-matrix-cs
> > > 
> > > And many other CS CTS tests as reported by Marta Lofstedt.
> > > ---
> > >  src/mesa/drivers/dri/i965/gen7_cs_state.c | 12 ++++++++----
> > >  1 file changed, 8 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> > > index 1fde69c..df0f301 100644
> > > --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
> > > +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
> > > @@ -68,7 +68,7 @@ brw_upload_cs_state(struct brw_context *brw)
> > >  
> > >     uint32_t *bind = (uint32_t*) brw_state_batch(brw, AUB_TRACE_BINDING_TABLE,
> > >                                              prog_data->binding_table.size_bytes,
> > > -                                            32, &stage_state->bind_bo_offset);
> > > +                                            64, &stage_state->bind_bo_offset);
> > 
> > I don't understand this hunk - binding tables don't have anything to do
> > with push constants.  These are for pull constants and UBOs.  At least
> > in the 3D pipeline, we only align these to 32B, not 64.
> 
> Yeah. I think he wants to update the call you pointed out below in
> brw_upload_cs_push_constants.
> 
> Also, how about consistently applying the alignment change? Either,
> just bump the base and size alignment to 64, or also check the gen to
> align the base to 32 on gen7.
> 
> How about the attached patch?

Yeah, that looks simpler. The patch is:

Tested-by: Iago Toral Quiroga <itoral at igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

Thanks Jordan!

> -Jordan
> 
> > >     unsigned local_id_dwords = 0;
> > >  
> > > @@ -77,7 +77,8 @@ brw_upload_cs_state(struct brw_context *brw)
> > >  
> > >     unsigned push_constant_data_size =
> > >        (prog_data->nr_params + local_id_dwords) * sizeof(gl_constant_value);
> > > -   unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32);
> > > +   unsigned reg_aligned_constant_size =
> > > +      ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
> > >     unsigned push_constant_regs = reg_aligned_constant_size / 32;
> > >     unsigned threads = get_cs_thread_count(cs_prog_data);
> > >  
> > > @@ -138,11 +139,13 @@ brw_upload_cs_state(struct brw_context *brw)
> > >     ADVANCE_BATCH();
> > >  
> > >     if (reg_aligned_constant_size > 0) {
> > > +      const unsigned aligned_push_const_offset =
> > > +         ALIGN(stage_state->push_const_offset, brw->gen < 8 ? 32 : 64);
> > 
> > This is wrong.  What you want is to change:
> > 
> >       param = (gl_constant_value*)
> >          brw_state_batch(brw, type,
> >                          reg_aligned_constant_size * threads,
> >                          32, &stage_state->push_const_offset);
> > 
> > to use an alignment of 64 instead of 32 on Gen8+.  That way, it'll
> > actually upload the data to a portion of the buffer that starts on
> > a 64B aligned boundary.
> > 
> > As is, you're uploading the data to a 32B aligned section and then
> > just fudging the pointer to be 64B aligned, possibly skipping over
> > the first 32B.  Probably not what you wanted :)
> > 
> > Maybe you accidentally changed the wrong brw_state_batch call?
> > 
> > >        BEGIN_BATCH(4);
> > >        OUT_BATCH(MEDIA_CURBE_LOAD << 16 | (4 - 2));
> > >        OUT_BATCH(0);
> > >        OUT_BATCH(reg_aligned_constant_size * threads);
> > > -      OUT_BATCH(stage_state->push_const_offset);
> > > +      OUT_BATCH(aligned_push_const_offset);
> > >        ADVANCE_BATCH();
> > >     }
> > >  
> > > @@ -241,7 +244,8 @@ brw_upload_cs_push_constants(struct brw_context *brw,
> > >  
> > >        const unsigned push_constant_data_size =
> > >           (local_id_dwords + prog_data->nr_params) * sizeof(gl_constant_value);
> > > -      const unsigned reg_aligned_constant_size = ALIGN(push_constant_data_size, 32);
> > > +      const unsigned reg_aligned_constant_size =
> > > +         ALIGN(push_constant_data_size, brw->gen < 8 ? 32 : 64);
> > >        const unsigned param_aligned_count =
> > >           reg_aligned_constant_size / sizeof(*param);
> > >  
> > > 
> > 
> > 
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev




More information about the mesa-dev mailing list