[Mesa-dev] [PATCH 12/17] mesa/st: make sure the const buffer is large enough for packed uniforms

Mon Jun 26 10:06:58 UTC 2017

On 26.06.2017 11:56, Timothy Arceri wrote:
> On 26/06/17 19:27, Nicolai Hähnle wrote:
> 
>> On 25.06.2017 03:31, Timothy Arceri wrote:
>>> ---
>>>   src/mesa/state_tracker/st_atom_constbuf.c | 8 +++++++-
>>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/mesa/state_tracker/st_atom_constbuf.c 
>>> b/src/mesa/state_tracker/st_atom_constbuf.c
>>> index 7f17546..eb8f6b3 100644
>>> --- a/src/mesa/state_tracker/st_atom_constbuf.c
>>> +++ b/src/mesa/state_tracker/st_atom_constbuf.c
>>> @@ -83,21 +83,27 @@ void st_upload_constants(struct st_context *st, 
>>> struct gl_program *prog)
>>>        /* Make all bindless samplers/images bound texture/image units 
>>> resident in
>>>       * the context.
>>>       */
>>>      st_make_bound_samplers_resident(st, prog);
>>>      st_make_bound_images_resident(st, prog);
>>>        /* update constants */
>>>      if (params && params->NumParameters) {
>>>         struct pipe_constant_buffer cb;
>>> -      const uint paramBytes = params->NumParameters * 
>>> sizeof(GLfloat) * 4;
>>> +
>>> +      /* We need to align to 4 as the driver will expect to load 
>>> four values
>>> +       * regardless of how many values the last param has. It's safe 
>>> to do the
>>> +       * align as the param list always allocates extra memory.
>>> +       */
>>> +      const uint paramBytes =
>>> +         align((params->NumParameterValues * sizeof(GLfloat)) + 1, 4);
>>
>> This looks broken.
>>
>> First of all, I would say that it shouldn't actually be required. A 
>> driver that supports packed uniforms should only load as much as the 
>> shader actually requests.
> 
> I recall you telling me on IRC that radeonsi needs to load vec4 chunks 
> for efficiency. Maybe I misinterpreted what you were saying?

Right, we should try to merge loads in the backend, because loading a 
vec4 (or even up to 16 consecutive constant!) in a single instruction is 
faster than 4 or 16 loads of a single constant, assuming register 
pressure isn't an issue. FWIW, we're not actually doing that right now.

However, there's no reason for a vec4 load when you only need one or two 
constants, as would be the case when loading from the end of the 
constant buffer. So there's no reason to pad the buffer.

Cheers,
Nicolai

>>
>> Second, even if it were needed, the code doesn't do what it says.
> 
> :P I made some *simplifications* here at some point, not sure what I was 
> thinking at the time.
> 
>>
>> Cheers,
>> Nicolai
>>
>>
>>>           /* Update the constants which come from fixed-function 
>>> state, such as
>>>          * transformation matrices, fog factors, etc.  The rest of 
>>> the values in
>>>          * the parameters list are explicitly set by the user with 
>>> glUniform,
>>>          * glProgramParameter(), etc.
>>>          */
>>>         if (params->StateFlags)
>>>            _mesa_load_state_parameters(st->ctx, params);
>>>           _mesa_shader_write_subroutine_indices(st->ctx, stage);
>>>
>>
>>
> 

-- 
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.