[Mesa-dev] [PATCH 12/17] mesa/st: make sure the const buffer is large enough for packed uniforms

Mon Jun 26 10:27:27 UTC 2017

On 26/06/17 20:06, Nicolai Hähnle wrote:

> On 26.06.2017 11:56, Timothy Arceri wrote:
>> On 26/06/17 19:27, Nicolai Hähnle wrote:
>>
>>> On 25.06.2017 03:31, Timothy Arceri wrote:
>>>> ---
>>>>   src/mesa/state_tracker/st_atom_constbuf.c | 8 +++++++-
>>>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/src/mesa/state_tracker/st_atom_constbuf.c 
>>>> b/src/mesa/state_tracker/st_atom_constbuf.c
>>>> index 7f17546..eb8f6b3 100644
>>>> --- a/src/mesa/state_tracker/st_atom_constbuf.c
>>>> +++ b/src/mesa/state_tracker/st_atom_constbuf.c
>>>> @@ -83,21 +83,27 @@ void st_upload_constants(struct st_context *st, 
>>>> struct gl_program *prog)
>>>>        /* Make all bindless samplers/images bound texture/image 
>>>> units resident in
>>>>       * the context.
>>>>       */
>>>>      st_make_bound_samplers_resident(st, prog);
>>>>      st_make_bound_images_resident(st, prog);
>>>>        /* update constants */
>>>>      if (params && params->NumParameters) {
>>>>         struct pipe_constant_buffer cb;
>>>> -      const uint paramBytes = params->NumParameters * 
>>>> sizeof(GLfloat) * 4;
>>>> +
>>>> +      /* We need to align to 4 as the driver will expect to load 
>>>> four values
>>>> +       * regardless of how many values the last param has. It's 
>>>> safe to do the
>>>> +       * align as the param list always allocates extra memory.
>>>> +       */
>>>> +      const uint paramBytes =
>>>> +         align((params->NumParameterValues * sizeof(GLfloat)) + 1, 
>>>> 4);
>>>
>>> This looks broken.
>>>
>>> First of all, I would say that it shouldn't actually be required. A 
>>> driver that supports packed uniforms should only load as much as the 
>>> shader actually requests.
>>
>> I recall you telling me on IRC that radeonsi needs to load vec4 
>> chunks for efficiency. Maybe I misinterpreted what you were saying?
>
> Right, we should try to merge loads in the backend, because loading a 
> vec4 (or even up to 16 consecutive constant!) in a single instruction 
> is faster than 4 or 16 loads of a single constant, assuming register 
> pressure isn't an issue. FWIW, we're not actually doing that right now.
>
> However, there's no reason for a vec4 load when you only need one or 
> two constants, as would be the case when loading from the end of the 
> constant buffer. So there's no reason to pad the buffer.

Ok, thanks for the clarification. I think I can see how this would work 
currently, you would need a swizzle xxxx to handle a single component 
passed the last block of 4.

>
> Cheers,
> Nicolai
>
>
>>>
>>> Second, even if it were needed, the code doesn't do what it says.
>>
>> :P I made some *simplifications* here at some point, not sure what I 
>> was thinking at the time.
>>
>>>
>>> Cheers,
>>> Nicolai
>>>
>>>
>>>>           /* Update the constants which come from fixed-function 
>>>> state, such as
>>>>          * transformation matrices, fog factors, etc.  The rest of 
>>>> the values in
>>>>          * the parameters list are explicitly set by the user with 
>>>> glUniform,
>>>>          * glProgramParameter(), etc.
>>>>          */
>>>>         if (params->StateFlags)
>>>>            _mesa_load_state_parameters(st->ctx, params);
>>>>           _mesa_shader_write_subroutine_indices(st->ctx, stage);
>>>>
>>>
>>>
>>
>
>