[Mesa-dev] [PATCH] radeonsi: pass alpha_ref value to PS in the user sgpr

Fri Oct 11 01:15:15 CEST 2013

On Thu, Oct 10, 2013 at 6:38 PM, Vadim Girlin <vadimgirlin at gmail.com> wrote:
> On 10/10/2013 08:10 PM, Christian König wrote:
>>
>> Am 10.10.2013 18:02, schrieb Vadim Girlin:
>>>
>>> On 10/10/2013 02:11 PM, Michel Dänzer wrote:
>>>>
>>>> On Don, 2013-10-10 at 12:49 +0400, Vadim Girlin wrote:
>>>>>
>>>>> Currently it's hardcoded in the shader, so every change requires
>>>>> compilation of the shader variant, killing the performance
>>>>> in Serious Sam 3 and probably other apps.
>>>>>
>>>>> This patch passes alpha_ref in the user sgpr and removes it from
>>>>> the shader key.
>>>>>
>>>>> Signed-off-by: Vadim Girlin <vadimgirlin at gmail.com>
>>>>
>>>>
>>>> Reviewed-by: Michel Dänzer <michel.daenzer at amd.com>
>>>>
>>>> I presume this causes no regressions with piglit quick.tests.
>>>
>>>
>>> Yes, there are no regressions with piglit. Thanks for reviewing.
>>>
>>> By the way, I'm also not sure if this is the right way of doing it,
>>> especially if we'll need to pass more parameters for any new features.
>>>
>>> Possibly some other ways could be more preferable, e.g. to put it with
>>> any other data that we may need in the future into internal const
>>> buffer (like we do in r600g for clip planes etc), or maybe there are
>>> other ways on SI that I'm not aware of yet?
>>
>>
>> That strongly depends on how often we use a parameter. The docs speak of
>> a penalty associated with loading each SGPR so we should try to use as
>> less as possible, but loading something from constant space is also
>> costly without proper support for the constant IB.
>
>
> By the way, AFAICS some SGPR inputs are often not used at all in the
> shaders, I guess we might want to use per-shader mapping of parameters to
> SGPRs so that we'll only load actually used values for each shader. Compiler
> will need to pack required input SGPRs to lowest indices and provide the
> parameter->SGPR map to the driver. OTOH this would slightly increase the
> amount of driver's work, so I'm not really sure yet if it's worth it, looks
> like we already have a pretty significant overhead.

I think this an interesting optimization. Most of our overhead is
elsewhere anyway. The only overhead added here would only be in the
bind_xx_shader function, because all the shader user-data registers
would have to be updated. Note that some of the input SGPRs are
already declared conditionally, e.g. the SGPRs containing streamout
state.

Marek