[Mesa-dev] [PATCH 1/5] gallium: add SQRT shader opcode

Roland Scheidegger sroland at vmware.com
Fri Feb 1 13:38:42 PST 2013


On 01.02.2013 19:44, Christoph Bumiller wrote:
> On 01.02.2013 19:29, Brian Paul wrote:
>> The glsl-to-tgsi translator will emit SQRT to implement GLSL's sqrt()
>> and distance() functions if the PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED
>> query says it's supported by the driver.
>>
>> Otherwise, sqrt(x) is implemented with x*rsq(x).  The problem with
>> this is sqrt(0) must be handled specially because rsq(0) might be
>> Inf/NaN/undefined (and then 0*rsq(0) is Inf/NaN/undefined).  In the
> That's why we do rcp(rsq(x)), that works correctly.
Yeah, though some drivers don't have a good rcp implementation (llvmpipe:
there's a fast SSE2 rcp instruction, but its precision isn't sufficient.
You could add a Newton-Raphson step, but this gets Infs and NaNs wrong
too, hence you need a select workaround. So you use division, but that
is slow...).

> I'm not sure we really need a cap for this though ... except to avoid
> modifying drivers ;)
> 
> I'll advertise the cap anyway, I prefer to be able to handle it internally.
> But I like this change, lowering SQRT (or not) is device specific and
> shouldn't be done unconditionally just because the API can't represent it.
Agreed.


> 
>> glsl-to-tgsi code we use an extra CMP to check if x is zero and then
>> replace the result of x*rsq(x) with zero.
>>
>> In the end, this makes sqrt() generate much more reasonable code for
>> drivers that can do square roots.
>>
>> Note that many of piglit's generated shader tests use the GLSL
>> distance() function.
>> ---
>>  src/gallium/docs/source/tgsi.rst           |    9 +++++++++
>>  src/gallium/include/pipe/p_defines.h       |    3 ++-
>>  src/gallium/include/pipe/p_shader_tokens.h |    2 +-
>>  3 files changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
>> index 548a9a3..5f03f32 100644
>> --- a/src/gallium/docs/source/tgsi.rst
>> +++ b/src/gallium/docs/source/tgsi.rst
>> @@ -89,6 +89,15 @@ This instruction replicates its result.
>>    dst = \frac{1}{\sqrt{|src.x|}}
>>  
>>  
>> +.. opcode:: SQRT - Square Root
>> +
>> +This instruction replicates its result.
>> +
>> +.. math::
>> +
>> +  dst = {\sqrt{src.x}}
>> +
>> +
>>  .. opcode:: EXP - Approximate Exponential Base 2
>>  
>>  .. math::
>> diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
>> index d0db5e4..fdf6e7f 100644
>> --- a/src/gallium/include/pipe/p_defines.h
>> +++ b/src/gallium/include/pipe/p_defines.h
>> @@ -542,7 +542,8 @@ enum pipe_shader_cap
>>     PIPE_SHADER_CAP_SUBROUTINES = 16, /* BGNSUB, ENDSUB, CAL, RET */
>>     PIPE_SHADER_CAP_INTEGERS = 17,
>>     PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS = 18,
>> -   PIPE_SHADER_CAP_PREFERRED_IR = 19
>> +   PIPE_SHADER_CAP_PREFERRED_IR = 19,
>> +   PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED = 20
>>  };
>>  
>>  /**
>> diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
>> index 3fb12fb..a9fb6aa 100644
>> --- a/src/gallium/include/pipe/p_shader_tokens.h
>> +++ b/src/gallium/include/pipe/p_shader_tokens.h
>> @@ -275,7 +275,7 @@ struct tgsi_property_data {
>>  #define TGSI_OPCODE_SUB                 17
>>  #define TGSI_OPCODE_LRP                 18
>>  #define TGSI_OPCODE_CND                 19
>> -                                /* gap */
>> +#define TGSI_OPCODE_SQRT                20
>>  #define TGSI_OPCODE_DP2A                21
>>                                  /* gap */
>>  #define TGSI_OPCODE_FRC                 24