[Mesa-dev] [PATCH 1/5] gallium: add SQRT shader opcode
Roland Scheidegger
sroland at vmware.com
Fri Feb 1 13:38:42 PST 2013
On 01.02.2013 19:44, Christoph Bumiller wrote:
> On 01.02.2013 19:29, Brian Paul wrote:
>> The glsl-to-tgsi translator will emit SQRT to implement GLSL's sqrt()
>> and distance() functions if the PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED
>> query says it's supported by the driver.
>>
>> Otherwise, sqrt(x) is implemented with x*rsq(x). The problem with
>> this is sqrt(0) must be handled specially because rsq(0) might be
>> Inf/NaN/undefined (and then 0*rsq(0) is Inf/NaN/undefined). In the
> That's why we do rcp(rsq(x)), that works correctly.
Yeah, though some drivers don't have a good rcp implementation. In llvmpipe,
there's a fast SSE2 rcp instruction, but its precision isn't sufficient.
You could add a Newton-Raphson step, but that gets Infs and NaNs wrong
too, so you'd need a select workaround. So it uses division, but that is slow.
> I'm not sure we really need a cap for this though ... except to avoid
> modifying drivers ;)
>
> I'll advertise the cap anyway, I prefer to be able to handle it internally.
> But I like this change, lowering SQRT (or not) is device specific and
> shouldn't be done unconditionally just because the API can't represent it.
Agreed.
>
>> glsl-to-tgsi code we use an extra CMP to check if x is zero and then
>> replace the result of x*rsq(x) with zero.
>>
>> In the end, this makes sqrt() generate much more reasonable code for
>> drivers that can do square roots.
>>
>> Note that many of piglit's generated shader tests use the GLSL
>> distance() function.
>> ---
>> src/gallium/docs/source/tgsi.rst | 9 +++++++++
>> src/gallium/include/pipe/p_defines.h | 3 ++-
>> src/gallium/include/pipe/p_shader_tokens.h | 2 +-
>> 3 files changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
>> index 548a9a3..5f03f32 100644
>> --- a/src/gallium/docs/source/tgsi.rst
>> +++ b/src/gallium/docs/source/tgsi.rst
>> @@ -89,6 +89,15 @@ This instruction replicates its result.
>> dst = \frac{1}{\sqrt{|src.x|}}
>>
>>
>> +.. opcode:: SQRT - Square Root
>> +
>> +This instruction replicates its result.
>> +
>> +.. math::
>> +
>> + dst = {\sqrt{src.x}}
>> +
>> +
>> .. opcode:: EXP - Approximate Exponential Base 2
>>
>> .. math::
>> diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
>> index d0db5e4..fdf6e7f 100644
>> --- a/src/gallium/include/pipe/p_defines.h
>> +++ b/src/gallium/include/pipe/p_defines.h
>> @@ -542,7 +542,8 @@ enum pipe_shader_cap
>> PIPE_SHADER_CAP_SUBROUTINES = 16, /* BGNSUB, ENDSUB, CAL, RET */
>> PIPE_SHADER_CAP_INTEGERS = 17,
>> PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS = 18,
>> - PIPE_SHADER_CAP_PREFERRED_IR = 19
>> + PIPE_SHADER_CAP_PREFERRED_IR = 19,
>> + PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED = 20
>> };
>>
>> /**
>> diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
>> index 3fb12fb..a9fb6aa 100644
>> --- a/src/gallium/include/pipe/p_shader_tokens.h
>> +++ b/src/gallium/include/pipe/p_shader_tokens.h
>> @@ -275,7 +275,7 @@ struct tgsi_property_data {
>> #define TGSI_OPCODE_SUB 17
>> #define TGSI_OPCODE_LRP 18
>> #define TGSI_OPCODE_CND 19
>> - /* gap */
>> +#define TGSI_OPCODE_SQRT 20
>> #define TGSI_OPCODE_DP2A 21
>> /* gap */
>> #define TGSI_OPCODE_FRC 24
>